Set Whisper default to Base-EN #59

Closed
opened 2023-03-04 01:27:34 +00:00 by deviandice · 3 comments

> The .en models for English-only applications tend to perform better, especially for the tiny.en and base.en models. We observed that the difference becomes less significant for the small.en and medium.en models.

https://github.com/openai/whisper#:~:text=The%20.en%20models%20for%20English%2Donly%20applications%20tend%20to%20perform%20better%2C%20especially%20for%20the%20tiny.en%20and%20base.en%20models.%20We%20observed%20that%20the%20difference%20becomes%20less%20significant%20for%20the%20small.en%20and%20medium.en%20models.

The .en versions of Whisper, such as tiny.en, outperform the normal base. Since probably nobody would ever look at this setting, I'd suggest setting the default to base.en. The majority of users are English speakers, so it's reasonable, and they would probably enjoy better accuracy.

Owner

I'll compromise and have it automatically use the `-en` version if a non `-en` model is selected. desu, I think it's better to have it default to a more universal model over a more "accurate" one, since if you're looking for better accuracy, you'd change the model anyways.

Author

Yeah, that sounds like a good middle ground. It's only the English model that gets this benefit anyway.

Owner

Compromise implemented in commit 3e220ed306.

Default language is set to `en` (compatible with both of the implemented Whisper backends).

When a specialized model for a language is detected (so just `${model}.en`), it'll load that Whisper model instead.

To override this (load the universal model with English), just leave the language blank, as the universal model will automatically deduce the language this way, anyways.
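For anyone reading later, the selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `resolve_whisper_model` and the `EN_VARIANTS` set are hypothetical, and it assumes that only the `tiny`, `base`, `small`, and `medium` sizes have a published `.en` checkpoint.

```python
# Hypothetical sketch of the compromise; names here are illustrative only
# and are not taken from the actual commit.

# Assumption: these are the Whisper sizes that ship an English-only variant.
EN_VARIANTS = {"tiny", "base", "small", "medium"}


def resolve_whisper_model(model: str, language: str = "en") -> str:
    """Return the model name to load for the given language setting."""
    language = (language or "").strip().lower()

    # English selected and a specialized checkpoint exists: prefer "<model>.en".
    if language == "en" and model in EN_VARIANTS:
        return f"{model}.en"

    # Blank language, non-English language, or no ".en" variant available:
    # keep the universal model (a blank language leaves detection to Whisper).
    return model


if __name__ == "__main__":
    print(resolve_whisper_model("base", "en"))  # specialized English model
    print(resolve_whisper_model("base", ""))    # universal model, language auto-detected
```

Leaving the language blank falls through to the universal model, which matches the override described above.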

mrq closed this issue 2023-03-05 05:24:31 +00:00
Reference: mrq/ai-voice-cloning#59