Is it necessary to change from english_cleaners to basic_cleaners when training non-English languages?
#190
Open
Hello,
I am training a Hindi model. For that I created a custom tokenizer and started training. I then found this advice from someone at the 152334H/DL-Art-School repo.
He said to change the cleaner from english to basic when training new languages:
Now I'm confused. This is the code inside tokenizer.py that mrq modified for Japanese, and he also said to use japanese.json when synthesizing Japanese voices.
But nowhere did he mention "basic_cleaners" or using them when training a new model in a non-English language.
So I think I also need to modify tokenizer.py with something like this:
I generated the above code block for Hindi with ChatGPT.
So, is this code block OK to use? Or can I proceed without any modifications?
Can english_cleaners also be used to train non-English languages, or is the modification necessary?
See the proviso in modules/dlas/dlas/models/audio/tts/tacotron2/text/cleaners.py:
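To make the concern concrete, here is a rough sketch of why the cleaner choice matters for Devanagari text. It is not the code from that file. Both function names are hypothetical, and `ascii_only_clean` is only a crude standard-library stand-in for an ASCII-converting English cleaner. As far as I know, the real English cleaner transliterates through the Unidecode library rather than dropping characters, but either way the original Hindi symbols would never reach a custom Hindi tokenizer.

```python
import re
import unicodedata

_whitespace_re = re.compile(r"\s+")


def basic_style_clean(text: str) -> str:
    """Hypothetical basic-style cleaner: lowercase and collapse whitespace.

    No transliteration happens, so non-Latin scripts pass through unchanged.
    """
    return _whitespace_re.sub(" ", text.lower()).strip()


def ascii_only_clean(text: str) -> str:
    """Hypothetical stand-in for an ASCII-converting English cleaner.

    Assumption: this simply drops anything that cannot be encoded as ASCII.
    A real English cleaner may transliterate instead of dropping.
    """
    folded = unicodedata.normalize("NFKD", text)
    ascii_text = folded.encode("ascii", "ignore").decode("ascii")
    return _whitespace_re.sub(" ", ascii_text.lower()).strip()


sample = "नमस्ते  दुनिया"
print(repr(basic_style_clean(sample)))  # Devanagari is preserved
print(repr(ascii_only_clean(sample)))   # Devanagari is lost entirely
```

If the tokenizer vocabulary was built from Hindi text, a cleaner that keeps the script intact looks like the safer choice under these assumptions. It is worth checking what the configured cleaner actually returns for a few training lines before starting a long run.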