psammites
  • Joined on 2023-03-11
psammites commented on issue mrq/ai-voice-cloning#212 2023-04-19 19:47:47 +00:00
Error when training : TypeError: new(): invalid data type 'str'

Can you post your train.txt?

psammites commented on issue mrq/ai-voice-cloning#211 2023-04-19 14:48:49 +00:00
"Saved" voice is wildly inconsistent

Did you reuse the same seed?

psammites commented on issue mrq/ai-voice-cloning#208 2023-04-17 10:30:22 +00:00
Validate Training Configuration Gets Stuck in a Loop

Add another sample so that your batch size is divisible by more numbers. Probably not the "correct" solution but the most expedient one.
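
A rough illustration of why one extra sample can help, assuming the validator is looking for a batch size that divides the dataset evenly (that is a guess about the cause, not something confirmed in this thread). The sizes 13 and 14 are made up for the example:

```python
def divisors(n: int) -> list[int]:
    """Return every positive integer that divides n evenly."""
    return [d for d in range(1, n + 1) if n % d == 0]


# Hypothetical dataset sizes, for illustration only.
print(divisors(13))  # a prime count leaves only 1 and 13 as options
print(divisors(14))  # one more sample adds 2 and 7 as options
```

A prime number of samples leaves no middle ground between a batch size of 1 and the whole dataset, which is why adding or removing a single sample can be enough.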

psammites commented on issue mrq/ai-voice-cloning#197 2023-04-16 02:44:16 +00:00
Epochs, iterations, and datasets

You could try restarting with a higher learning rate for a lower number of iterations and see if it makes a difference.

psammites commented on issue mrq/ai-voice-cloning#198 2023-04-13 02:41:30 +00:00
Found some bad audio files during the middle of the training. What to do?

Ahh, I forgot to mention that, sorry. Appending -ar 22050 to your ffmpeg command should fix it.
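
For anyone scripting this, a minimal sketch of building that ffmpeg call from Python. The helper name `resample_command` and the file names are placeholders; only `-i` and `-ar 22050` come from the advice above:

```python
import subprocess


def resample_command(src: str, dst: str, rate: int = 22050) -> list[str]:
    """Build an ffmpeg argument list that resamples src to `rate` Hz."""
    return ["ffmpeg", "-i", src, "-ar", str(rate), dst]


def resample(src: str, dst: str, rate: int = 22050) -> None:
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(resample_command(src, dst, rate), check=True)
```

Calling `resample("in.wav", "out.wav")` requires ffmpeg on your PATH; `resample_command` alone just returns the argument list so you can inspect it first.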

psammites commented on issue mrq/ai-voice-cloning#202 2023-04-13 02:39:43 +00:00
How do I regenerate the same input prompt?

Are you using a random seed?
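
The reason the seed matters, shown with Python's standard `random` module rather than the project's own generator (so this is an analogy, not the actual code path): the same seed reproduces the same sequence of draws, and a different seed does not.

```python
import random


def draw(seed: int, n: int = 3) -> list[float]:
    """Return n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


same_a = draw(1234)
same_b = draw(1234)
other = draw(4321)
print(same_a == same_b)  # identical seed, identical draws
print(same_a == other)   # different seed, different draws
```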

psammites commented on issue mrq/ai-voice-cloning#198 2023-04-12 12:15:58 +00:00
Found some bad audio files during the middle of the training. What to do?

I haven't run into that error before. Does running ffprobe on the wav files reveal any errors?
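
If there are a lot of files, something like the sketch below can run the check in bulk. It assumes ffprobe is on your PATH; `-v error` limits the output to error messages, and the helper names are made up for this example:

```python
import subprocess


def ffprobe_command(path: str) -> list[str]:
    """Build an ffprobe argument list that reports only errors."""
    return ["ffprobe", "-v", "error", path]


def probe_errors(path: str) -> str:
    """Run ffprobe on one file and return whatever it wrote to stderr."""
    result = subprocess.run(ffprobe_command(path), capture_output=True, text=True)
    return result.stderr.strip()
```

An empty string from `probe_errors` means ffprobe had nothing to complain about for that file; anything else is worth reading before the file goes into the dataset.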

psammites commented on issue mrq/ai-voice-cloning#197 2023-04-11 21:01:41 +00:00
Epochs, iterations, and datasets

If your dataset has a thick accent you might need to check the transcriptions to make sure that they're accurate.

psammites commented on issue mrq/ai-voice-cloning#198 2023-04-11 18:59:42 +00:00
Found some bad audio files during the middle of the training. What to do?

The most important thing is that when you do the transcription with whisperx you specify --align_model WAV2VEC2_ASR_LARGE_LV60K_960H or else the timestamps are going to be inaccurate. I have…

psammites commented on issue mrq/ai-voice-cloning#198 2023-04-11 09:25:42 +00:00
Found some bad audio files during the middle of the training. What to do?

> So, do you do it manually?

Semi-manually, I use whisperx to produce a timestamped transcription and then feed the timestamps into ffmpeg to cut things to size.
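
A minimal sketch of the second half of that workflow, under the assumption that the timestamps have already been loaded as a list of dicts with `start` and `end` in seconds (check your own whisperx output for the actual layout). The function name and output naming scheme are hypothetical:

```python
def cut_commands(src: str, segments: list[dict], prefix: str = "clip") -> list[list[str]]:
    """Build one ffmpeg argument list per segment, cutting src from start to end."""
    commands = []
    for index, segment in enumerate(segments):
        out = f"{prefix}_{index:04d}.wav"
        commands.append([
            "ffmpeg", "-i", src,
            "-ss", f"{segment['start']:.3f}",  # cut start, in seconds
            "-to", f"{segment['end']:.3f}",    # cut end, in seconds
            out,
        ])
    return commands


# Made-up timestamps, for illustration only.
for command in cut_commands("in.wav", [{"start": 0.5, "end": 2.25}]):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the printed commands look right.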

> Also, do you recommend…

psammites commented on issue mrq/ai-voice-cloning#197 2023-04-11 03:45:14 +00:00
Epochs, iterations, and datasets

> Or is it more iterations per epoch?

AIUI more iterations per epoch just means a smaller batch size.
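
As a back-of-the-envelope check, assuming one iteration processes one batch and a partial final batch still counts (the trainer may round differently), with made-up numbers:

```python
import math


def iterations_per_epoch(dataset_size: int, batch_size: int) -> int:
    """Number of batches needed to see every sample once."""
    return math.ceil(dataset_size / batch_size)


# Hypothetical dataset of 200 samples.
print(iterations_per_epoch(200, 50))  # 4
print(iterations_per_epoch(200, 25))  # 8
```

Halving the batch size doubles the iterations per epoch, but the model sees the same amount of data either way.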

> Are multiple wavs required in the vocals folder, or just one good example?

Just…

psammites commented on issue mrq/ai-voice-cloning#198 2023-04-11 03:42:15 +00:00
Found some bad audio files during the middle of the training. What to do?

I'd restart the training with a clean dataset, just to be sure.

psammites commented on issue mrq/ai-voice-cloning#195 2023-04-09 03:31:04 +00:00
Having hardtime making Whispercpp and Whisperx work (COLAB)

> so, how would you install? right in the main ai-voice-cloning directory?

I do pip install git+https://github.com/m-bain/whisperx.git in my home directory, but keep in mind I've never used…

psammites commented on issue mrq/ai-voice-cloning#195 2023-04-08 21:05:43 +00:00
Having hardtime making Whispercpp and Whisperx work (COLAB)

I wouldn't try cloning them in a subdirectory of ai-voice-cloning, just in case the different venvs conflict.

psammites commented on issue mrq/ai-voice-cloning#194 2023-04-07 23:16:09 +00:00
is there a way to choose which cpu core/thread tortoise-tts uses?

You can select which GPU to use but CPU allocation is up to the OS, I think.

psammites commented on issue mrq/ai-voice-cloning#152 2023-04-07 23:15:11 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)

> Can whisper run through a batch of single files with the same level of convenience?

Kind of. You can specify multiple files when you run it, ex: `whisperx --model large --task transcribe…

psammites commented on issue mrq/ai-voice-cloning#190 2023-04-04 03:14:51 +00:00
Is it necessary to change the from English_cleaners to Basic_cleaners when training non-english languages?

> Can we use english_cleaners also to train non english languages? Or the modification is necessary?

See the proviso in modules/dlas/dlas/models/audio/tts/tacotron2/text/cleaners.py:

"""…

psammites commented on issue mrq/ai-voice-cloning#189 2023-04-03 15:36:11 +00:00
whisper large model stops the script everytime i try to use it.

Haven't run into it. Can you run whisper outside the script, and if so does it do the same thing?

psammites commented on issue mrq/ai-voice-cloning#188 2023-04-02 10:09:10 +00:00
ModuleNotFoundError: No module named 'dlas.models.clip.model'

Did you run the appropriate setup script?

psammites commented on issue mrq/ai-voice-cloning#187 2023-04-02 04:48:36 +00:00
Recommendations for creating tokenizer.json for a specific language

You can check out models/tokenizers/japanese.json for an example of how to do it. However, because Japanese rules for syllable construction are far more limited, you've got your work cut out for…