Are you trying to create a tokenizer that handles Devanagari, or are you romanizing words like "is kadar"?
This process would be a lot of work for one person, unless there's a better method.
Almost all of what you've proposed above can be done automatically. whisperx
produces millisecond-granu…
If you check the training/\<voice name\>/finetune/models/
directory, is there an 800_gpt.pth
file there? Can you use it to generate samples?
I've never used a Colab notebook; you might want to check the issue tracker on whisperx's repo and see if someone over there knows.
You can use --model_dir
to point it at the location where your model files are stored.
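For example (a sketch, not a definitive invocation — the audio file, model name, and path below are placeholders):

```shell
# Transcribe with whisperx, telling it to load/store model weights
# in a custom directory instead of the default cache.
# "audio.wav", "large-v2", and "/path/to/models" are placeholders.
whisperx audio.wav --model large-v2 --model_dir /path/to/models
```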
I've never been able to get whispercpp working, so no idea there. For whisperx I would try installing it separately using the instructions in the repo: https://github.com/m-bain/whisperX
Hmm. I see why the first run took extremely long: it had to generate the latents for that voice and model. But that doesn't explain why it took so long the second time. Try changing "Sample…
Post your console log.
I would need to do wav2vec2 alignment on the transcription text and do my own segmenting to pare down the audio. I suppose it's simple enough, since I could maybe co-opt whisperX to do just that, but I…
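The segmenting half of that could be fairly mechanical: once an aligner (whisperX's wav2vec2 step, or anything similar) has produced word-level timestamps, group them into clips. A minimal sketch, assuming `(word, start, end)` tuples as input — the function name and thresholds are made up for illustration:

```python
def segment_words(words, max_gap=0.5, max_len=10.0):
    """Group (word, start, end) tuples into clips, starting a new clip
    when the pause before a word exceeds max_gap seconds or the clip
    would grow past max_len seconds."""
    clips = []
    current = []
    for word, start, end in words:
        if current and (start - current[-1][2] > max_gap
                        or end - current[0][1] > max_len):
            clips.append(current)
            current = []
        current.append((word, start, end))
    if current:
        clips.append(current)
    return clips
```

Splitting at long pauses keeps clip boundaries on natural silences, which matters for training data quality.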
Forgive me for butting in, but how come you haven't worked on building a more varied dataset, then? There are hundreds of hours of video game dialogue & podcasts available for you to build a more…
That's the right one. You can try git restore models/.template.dlas.yaml
but it's odd that it isn't there already. Have you run update.bat?
I don't have a mic so I can't check, but you might need to adjust your microphone settings so that the recording sample rate is 22050 Hz.
```
Collecting numba
  Using cached numba-0.56.4.tar.gz (2.4 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info…
```