OK so I have a 2000-hour audiobook dataset compiled. Didn't take that long to gather, but uploading it took forever. It's still untranscribed as well.
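If anyone wants to transcribe a dataset like this themselves, something along these lines with the openai-whisper package should do it. Just a rough sketch; the folder names, model size, and output format are placeholders, not my actual setup:

```python
# Rough bulk-transcription sketch using the openai-whisper package
# (pip install openai-whisper). Folder names and the model size are
# placeholders, not necessarily what you'd want for a real run.
from pathlib import Path

import whisper

model = whisper.load_model("medium")  # bigger models: better accuracy, slower

audio_dir = Path("audiobooks")   # hypothetical flat folder of clips
out_dir = Path("transcripts")
out_dir.mkdir(exist_ok=True)

files = sorted(list(audio_dir.glob("*.wav")) + list(audio_dir.glob("*.mp3")))
for audio_path in files:
    # Whisper auto-detects the language when none is given; pass
    # language="nl" (or whatever applies) to pin it if known.
    result = model.transcribe(str(audio_path))
    out_path = out_dir / f"{audio_path.stem}.txt"
    out_path.write_text(result["text"].strip(), encoding="utf-8")
```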
Use it if you feel like you're not making…
That actually looks encouraging. I'd give it some more time. Do you have a loss target in mind?
I do wonder, however, how it would fare if you gave it something like 2000 hours' worth of speech to train on…
Well, I guess that's good news; those metrics did look pretty bad
- I think any noticeable jumps in the training metrics when I feed the beast will require an astronomical amount of new data, since I'm only at ~532 hours, compared to the original paper saying it…
I was just playing around with vast.ai, a peer-to-peer GPU rental service, and my first impression is that it works really well. Used it with the Paperspace URL, and it seems pretty robust.
You can get…
OK, never mind, it's actually producing pretty good output now, correctly pronouncing most of the words. I retrained on a 200-hour dataset of Dutch audiobooks last night. The voice cloning doesn't…
4 epochs; I've added my train.yaml and tokenizer below. Currently also transcribing a 300h dataset to see if that helps.
Do you happen to know if the IPA tokenizer works for non-English…
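For what it's worth, one way to sanity-check IPA coverage for Dutch is the standalone phonemizer package with its espeak backend. This assumes the repo's IPA tokenizer is espeak-based, which I'm not sure about; it's just a way to eyeball the IPA output:

```python
# Sanity check of IPA phonemization for Dutch via the `phonemizer` package
# (pip install phonemizer; also needs the espeak-ng system package).
# Assumption: the repo's IPA tokenizer is espeak-based, which may not hold.
from phonemizer import phonemize

sample = "De kat zat op de mat."
ipa = phonemize(sample, language="nl", backend="espeak", strip=True)
print(ipa)  # an IPA string, if espeak-ng's Dutch voice is installed
```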