I decided to move this out of the """blog""" update comment above, since it should be its own section for me to continue updating my thoughts on:
It seems M$ has an answer to…
Thanks for throwing me a bone, it's very interesting comparing the validation output of the different models! Despite the accuracy being lower than model A, the audio quality of the validation…
Next time you start an Xorg session, can you post some example audio? The last audio from June 6th didn't have the ~3400+ hour dataset, nor did it have vocos, and I'm curious as to how much of an…
There's a new neural vocoder that might be worth checking out called 'Vocos'. It was made for Bark TTS, and sounds like an improvement over bare EnCodec. The demo doesn't compare it to any other…
I was catching up on the thread and wondering if there was a reason for not using the LibriLight dataset until I saw you mention
My only concern is if there's any overlap between it and…
If you still need more data, I'd recommend checking out the VoxCeleb dataset. It advertises over 7000 celebrity voices and over 2000 hours of audio, so it's a fairly large one. The dataset…