There's a relatively new TTS called Balacoon, aimed at low-end devices. I tried it out on my desktop and it was faster than real time.
How was the quality?
A large chunk of memory is already being used or set aside; I'm not sure which.
Enable `Do Not Load TTS On Startup` and restart.
As noted in issue #68 from the whisperx repo, "with VAD" is now the default, so you could try changing utils.py to just call transcribe() and see if that works. I always prepare my datasets…
Hmm. The only thing that sticks out to me is that the audio is mono. I don't see any reason why that should be a problem but all my samples are stereo. Can you try and reproduce the fault with…
There's probably some way to tinker with your current install to get it working but I think the most efficient thing to do is wipe it (save your datasets, of course), reclone the repo, and run the…
I'm still searching for how to include the authorization token the error asks for, assuming that actually is the problem. There doesn't seem to be an obvious way.
It's [in the Wiki](https://git.e…
Isn't that just for finetuning a wav2vec2 model?
That's what they are, as far as I can tell.
Can you run `ffprobe` on the clips and post the output?
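If there are a lot of clips, something like the following might save some typing. It's only a rough sketch: it assumes `ffprobe` is on your PATH, and the helper names (`ffprobe_command`, `summarize_audio`, `probe`) are made up for illustration and are not part of the repo. It asks ffprobe for JSON output and pulls out the codec, sample rate, and channel count for each audio stream.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe command that reports stream and format info as JSON."""
    return [
        "ffprobe",
        "-v", "error",        # only print errors, not the banner
        "-show_streams",
        "-show_format",
        "-of", "json",
        path,
    ]


def summarize_audio(probe_json):
    """Pull codec, sample rate, and channel count out of ffprobe's JSON output."""
    info = json.loads(probe_json)
    summaries = []
    for stream in info.get("streams", []):
        if stream.get("codec_type") != "audio":
            continue
        summaries.append({
            "codec": stream.get("codec_name"),
            # ffprobe reports sample_rate as a string, so convert it
            "sample_rate": int(stream["sample_rate"]) if "sample_rate" in stream else None,
            "channels": stream.get("channels"),
        })
    return summaries


def probe(path):
    """Run ffprobe on one clip (hypothetical helper; needs ffprobe installed)."""
    result = subprocess.run(
        ffprobe_command(path), capture_output=True, text=True, check=True
    )
    return summarize_audio(result.stdout)
```

Calling `probe("clip.wav")` on each file and pasting the results should show at a glance whether a clip is mono or stereo and what sample rate it uses.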
After activating the venv, does `pip list installed` show whisperx?
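Another way to check the same thing, in case the pip output is hard to read: run a small script with the venv's Python. This is just a sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name, and the package name `whisperx` is assumed to match what pip installed.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("whisperx")
    if version is None:
        print("whisperx is not installed in this environment")
    else:
        print(f"whisperx {version} is installed")
```

It only reports on the environment of whichever Python runs it, so make sure the venv is activated first.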
Try activating the venv in the directory you cloned the repo into and then running `git submodule update --remote`.
I adjusted the batch size down to 4 as suggested when validating the training file and even down to 2 with the gradient accumulation size at 1 but it still gives the same error. I shrank down my…
Can you run whisperx from the command line? (Not through importing it in a Python session, just from the prompt.)
https://github.com/facebookresearch/fairseq/blob/main/examples/mms
Facebook released some models, not sure how to use them tho
It's right there in the [TTS](https://github.com/facebookres…
Try git submodule update --remote
Did you run the setup script?
Are you using a voice sample or random? If the former, what's the file size of the latents?
Do you have `Slimmer Computed Latents` turned on? If so, that'll stop CVVP from working.
I don't think the IPA tokenizer would be required for Dutch. What's your loss graph look like?