It's the same with Whisper for me.
The biggest problem I'm having with transcription, even when using WhisperX, is the transcribed text containing a full sentence while the audio has the first or last word or two cut…
I don't know if it's related, but it's now 4 times "slower" for each preset (for instance, ultra used to set 16 samples, which amounted to 4 autoregressive steps; now it sets 16 samples and 16…
Unless I'm mistaken, you did not implement it the same way: you set -sub where he set sub.
return text_logits[:, :sub]
vs
return text_logits[:, :-sub]
Is this intended?
I still don't…
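For anyone following along, the two slices do behave differently. A minimal illustration using a plain Python list standing in for one axis of `text_logits` (the variable names here are just for the example, assuming `sub` is a positive integer):

```python
# Plain-list illustration of the two slices in question,
# assuming `sub` is a positive integer.
logits = [10, 20, 30, 40, 50]
sub = 2

keep_first = logits[:sub]   # keeps the FIRST `sub` elements
drop_last = logits[:-sub]   # drops the LAST `sub` elements

print(keep_first)  # [10, 20]
print(drop_last)   # [10, 20, 30]
```

So `:sub` keeps only the first `sub` entries, while `:-sub` keeps everything except the last `sub` entries; they only coincide when `sub` is exactly half the length.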
Where do you place the models? I am a bit confused about where finetuned models go and where regular models go.
Place your custom models here: './training/finetunes/{voice}.pth'
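To show how that path pattern expands, here is a small sketch with a hypothetical helper (this is not the project's actual loader, just the pattern made concrete):

```python
import os

def finetune_path(voice: str, root: str = "./training/finetunes") -> str:
    # Hypothetical helper mirroring the path pattern above:
    # './training/finetunes/{voice}.pth'
    return os.path.join(root, f"{voice}.pth")

print(finetune_path("myvoice"))  # e.g. ./training/finetunes/myvoice.pth
```

So a voice named `myvoice` would look for its finetuned weights at `./training/finetunes/myvoice.pth`.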
That line tries to load your training yaml file.
I assume you're launching train.bat. You have to pass the yaml file to it, like this:
train.bat "./training/yourvoice/train.yaml"
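If it helps, here's a small hypothetical wrapper (not part of the repo) that fails early with a clear message when the yaml path is wrong, before handing it to train.bat:

```python
import os
import subprocess
import sys

def launch_training(yaml_path: str) -> None:
    # Hypothetical sanity check: verify the yaml exists before launching,
    # so a bad path fails with a readable message instead of a traceback.
    if not os.path.isfile(yaml_path):
        sys.exit(f"Config not found: {yaml_path}")
    subprocess.run(["train.bat", yaml_path], check=True)
```

Usage would just be `launch_training("./training/yourvoice/train.yaml")` from the repo root.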
I'm still getting attempts at some outside connections:
Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
handle: <Handle _ProactorBasePipeTransport._call_connec…
Hi, I had the same error.
copy .\dlas\bitsandbytes_windows\* .\venv\Lib\site-packages\bitsandbytes\. /Y
didn't overwrite for me; I still had to copy the files manually.
Now it's training fine on…
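In case the cmd `copy` keeps refusing to overwrite, here's a cross-platform stand-in sketched in Python (the source and destination paths are taken from the command above; this is an assumption, not the repo's setup script):

```python
import shutil
from pathlib import Path

def overwrite_copy(src_dir: str, dst_dir: str) -> int:
    """Copy every file in src_dir into dst_dir, replacing existing files
    (a cross-platform stand-in for `copy ... /Y`). Returns the file count."""
    src, dst = Path(src_dir), Path(dst_dir)
    copied = 0
    for f in src.iterdir():
        if f.is_file():
            shutil.copy2(f, dst / f.name)  # copy2 silently overwrites
            copied += 1
    return copied

# e.g.:
# overwrite_copy("./dlas/bitsandbytes_windows",
#                "./venv/Lib/site-packages/bitsandbytes")
```

Run it from the repo root with the venv deactivated or active; `shutil.copy2` also preserves timestamps, which plain `copy` does as well.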
I had better results by regenerating voice latents after switching the model to the trained one.
Are voice latents tied to the model, maybe?
Same result as you here. This particular issue seems fixed but I still can't get it to run on 8GB VRAM, always OOM. Maybe the model is too big to be trained on 8GB.
I set 50 iterations just to try to get it going on my GPU.
I ended up training on the tts-fast Colab and stopped it at 600. This is the yaml that the Colab created: https://pastebin.com/E9wNJYmt
then…
Welp, still the same for me. I emptied the result directory for this voice, generated 3 times, and ended up with only files 00000 and 00001, with the second attempt overwritten. The only thing suspicious…
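That overwriting pattern looks like the candidate counter restarting at zero on every run. A hypothetical sketch of collision-free numbering (the `NNNNN.wav` file pattern is assumed from the filenames above; this is not the project's actual code):

```python
import os
import re

def next_output_index(result_dir: str) -> int:
    # Scan existing NNNNN.wav files and continue numbering after the
    # highest one, instead of restarting at 00000 each run.
    pattern = re.compile(r"^(\d{5})\.wav$")
    indices = [int(m.group(1))
               for name in os.listdir(result_dir)
               if (m := pattern.match(name))]
    return max(indices, default=-1) + 1
```

With 00000.wav and 00001.wav already present, this would return 2, so a third generation would write 00002.wav rather than clobbering 00001.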