It just feels like it's too fast, and so shouldn't be any good haha. No actual evidence to back that up!
This is my current latest graph, which is more like what (in my warped mind) I would…
Maybe 50 epochs is enough?..
Hard to say without knowing your batch size and how many steps per epoch you have.
In this case it's a small dataset, 100 files, so batches of 100. 1…
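For anyone following along, the batch size / steps-per-epoch relationship being asked about can be sketched roughly like this. The function names are made up for illustration, and it assumes the trainer keeps the last partial batch (rounds up) rather than dropping it, which may not match what the training backend actually does.

```python
import math


def steps_per_epoch(dataset_size: int, batch_size: int) -> int:
    """Optimizer steps needed to see every sample once.

    Assumption: the final partial batch is kept, so we round up.
    """
    return math.ceil(dataset_size / batch_size)


def total_steps(dataset_size: int, batch_size: int, epochs: int) -> int:
    """Total optimizer steps over the whole run."""
    return steps_per_epoch(dataset_size, batch_size) * epochs


# The small dataset from this thread: 100 files in batches of 100.
print(steps_per_epoch(100, 100))   # one step per epoch
print(total_steps(100, 100, 50))   # 50 epochs is only 50 steps in total
```

So with 100 files in a single batch, "50 epochs" means only 50 weight updates, which is why epoch counts alone are hard to compare between runs with different dataset and batch sizes.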
desu the first finetune test has a much smaller size (dataset size of 4.5k for 11 epochs). Granted, all of the hyperparameters play a role in…
- batch one didn't trim clips that exceeded 11.6s (dataset size of ~8k, for ~15 epochs)
Only 15 epochs? Is this a typo? I've been doing 200-1500 for most of my training, and that's just for…
You can disable the `Delete Non-Final Outputs` setting under Settings to retain the individual pieces that get combined.
Ahhh I see, I hadn't noticed that one, thanks
As an aside, though it's part of my ongoing journey to clean up my training -
My outputs are now generating some fairly decent speech, even with very small data sets. However the output audio…
https://github.com/openai/whisper/discussions/435
This seems to be the most recent discussion involving a fix for the inaccurate timestamps in Whisper.
Otherwise, as you suggest, I may…
Pushed commit 2424c455cb9614003c072f6cdc25fa80ba2694ba. It seems every passing day I regret more and more adding whisperx.
~~I'm very, very tempted to just remove it. It caused nothing…
So I'm trying again, with a fresh training session, but now I seem to be getting the groany/garbled generated voices on both mrq (using manual voice chunks) AND in fast-tts lol. So this time I…
Yeh I agree, though that's why I mentioned it would be nice if this was able to be automated... as I'll have to manually remove ~150 entries from the text file lol.
Just run the…
I can't imagine 0s files being anything other than poorly cut off. If you have enough data then I'd drop the worst part of it.
Yeh I agree, though that's why I mentioned it would be nice if…
I do think it could only be an improvement if there was an automated way to remove any transcribed clips that are below a certain length, as most of those are half-words or weirdly cut off. Not…
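Until something like that exists in the tool itself, here is a rough standalone sketch of the idea. It is not part of the repo. It assumes the list file has one `relative/path.wav|transcription` entry per line, that the clips are plain PCM WAV files readable by Python's `wave` module, and that 0.6 s is a sensible cutoff. The function names and the threshold are hypothetical, so adjust them to your data.

```python
import wave
from pathlib import Path


def clip_seconds(path):
    """Duration of a PCM WAV file in seconds, read from its header."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return wav.getnframes() / rate if rate else 0.0


def filter_short_clips(list_file, audio_root, min_seconds=0.6):
    """Split list entries into (kept, dropped) by clip duration.

    Assumed line format: "relative/path.wav|transcription".
    Missing or unreadable clips are treated as 0 s and dropped.
    """
    kept, dropped = [], []
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        rel_path = line.split("|", 1)[0]
        try:
            duration = clip_seconds(Path(audio_root) / rel_path)
        except (OSError, EOFError, wave.Error):
            duration = 0.0
        (kept if duration >= min_seconds else dropped).append(line)
    return kept, dropped


# Example usage (paths are placeholders, back up the original file first):
#   kept, dropped = filter_short_clips("train.txt", ".")
#   Path("train.filtered.txt").write_text("\n".join(kept) + "\n", encoding="utf-8")
#   print(f"dropped {len(dropped)} short or unreadable clips")
```

It returns the dropped lines too, so you can skim them before committing to the filtered list rather than removing ~150 entries by hand.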
oh most of those files seem to be in the 'validation.txt' not in train.txt. Not sure what that file does.
Oh, no, that file is in the voices/patrick folder. In the training/patrick/audio folder it's been cut up by whisper into a bunch of short files.
Ah, then it shouldn't affect it, as the…