Did you successfully get a transcript from it? If not, do so, so you have a baseline for the output to expect. If so, then simply replace primarySpeaker with your desired speaker when it's looping…
Long story short, I seem to have an issue with diarization not working that I need to sort out first.
I've been considering releasing a full dataset preparation pipeline for tortoise but for…
It's supported by whisperx, see notes on Speaker Diarization in the README.
Thanks, but as far as I can…
Okay I'll take a crack at helping even though I'm a beginner too. I assume you're on Windows? It might be the encoding of your input files. Is it ANSI?
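If encoding does turn out to be the culprit, here is a minimal sketch of one way to check and fix it. It assumes the input files in question are plain-text files (a transcript list, for example) and that "ANSI" means Windows-1252 on your machine; both are guesses on my part, and `ensure_utf8` is just a name I made up for illustration. Back up the file first, since it is rewritten in place.

```python
from pathlib import Path


def ensure_utf8(path, fallback_encoding="cp1252"):
    """Rewrite `path` as UTF-8 if it is not already valid UTF-8.

    Returns True if the file was re-encoded, False if it was left alone.
    `fallback_encoding` is an assumption: "ANSI" on Windows usually means
    the system code page, which is often (not always) Windows-1252.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")  # already valid UTF-8 (plain ASCII also passes)
        return False
    except UnicodeDecodeError:
        # Raises UnicodeDecodeError again if the bytes are not valid
        # in the fallback encoding either.
        text = raw.decode(fallback_encoding)
        Path(path).write_bytes(text.encode("utf-8"))
        return True
```

Calling `ensure_utf8("some_file.txt")` (hypothetical filename) on each suspect file would tell you which ones needed converting.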
Thanks, that seems to have worked okay.
The latents are generated from the wav files in the subdirectory for that voice; the more you have in there, the longer it will take to generate. You can adjust the `Sample Batch Size` in…
Honestly, I wing it using the recommended settings, more or less, and get good results. I've found no considerable difference between 1000 steps and 2500; it seems to be an efficient training method.…
Key takeaways:
- Prep your dataset as much as possible
- Better to train on less but the best than more of the worse
- Use your training samples to target the rough speed and…
A large chunk of memory is already being used or set aside; which of the two, I'm not sure.
Enable `Do Not Load TTS On Startup` and restart.
Hmm. The only thing that sticks out to me is that the audio is mono. I don't see any reason why that should be a problem but all my samples are stereo. Can you try and reproduce the fault with…
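A quick way to see whether a clip is mono or stereo is Python's standard-library `wave` module. This is only a sketch: `describe_wav` is a name I made up, and `wave` only reads uncompressed PCM WAV files, so it will raise `wave.Error` on other encodings.

```python
import wave


def describe_wav(path):
    """Return basic format info for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frame_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),        # 1 = mono, 2 = stereo
            "sample_rate": frame_rate,             # in Hz
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": wav.getnframes() / frame_rate,
        }
```

Running this over every file in a voice's folder makes it easy to spot a clip whose channel count or sample rate differs from the rest.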
There's probably some way to tinker with your current install to get it working but I think the most efficient thing to do is wipe it (save your datasets, of course), reclone the repo, and run…
Can you run `ffprobe` on the clips and post the output?
ffprobe third_00000.wav
ffprobe version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2007-2021 the FFmpeg developers
built with gcc 11…
I'm still searching for how to include the authorization token the error asks for, assuming that actually is the problem. There doesn't seem to be an obvious way.
It's [in the Wiki](https:/…
After activating the venv, does `pip list installed` show whisperx?
Okay we're getting somewhere. Whisperx wasn't on the list despite being able to be called under the environment. I…
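One possible explanation for a command being callable while the package is missing from the list is that it resolves from a different Python install on your PATH. A small sketch to check what the active interpreter actually sees (run it with the venv activated; `package_status` is a made-up helper, and it expects a top-level package name):

```python
import importlib.metadata
import importlib.util
import sys


def package_status(name):
    """Report whether `name` is importable and installed for this interpreter."""
    spec = importlib.util.find_spec(name)  # None if the module cannot be found
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None  # no installed distribution with this name
    return {
        "interpreter": sys.executable,            # which Python is running
        "importable": spec is not None,
        "location": getattr(spec, "origin", None),
        "installed_version": version,
    }


if __name__ == "__main__":
    print(package_status("whisperx"))
```

If `interpreter` does not point inside the venv, or `importable` is false, the command you were calling is probably coming from somewhere else.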
I adjusted the batch size down to 4 as suggested when validating the training file and even down to 2 with the gradient accumulation size at 1 but it still gives the same error. I shrank down…