• Joined on 2023-05-17
Fresh12 commented on issue mrq/ai-voice-cloning#319 2023-08-02 09:18:55 +00:00
Transcription and diarization (speaker identification) - Easy dataset building?

Did you successfully get a transcript from it? If not, do so, so you have a baseline for the output to expect. If so, then simply replace primarySpeaker with your desired speaker when it's…

Fresh12 commented on issue mrq/ai-voice-cloning#319 2023-08-02 05:44:29 +00:00
Transcription and diarization (speaker identification) - Easy dataset building?

Did you successfully get a transcript from it? If not, do so, so you have a baseline for the output to expect. If so, then simply replace primarySpeaker with your desired speaker when it's looping…
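
A minimal sketch of the "replace primarySpeaker with your desired speaker" idea, assuming the diarized transcript is available as a list of segment dictionaries. The `speaker` and `text` keys, the `SPEAKER_XX` labels, and the helper name are illustrative assumptions, not the script or output format discussed in the thread:

```python
def segments_for_speaker(segments, speaker):
    """Keep only the segments attributed to one speaker label."""
    return [seg for seg in segments if seg.get("speaker") == speaker]

# Hypothetical diarized transcript; real output may use different keys.
segments = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 2.1, "text": "Hello there."},
    {"speaker": "SPEAKER_01", "start": 2.1, "end": 4.0, "text": "Hi, how are you?"},
    {"speaker": "SPEAKER_01", "start": 4.0, "end": 5.5, "text": "Good to see you."},
]

# Assumed stand-in for the primarySpeaker variable mentioned above.
primary_speaker = "SPEAKER_01"

kept = segments_for_speaker(segments, primary_speaker)
for seg in kept:
    print(seg["start"], seg["end"], seg["text"])
```

Getting a plain transcript first, as suggested, makes it easier to see which label corresponds to the voice you want before filtering.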

Fresh12 commented on issue mrq/ai-voice-cloning#319 2023-08-01 17:53:38 +00:00
Transcription and diarization (speaker identification) - Easy dataset building?

Long story short, I seem to have an issue with diarization not working that I need to sort out first.

I've been considering releasing a full dataset preparation pipeline for…

Fresh12 commented on issue mrq/ai-voice-cloning#319 2023-08-01 00:40:09 +00:00
Transcription and diarization (speaker identification) - Easy dataset building?

Long story short, I seem to have an issue with diarization not working that I need to sort out first.

I've been considering releasing a full dataset preparation pipeline for tortoise but for…

Fresh12 commented on issue mrq/ai-voice-cloning#319 2023-07-31 02:24:09 +00:00
Transcription and diarization (speaker identification) - Easy dataset building?

It's supported by whisperx, see notes on Speaker Diarization in the README.

Thanks, but as far as I can…

Fresh12 commented on issue mrq/ai-voice-cloning#317 2023-07-30 00:01:21 +00:00
Cant get Training running

Okay I'll take a crack at helping even though I'm a beginner too. I assume you're on Windows? It might be the encoding of your input files. Is it ANSI?
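
One way to rule out the encoding is to re-read the file as UTF-8 and fall back to a legacy code page only if that fails. This is a rough sketch, assuming "ANSI" here means Windows-1252 (the usual case on Western-locale Windows, but not guaranteed); the function name is made up for illustration:

```python
def decode_text(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode bytes as UTF-8 (dropping a BOM if present), else use the fallback."""
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: the file was saved in a legacy Windows code page.
        return raw.decode(fallback)

# Simulated file contents saved two different ways.
ansi_bytes = "café".encode("cp1252")
utf8_bytes = "café".encode("utf-8")

print(decode_text(ansi_bytes))
print(decode_text(utf8_bytes))
```

Writing the decoded text back out with `encoding="utf-8"` would then give the training tools a consistent input.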

Fresh12 closed issue mrq/ai-voice-cloning#267 2023-07-23 02:57:23 +00:00
Are conditioning latents harder to generate for larger datasets?
Fresh12 commented on issue mrq/ai-voice-cloning#267 2023-07-23 02:57:18 +00:00
Are conditioning latents harder to generate for larger datasets?

thanks that seems to have worked okay.

Fresh12 commented on issue mrq/ai-voice-cloning#267 2023-06-23 04:50:15 +00:00
Are conditioning latents harder to generate for larger datasets?

The latents are generated from the wav files in the subdirectory for that voice, the more you have in there the longer it will take to generate. You can adjust the Sample Batch Size in…

Fresh12 opened issue mrq/ai-voice-cloning#267 2023-06-15 00:06:26 +00:00
Are conditioning latents harder to generate for larger datasets?
Fresh12 commented on issue mrq/ai-voice-cloning#253 2023-06-14 20:34:48 +00:00
Results, Retrospectives, and Recommendations

Honestly I wing it using the recommended settings more or less and get good results. I've found no considerable difference between 1000 steps and 2500, seems to be an efficient training method.…

Fresh12 commented on issue mrq/ai-voice-cloning#253 2023-06-08 10:14:02 +00:00
Results, Retrospectives, and Recommendations

Key takeaways: Prep your dataset as much as possible; better to train on less but the best than more of the worse; use your training samples to target the rough speed and…

Fresh12 closed issue mrq/ai-voice-cloning#244 2023-06-05 18:42:58 +00:00
Step by step data prep and training/finetuning guide?
Fresh12 commented on issue mrq/ai-voice-cloning#244 2023-06-05 18:42:58 +00:00
Step by step data prep and training/finetuning guide?

A large chunk of memory is already being used or set aside, or what, I'm not sure.

Enable Do Not Load TTS On Startup and restart.

A large chunk of memory is already being…

Fresh12 commented on issue mrq/ai-voice-cloning#244 2023-05-28 09:17:13 +00:00
Step by step data prep and training/finetuning guide?

Hmm. The only thing that sticks out to me is that the audio is mono. I don't see any reason why that should be a problem but all my samples are stereo. Can you try and reproduce the fault with…
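
For comparing mono and stereo clips without ffprobe, the channel count can be read with Python's standard `wave` module. The sketch below builds two silent test clips in memory purely for illustration; the helper names and the 22050 Hz rate are assumptions, and it only handles uncompressed PCM WAV files:

```python
import io
import wave

def channel_count(wav_file) -> int:
    """Return the number of channels in a PCM WAV file (path or file object)."""
    with wave.open(wav_file, "rb") as w:
        return w.getnchannels()

def make_silent_wav(channels: int, frames: int = 100, rate: int = 22050) -> io.BytesIO:
    """Build a short silent 16-bit WAV in memory, for testing only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * channels * frames)
    buf.seek(0)
    return buf

print("mono clip channels:", channel_count(make_silent_wav(1)))
print("stereo clip channels:", channel_count(make_silent_wav(2)))
```

Passing a real file path such as a dataset clip to `channel_count` works the same way.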

Fresh12 commented on issue mrq/ai-voice-cloning#249 2023-05-25 08:54:32 +00:00
Out of memory errors and using whisperX

There's probably some way to tinker with your current install to get it working but I think the most efficient thing to do is wipe it (save your datasets, of course), reclone the repo, and run…

Fresh12 commented on issue mrq/ai-voice-cloning#244 2023-05-25 08:48:51 +00:00
Step by step data prep and training/finetuning guide?

Can you run ffprobe on the clips and post the output?

ffprobe third_00000.wav
ffprobe version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2007-2021 the FFmpeg developers
  built with gcc 11…
Fresh12 commented on issue mrq/ai-voice-cloning#249 2023-05-25 02:35:02 +00:00
Out of memory errors and using whisperX

I'm still searching for how to include this authorization token the error asks for, assuming it actually is the problem. There doesn't seem to be an obvious way.

It's [in the Wiki](https:/…

Fresh12 commented on issue mrq/ai-voice-cloning#249 2023-05-24 10:50:03 +00:00
Out of memory errors and using whisperX

After activating the venv does pip list installed show whisperx?

Okay we're getting somewhere. Whisperx wasn't on the list despite being able to be called under the environment. I…
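
A command that runs even though the package is missing from the venv's list usually means it is being picked up from a different Python install. A small check like the following, run with the venv's own interpreter, shows which interpreter is active and whether the package metadata is visible to it. The helper name is illustrative:

```python
import sys
from importlib import metadata

def installed_version(package: str):
    """Return the installed version of a package, or None if it is not found."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Shows which Python is actually running, and whether it can see whisperx.
print("interpreter:", sys.executable)
print("whisperx:", installed_version("whisperx"))
```

If the interpreter path points outside the venv, or the version prints as `None`, the package was installed somewhere else.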

Fresh12 commented on issue mrq/ai-voice-cloning#244 2023-05-24 01:21:04 +00:00
Step by step data prep and training/finetuning guide?

I adjusted the batch size down to 4 as suggested when validating the training file and even down to 2 with the gradient accumulation size at 1 but it still gives the same error. I shrank down…