psammites
  • Joined on 2023-03-11
psammites commented on issue mrq/ai-voice-cloning#160 2023-03-22 07:30:54 +00:00
Can't train a single good model

I tried redoing it with commit 0231550287 from about 2 weeks ago, and the output was much better; close to the dataset voice. The training ran much faster too.

Did redoing it include…

psammites commented on issue mrq/ai-voice-cloning#160 2023-03-21 17:58:06 +00:00
Can't train a single good model

Weird, that sounds like just about ideal. Are there any complications like reverb or background music?

psammites closed issue mrq/ai-voice-cloning#164 2023-03-21 16:47:49 +00:00
After updating to the latest commit training fails with "No module named 'dlas'"
psammites opened issue mrq/ai-voice-cloning#164 2023-03-21 16:45:59 +00:00
After updating to the latest commit training fails with "No module named 'dlas'"
psammites commented on issue mrq/ai-voice-cloning#160 2023-03-21 15:55:14 +00:00
Can't train a single good model

How big is your dataset, and how different is it from "standard" English speech?

psammites commented on issue mrq/ai-voice-cloning#152 2023-03-21 15:52:46 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)

I'll be comfortable with renting out a GPU to do bigger training on (or cave and buy a 4090, as the prospect of renting for pennies sounds worse than just splurging $1500 on another GPU).

[The…

psammites commented on issue mrq/ai-voice-cloning#163 2023-03-21 11:33:47 +00:00
Training randomly crashes after 358 epochs

How much free disk space is there?
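A quick way to answer this from Python is sketched below. It assumes the question is prompted by the possibility that saved checkpoints filled the disk; the path `"."` and the helper name `free_disk_gib` are placeholders for illustration, so point it at whichever directory the training run actually writes to.

```python
import shutil


def free_disk_gib(path: str = ".") -> float:
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: (total, used, free), in bytes
    return usage.free / 2**30


if __name__ == "__main__":
    # "." is a placeholder; substitute the directory where checkpoints are saved.
    print(f"{free_disk_gib('.'):.1f} GiB free")
```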

psammites commented on issue mrq/ai-voice-cloning#160 2023-03-21 04:20:32 +00:00
Can't train a single good model

Have you tried training a model with a single voice, for comparison?

psammites commented on issue mrq/ai-voice-cloning#160 2023-03-20 21:52:11 +00:00
Can't train a single good model

How closely does the transcription in train.txt match the content of the audio clips?

psammites closed issue mrq/ai-voice-cloning#158 2023-03-20 16:27:13 +00:00
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
psammites opened issue mrq/ai-voice-cloning#158 2023-03-20 04:07:02 +00:00
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
psammites commented on issue mrq/ai-voice-cloning#147 2023-03-19 01:50:32 +00:00
Discussion about Fine Tuning on a different language.

the default pathway that ASCII-fies it

psammites commented on issue mrq/ai-voice-cloning#154 2023-03-19 01:48:16 +00:00
Random voice not working with CVVP?

Yes, what I meant was the error message is incorrect.

psammites commented on issue mrq/ai-voice-cloning#147 2023-03-19 00:27:57 +00:00
Discussion about Fine Tuning on a different language.

Trained a new model with the Japanese tokenizer, and after ~55 epochs (~825000 samples processed), I have a better Japanese model:

Was it with VALL-E or DLAS?


What effect does…

psammites opened issue mrq/ai-voice-cloning#154 2023-03-18 20:09:51 +00:00
Random voice not working with CVVP?
psammites commented on issue mrq/ai-voice-cloning#153 2023-03-18 15:16:33 +00:00
Using more CUDA

Are you running under Windows or Linux, and how many GPUs?

psammites commented on issue mrq/ai-voice-cloning#152 2023-03-18 06:35:26 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)
  • I'm starting to hit the limitations of finetuning the base TorToiSe model.
  • for non-English, a replaced tokenizer vocab is practically required for accuracy, and I have had terrible luck…
psammites commented on issue mrq/ai-voice-cloning#152 2023-03-18 05:06:10 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)

no BitsAndBytes to save my hide, so it's quite the VRAM hog.

How bad is it? Is it still something that could run on HEDT graphics cards, or should I be pricing out refurbished P40s on eBay?

psammites commented on issue mrq/ai-voice-cloning#151 2023-03-17 23:37:50 +00:00
Is Cos.Annealing ever a better option?

My gut feeling is that you'd want at least 100-200 epochs, but if your training set is close to a "standard" US English accent then you should be able to get away with fewer. How's the quality…

psammites commented on issue mrq/ai-voice-cloning#151 2023-03-17 23:11:43 +00:00
Is Cos.Annealing ever a better option?

Maybe 50 epochs is enough?..

Hard to say without knowing your batch size and how many steps per epoch you have.
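To make the dependency concrete, here is a minimal sketch of how dataset size, batch size, and epoch count translate into optimizer steps. The numbers are made up for illustration, and it assumes the final partial batch of each epoch still counts as a step; the trainer discussed in the thread may handle that differently.

```python
import math


def steps_per_epoch(dataset_size: int, batch_size: int) -> int:
    """Number of batches needed to see every sample once.

    Assumption: a trailing partial batch is kept rather than dropped.
    """
    return math.ceil(dataset_size / batch_size)


def total_steps(dataset_size: int, batch_size: int, epochs: int) -> int:
    """Total optimizer steps over the whole run."""
    return steps_per_epoch(dataset_size, batch_size) * epochs


if __name__ == "__main__":
    # Hypothetical figures: 1000 clips, batch size 128, 50 epochs.
    print(steps_per_epoch(1000, 128))  # 8
    print(total_steps(1000, 128, 50))  # 400
```

With figures like these, 50 epochs comes to only a few hundred steps, which is why the epoch count alone says little about how much training has actually happened.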