`That is really small; LJSpeech probably has 5-10k clips (for circa 20 hours of audio). The Tortoise model is quite big, so it would be quite prone to overfitting with a small number of…
Yes, exactly. But it's not my tool. It's made by @mrq
I'm so sorry, I read the message too quickly and assumed it was mrq who had answered ^^'. My apologies.
`Try to use a really low learning rate…
Hey! Thank you, yes indeed, I'm just trying to refine an English-speaking model for a game character. Thank you for the finetuned model. If I understand correctly, James Becker took the feminine…
Hey, I have another question on this matter: can you point me to the script in which the conversion to 22050 Hz happens? Which method is used to do so, and finally, does the conversion happen if…
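In case it helps while tracking down where the conversion happens: here is a minimal sketch, assuming the clips are plain PCM WAV files, for checking whether a clip is already at 22050 Hz using only Python's standard `wave` module. The helper names (`write_silent_wav`, `wav_sample_rate`, `needs_resampling`, `demo`) are made up for illustration and are not part of the tool discussed in this thread; it only inspects files and does not reproduce whatever resampling method the tool uses.

```python
import os
import tempfile
import wave

# Target rate mentioned in this thread; adjust if your pipeline differs.
TARGET_RATE = 22050


def write_silent_wav(path, rate, seconds=0.1):
    """Write a short mono 16-bit silent WAV, only to have a file to inspect."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in the WAV header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def needs_resampling(path, target=TARGET_RATE):
    """True if the file's sample rate differs from the target rate."""
    return wav_sample_rate(path) != target


def demo():
    """Create two throwaway clips and report which one would need conversion."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for rate in (22050, 44100):
            path = os.path.join(tmp, f"clip_{rate}.wav")
            write_silent_wav(path, rate)
            results[rate] = needs_resampling(path)
    return results
```

Running `needs_resampling` over a dataset folder before training gives a quick way to see which clips are not yet at the target rate, whatever the tool then does with them.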
Hello, I'm following up on this issue because I'm still struggling with overfitting problems. However, I've realized that they occur not only with large datasets but also with smaller ones (200…
Hey MRQ, thank you for pinpointing this in the DLAS output file, I found it in the log indeed! I'm going to try that.
I'm not sure I understand why you don't think it's worth it though?…
Hey @epp, thank you for your answer!
I don't see the seed parameter in the previous documents, can you help me pinpoint it?
Ah! I see! I'll try to dig into the code to find where it is used and see if there are parameters or thresholds I can tamper with... Thank you!
Noted, thank you :) I'm going to keep this thread open for now, if that's no trouble, in case I run into more questions. Have a great day!
EDIT: Solved
=> Update conda and pip
=> Install Ubuntu's build-essential package (don't know if that really played a part in it, but well...)
=> Re-run setup_cuda.sh AND pip install -r requirements.txt…
What seemed to work for me was resetting all generation parameters to their defaults and progressively retweaking them to reach my previous configuration. We’ll see whether it’s a long-term solution!
Worked like a charm for me, thank you! (I'm on Python 3.10.13 though)
Mmmmh indeed, it's very weird that you get a different pitch but otherwise good results. In my experience, I have sometimes had to use another batch of the same voice (but not…