Weird groaning at the end of every output file... #226

Closed
opened 2023-05-02 09:05:35 +00:00 by nirurin · 3 comments

http://sndup.net/y8w3

So I've had this issue since the early days of tortoise/mrq and have never figured out what causes it. It happens a lot. I've trained the dataset in this example with the recommended settings from the wiki (50 epochs, [2,4,9,18 etc] 0.001) and it does it, and it does it if I train for 500 epochs, and it does it if I train for 500 with different settings...

I've had other voices that I (somehow) managed to get it to (mostly) not produce weird extended groaning endings, but no idea how as I don't do anything different.

The weird part is it's not garbling the input. It is saying the full input. Then it pauses, and then it screams out something from a Japanese horror movie.

I don't know if I'm training for too long, or not long enough, or what. It's frustrating.

I have a similar issue:
every generated clip is the same length of 23 seconds, where it reads the input alright and then tries to repeat the phrase with some garbled sounds.

Hey @nirurin I am having the same problem. Were you able to find a solution?

Owner

mmm, I'm pretty sure TorToiSe would raise a warning message about the input/output being "too long" due to there not being any stop tokens generated, so it won't know when to "stop" after some threshold.

If I remember right, you might be able to play around with the `Length penalty` and `Repetition penalty` sliders in the advanced generation options to curb it, but typically that would mean you're exceeding how much the model can output. Whether finetuning will modify this window, I'm not exactly sure.
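
To make the stop-token point concrete, here is a toy sketch of an autoregressive decode loop. It is not TorToiSe's actual code: `STOP_TOKEN`, `decode`, `well_behaved`, and `never_stops` are hypothetical names invented for illustration. It only shows the general idea that if the model never emits a stop token, generation runs until a hard cap, and everything after the real sentence is filler.

```python
from typing import Callable, List, Tuple

STOP_TOKEN = -1  # hypothetical stop-token id for this toy example


def decode(next_token: Callable[[List[int]], int], max_tokens: int) -> Tuple[List[int], bool]:
    """Generate tokens until STOP_TOKEN appears or max_tokens is reached.

    Returns (tokens, stopped_cleanly).
    """
    tokens: List[int] = []
    while len(tokens) < max_tokens:
        tok = next_token(tokens)
        if tok == STOP_TOKEN:
            return tokens, True  # the model decided it was done
        tokens.append(tok)
    return tokens, False  # hit the cap without ever seeing a stop token


def well_behaved(tokens: List[int]) -> int:
    # Emits 5 "speech" tokens, then the stop token.
    return STOP_TOKEN if len(tokens) >= 5 else 1


def never_stops(tokens: List[int]) -> int:
    # Never emits the stop token, so only the cap ends generation.
    return 1


clean_tokens, clean_ok = decode(well_behaved, max_tokens=20)
runaway_tokens, runaway_ok = decode(never_stops, max_tokens=20)
print(len(clean_tokens), clean_ok)      # 5 True
print(len(runaway_tokens), runaway_ok)  # 20 False
```

If the real cap is a fixed number of tokens, every runaway clip would come out at the same length, which would be consistent with the identical 23-second clips reported above. That is an inference from the toy model, not something confirmed in this thread.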


Reference: mrq/ai-voice-cloning#226