Weird groaning at the end of every output file... #226
Reference: mrq/ai-voice-cloning#226
http://sndup.net/y8w3
So I've had this issue since the early days of TorToiSe/mrq and have never figured out what causes it. It happens a lot. I trained the dataset in this example with the recommended settings from the wiki (50 epochs, [2,4,9,18 etc] 0.001) and it does it, and it does it if I train for 500 epochs, and it does it if I train for 500 with different settings...
I've had other voices that I (somehow) managed to get it to (mostly) not produce weird extended groaning endings, but no idea how as I don't do anything different.
The weird part is it's not garbling the input. It is saying the full input. Then it pauses, and then it screams out something from a Japanese horror movie.
I don't know if I'm training for too long, or not long enough, or what. It's frustrating.
I have a similar issue:
every generated clip is the same length, 23 seconds: it reads the input alright and then tries to repeat the phrase with some garbled sounds.
Hey @nirurin I am having the same problem. Were you able to find a solution?
mmm, I'm pretty sure TorToiSe would raise a warning message about the input/output being "too long" due to there not being any stop tokens generated, so it won't know when to "stop" after some threshold.
If I remember right, you might be able to play around with the Length penalty and Repetition penalty sliders in the advanced generation options to curb it, but typically that would mean you're exceeding how much the model can output. Whether finetuning will modify this window, I'm not exactly sure.
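To illustrate the stop-token point above, here is a toy sketch of an autoregressive loop. It is not TorToiSe's actual code: `STOP_TOKEN`, `toy_generate`, and the two fake "models" are made-up names for illustration only. The idea it shows is that if no stop token ever comes out, generation only ends at the length cap, so every clip ends up the same length and everything past the real sentence is junk.

```python
STOP_TOKEN = 0  # hypothetical stop-token id, for illustration only


def toy_generate(next_token, max_tokens):
    """Collect tokens until STOP_TOKEN appears or max_tokens is reached.

    Returns (tokens, stopped_cleanly).
    """
    tokens = []
    for _ in range(max_tokens):
        tok = next_token(tokens)
        if tok == STOP_TOKEN:
            return tokens, True   # the model decided it was done
        tokens.append(tok)
    return tokens, False          # hit the cap; the tail is probably garbage


def well_behaved(tokens):
    # Fake model: emits a stop token once 5 tokens have been produced.
    return STOP_TOKEN if len(tokens) >= 5 else 1


def never_stops(tokens):
    # Fake model: never emits a stop token.
    return 1


clean, clean_stopped = toy_generate(well_behaved, max_tokens=20)
runaway, runaway_stopped = toy_generate(never_stops, max_tokens=20)

print(len(clean), clean_stopped)      # 5 True
print(len(runaway), runaway_stopped)  # 20 False
```

In this toy version the runaway case always fills the whole 20-token budget, which would line up with every clip coming out at the same fixed length. Whether that is really what is happening in your runs is an assumption until you see the "too long" warning in the console.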