Weird sound after most lines #325
Reference: mrq/ai-voice-cloning#325
I've trained multiple models that are capable of producing pretty good output.
However, both models make a weird sound, like they're having a stroke, at the end of most lines (or at the beginning of a new line; it's hard to distinguish). It's like a moaning/groaning/robot stroke.
Other than that the output is basically fantastic. What is causing this?
I don't quite recall hearing that at the beginning of lines. I'm very confident it's an issue at the end of lines, and when everything is getting stitched together, it just seems like it's at the beginning of a new line.
From what I remember with this previous issue:
Length Penalty
to try and wrangle it in. I honestly don't recall if that ever made a difference, but I suppose if it didn't work it wouldn't be a feature in base TorToiSe.
It seems to have been a bug that's cleared after rebooting. I was not receiving the errors you described.
One other question, as this project for me is related to producing audiobooks from ebooks:
Currently, this sort of output is typical:
But I'm certain the first few times I generated some content, it didn't have a bunch of loading steps in between each line. Is there a way to fix that? All that loading accounts for 90% of the time it takes to generate. The sentences themselves take like 2-6 seconds each (might even be faster if it could use both 4090s).
mmm, I did some cursory glances and added a bit more aggressive checks to ensure that the model doesn't get reloaded on the TorToiSe side in commit 9afa7154. A git pull origin master on .\ai-voice-cloning\modules\tortoise-tts\ should definitely update it, since I'm not too sure if the update script would update it, as I might have to bump up the submodule commit hash for the AIVC repo.
I'm not too sure what exactly was causing the problem, but I'll just write it off as "I was being very stupid with assuming a path string would be the same at all times" and instead just opted to rely on os.path.samefile to determine if two path strings point to the same file. It seems to work now, at least on whatever's left of my testing environment on Windows for TorToiSe.
What seemed to work for me was resetting all generation parameters to default, and progressively retweaking them to reach my previous configuration. We'll see whether it's a long term solution!
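For anyone curious why comparing path strings can go wrong, here is a minimal sketch of the os.path.samefile idea mentioned in the maintainer's reply above. It is not the actual code from commit 9afa7154; the helper name same_model_file and the file name autoregressive.pth are made up for illustration.

```python
import os
import tempfile


def same_model_file(path_a: str, path_b: str) -> bool:
    """Return True if both paths exist and refer to the same file on disk.

    Hypothetical helper for illustration, not part of TorToiSe or AIVC.
    """
    # os.path.samefile raises if either path is missing, so check first.
    if not (os.path.exists(path_a) and os.path.exists(path_b)):
        return False
    return os.path.samefile(path_a, path_b)


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder file standing in for a model checkpoint.
    model_path = os.path.join(tmp, "autoregressive.pth")
    with open(model_path, "wb") as f:
        f.write(b"placeholder")

    # Same file, spelled with a redundant "." path segment.
    alt_path = os.path.join(tmp, ".", "autoregressive.pth")

    strings_equal = (model_path == alt_path)
    files_equal = same_model_file(model_path, alt_path)
    missing_equal = same_model_file(model_path, os.path.join(tmp, "missing.pth"))
```

The two strings differ even though they name the same file, which is the kind of mismatch that could make a loader think a different model was requested and reload it.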
EDIT: a month later, this technique seems to always work at some point. Also, I've noticed that if you put an empty line between two lines instead of beginning your new line right under the previous one, you'll get such artefacts.
For instance, if you do:
"The first line

The second line"
instead of
"The first line
The second line"
you'll likely get artefacts if you use the default \n delimiter.
EDIT2: This also means that if you have a space or a line break after one of the lines in your prompt, it will generate an artefact. For instance:
"This is a first line"
You must make sure that there's no line break or space after that line.
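If you'd rather not check prompts by hand, something like the sketch below could tidy the text before pasting it in. It only encodes the observation above (empty lines and trailing spaces seem to trigger artefacts) and assumes the "\n" delimiter; clean_prompt is a made-up helper, not something AIVC or TorToiSe provides.

```python
def clean_prompt(text: str, delimiter: str = "\n") -> str:
    """Strip whitespace around each line and drop empty lines.

    Hypothetical helper; assumes lines are separated by `delimiter`.
    """
    # Trim leading/trailing whitespace (including stray "\r") on each line.
    lines = [line.strip() for line in text.split(delimiter)]
    # Keep only non-empty lines, so no blank segment reaches the generator.
    return delimiter.join(line for line in lines if line)


raw = "The first line\n\nThe second line \n"
cleaned = clean_prompt(raw)
```

Here cleaned ends up as the two lines directly under each other, with no blank line between them and no trailing space or line break.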