Is Cos.Annealing ever a better option? #151
I often find multistep either learns too quickly (the graph drops fast) or only gets as far as about 1.2 before plateauing (and probably wouldn't get down to ~0.5 on the graph until something like epoch 20,000).
Though I'm still not sure what a good graph should look like. It feels like a gradual curve down to 0.5ish over 1000-2000 epochs should give strong results, but in practice either the curve drops within about 100 epochs, or it flattens out and never drops, lol.
Cos. Annealing tests have let me ... kind of make this curve, but the results so far haven't been stellar.
Maybe 50 epochs is enough?..
LR = 0.00005
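For comparing the two schedules side by side, here's a minimal sketch in plain PyTorch. The optimizer type, milestones, gamma, and T_max below are illustrative guesses, not the values this repo actually uses; only the 5e-5 base LR comes from above.

```python
# Minimal sketch comparing MultiStepLR and CosineAnnealingLR decay over epochs.
# Milestones, gamma, and T_max are assumptions for illustration only.
import torch

param = torch.nn.Parameter(torch.zeros(1))
base_lr = 5e-5  # the LR mentioned above

opt_multi = torch.optim.SGD([param], lr=base_lr)
opt_cos = torch.optim.SGD([param], lr=base_lr)

# MultiStepLR: multiplies the LR by gamma at each milestone epoch (step-wise drops).
multi = torch.optim.lr_scheduler.MultiStepLR(opt_multi, milestones=[50, 100, 150], gamma=0.5)
# CosineAnnealingLR: smoothly decays the LR toward eta_min over T_max epochs.
cos = torch.optim.lr_scheduler.CosineAnnealingLR(opt_cos, T_max=200, eta_min=base_lr * 0.01)

for epoch in range(200):
    # ... one epoch of training would happen here ...
    multi.step()
    cos.step()
    if epoch % 25 == 0:
        print(epoch, multi.get_last_lr()[0], cos.get_last_lr()[0])
```

The main behavioural difference: MultiStepLR holds the LR constant between milestones and then drops it in discrete steps, while cosine annealing keeps shrinking it smoothly all the way to T_max, which is roughly the "gradual curve down" shape described above.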
Hard to say without knowing your batch size and how many steps per epoch you have.
In this case it's a small dataset, 100 files, so a batch size of 100 and 1 step per epoch.
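For reference, a rough sketch of the steps-per-epoch arithmetic with those numbers (dataset size and batch size are taken from the comment above):

```python
# Rough arithmetic only; the dataset size and batch size come from the comment above.
import math

dataset_size = 100
batch_size = 100
steps_per_epoch = math.ceil(dataset_size / batch_size)  # 1 optimizer step per epoch
epochs = 50
total_steps = steps_per_epoch * epochs                   # only 50 optimizer steps in total
print(steps_per_epoch, total_steps)
```

With only 1 step per epoch, 50 epochs works out to just 50 optimizer updates in total, which is why the epoch count and schedule choice matter so much here.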
My gut feeling is that you'd want at least 100-200 epochs, but if your training set is close to a "standard" US English accent then you should be able to get away with less. How's the quality after 50?
It just feels like it's too fast, and so shouldn't be any good, haha. No actual evidence to back that up!
This is my current latest graph, which is more like what (in my warped mind) I would expect it to look like.
But of course I'm making this up as I go. I'm not even totally sure the yellow line is a good 'goal' for the green line to aim for. I have had some examples that went well below the yellow line and ended up sounding terrible, so it's really just an assumption.