Is Cos.Annealing ever a better option? #151

Closed
opened 2023-03-17 22:04:22 +00:00 by nirurin · 5 comments

I often find MultiStep either learns too quickly (the graph drops fast) or only gets as far as about 1.2 before plateauing (and probably wouldn't get down to ~0.5 on the graph until something like epoch 20,000).

Though I'm still not sure what a good graph should look like. It feels like a gradual curve down to 0.5ish over 1000-2000 epochs should give strong results, but either the curve drops within about 100 epochs, or it flattens out and never drops lol.

Cos. Annealing tests have let me ... kind of make this curve, but the results so far haven't been stellar.

Maybe 50 epochs is enough?..
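(For anyone skimming this later: a minimal sketch of how the two schedules differ, assuming a plain PyTorch setup. The optimizer, milestones, and `T_max` below are placeholder values for illustration, not this repo's actual trainer config.)

```python
import torch
from torch.optim.lr_scheduler import MultiStepLR, CosineAnnealingLR

# Toy model/optimizer purely to drive the schedulers; values are made up.
model = torch.nn.Linear(10, 10)
base_lr = 5e-5

# MultiStepLR: LR stays flat, then drops by `gamma` at each milestone epoch.
opt_ms = torch.optim.AdamW(model.parameters(), lr=base_lr)
multistep = MultiStepLR(opt_ms, milestones=[50, 200, 500], gamma=0.5)

# CosineAnnealingLR: LR decays smoothly along a cosine curve over T_max epochs.
opt_cos = torch.optim.AdamW(model.parameters(), lr=base_lr)
cosine = CosineAnnealingLR(opt_cos, T_max=1000, eta_min=base_lr * 0.01)

for epoch in range(1000):
    # ...training step(s) for the epoch would go here...
    multistep.step()
    cosine.step()
    if epoch % 200 == 0:
        print(epoch, multistep.get_last_lr(), cosine.get_last_lr())
```

The practical difference is that MultiStep gives a few abrupt LR cliffs (which tend to show up as sudden drops or long plateaus on the loss graph), while cosine annealing gives the gradual curve described above.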

Author

![image](/attachments/0cc224d5-069b-47b3-986e-ee32dff20b19)

LR = 0.00005

> Maybe 50 epochs is enough?..

Hard to say without knowing your batch size and how many steps per epoch you have.

Author

> Maybe 50 epochs is enough?..
>
> Hard to say without knowing your batch size and how many steps per epoch you have.

In this case it's a small dataset, 100 files, so batches of 100, which is 1 step per epoch.
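(Side note, a rough way to think about it, not this repo's exact trainer logic: steps per epoch is just dataset size over batch size, so the total number of optimizer steps the scheduler actually sees is epochs × steps per epoch.)

```python
import math

# Numbers from this thread: 100 files, batch size 100 -> 1 step per epoch.
dataset_size = 100
batch_size = 100
steps_per_epoch = math.ceil(dataset_size / batch_size)  # = 1

for epochs in (50, 200, 1000):
    print(f"{epochs} epochs -> {epochs * steps_per_epoch} optimizer steps")
```

With one step per epoch, 50 epochs is only 50 weight updates, which is part of why raw epoch counts are hard to compare across setups.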


My gut feeling is that you'd want at least 100-200 epochs, but if your training set is close to a "standard" US English accent then you should be able to get away with less. How's the quality after 50?

Author

It just feels like it's too fast, so it shouldn't be any good, haha. No actual evidence to back that up!

This is my latest graph, which is more like what (in my warped mind) I would expect it to look like.

But of course I'm making this up as I go. I'm not even totally sure the yellow line is a good 'goal' for the green line to aim for. I've had some runs that dropped well below the yellow line and ended up sounding terrible, so I'm really just assuming.

Reference: mrq/ai-voice-cloning#151