Graphs follow the wrong number (steps instead of epochs) #234

Closed
opened 2023-05-11 03:06:42 +00:00 by nirurin · 3 comments

As you can see in the file, I'm training for 500 epochs, but each epoch is about 18 iterations, so it's 9000 steps in total.

The graph is showing the current number of steps (133), but the graph axis stops at 500 (the number of epochs) instead of 9000.

Meaning that if the training ends when the graph reaches the end, it would only have done 500 steps instead of 500 epochs (?).

The backend seems to be dropping the schedule at the correct times (by epochs), so it looks like only the graphs are wrong. But I think when I've done this before, training stopped at the '500 steps' mark, even though that's only 500 out of 9000. So something thinks the steps are epochs.
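For reference, the arithmetic behind those numbers can be sketched as below. This is only an illustration: the function names are made up for this example and are not taken from the repo, and 18 iterations per epoch is the approximate figure quoted above.

```python
def total_steps(epochs: int, iterations_per_epoch: int) -> int:
    """Total optimizer steps for a run of `epochs` epochs."""
    return epochs * iterations_per_epoch


def step_to_epoch(step: int, iterations_per_epoch: int) -> float:
    """Fractional epoch that a given step falls in."""
    return step / iterations_per_epoch


# Numbers from this report: 500 epochs at roughly 18 iterations each.
print(total_steps(500, 18))               # 9000
print(round(step_to_epoch(133, 18), 2))   # 7.39
```

So step 133 is a little over 7 epochs in, and an axis that ends at 500 while counting steps covers under 6% of the run.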

Owner

Oops, somehow I had that change make its way upstream. For my VALL-E training, I needed it to show by iteration step rather than epoch count, and my local change on my training machine somehow made its way to the repo.

Should be fixed in commit 74bd0f0cdce350e9ca30b937fb5fc7b3d17242fb, but you can also set your X Max to 9000 and it should re-scale. I think a default of 0 will check your training settings and set itself to 500, thinking it's 500 epochs and not 9000 steps.
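A rough sketch of the behaviour described above, under the assumption that an X Max of 0 means "derive the limit from the training settings". `resolve_x_max` and its `axis_unit` parameter are hypothetical names for illustration only; this is not the code from the repo or from the commit.

```python
def resolve_x_max(x_max: int, epochs: int, iterations_per_epoch: int,
                  axis_unit: str = "steps") -> int:
    """Pick the upper limit of the graph's X axis.

    A positive x_max is treated as an explicit user override. Zero means
    "auto", in which case the limit has to be expressed in the same unit
    the graph plots on its X axis.
    """
    if x_max > 0:
        return x_max
    if axis_unit == "epochs":
        return epochs
    # Plotting by step: the run ends at epochs * iterations_per_epoch.
    return epochs * iterations_per_epoch


print(resolve_x_max(0, 500, 18))             # auto, plotting by step
print(resolve_x_max(0, 500, 18, "epochs"))   # auto, plotting by epoch
print(resolve_x_max(9000, 500, 18))          # manual override
```

The mismatch in this issue corresponds to plotting by step while the auto limit falls through to the epoch count.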

Author

No problem, I did think it seemed weird!

How's the VALL-E experience going? Is it proving to be better?

Edit: I updated and now I'm getting:

AttributeError: module 'numpy' has no attribute 'complex'.
np.complex was a deprecated alias for the builtin complex. To avoid this error in existing code, use complex by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.complex128 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'complex_'?
Press any key to continue . . .

Owner

You'll need to do something like (after activating the venv):

pip3 install -U numpy==1.23.5

I could have sworn I had the version frozen for this repo, but I can't for the life of me find which repo ended up having it.
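To check whether a given environment is affected before or after running that command, something like the following can help. It relies on the fact that the `np.complex` alias was deprecated in NumPy 1.20 and removed in NumPy 1.24; the helper name `has_np_complex_alias` is made up for this sketch.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def has_np_complex_alias(numpy_version: str) -> bool:
    """True if this NumPy release still ships the np.complex alias."""
    match = re.match(r"(\d+)\.(\d+)", numpy_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {numpy_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # The alias was removed in NumPy 1.24.
    return (major, minor) < (1, 24)


try:
    installed = version("numpy")
except PackageNotFoundError:
    print("numpy is not installed in this environment")
else:
    status = "still has" if has_np_complex_alias(installed) else "no longer has"
    print(f"numpy {installed} {status} np.complex")
```

Run it inside the activated venv. Pinning to 1.23.5 keeps the old alias available; the longer-term fix, as the error message says, is for the calling code to use `complex` or `np.complex128` instead.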

Reference: mrq/ai-voice-cloning#234