Resuming Training KeyError: 'iter' #429

Closed
opened 2023-10-25 11:45:44 +07:00 by toast · 1 comments

I am trying to resume a training and getting the following:

[Training] [2023-10-25T12:41:49.117575] Disabled distributed training.
[Training] [2023-10-25T12:41:49.119210] Traceback (most recent call last):
[Training] [2023-10-25T12:41:49.120834]   File "/home/toast/AI/ai-voice-cloning/./src/train.py", line 64, in <module>
[Training] [2023-10-25T12:41:49.122467]     train(config_path, args.launcher)
[Training] [2023-10-25T12:41:49.124102]   File "/home/toast/AI/ai-voice-cloning/./src/train.py", line 30, in train
[Training] [2023-10-25T12:41:49.125734]     trainer.init(config_path, opt, launcher, '')
[Training] [2023-10-25T12:41:49.127349]   File "/home/toast/AI/ai-voice-cloning/modules/dlas/dlas/train.py", line 107, in init
[Training] [2023-10-25T12:41:49.128991]     option.check_resume(opt, resume_state['iter'])
[Training] [2023-10-25T12:41:49.130837] KeyError: 'iter'

In my generate config tab, the Resume State Path is set to ./training/paisley/finetune/models/10514_gpt.pth (which does exist, I did check).

Any ideas?


Yup. I spotted it (eventually).

My Resume State Path was pointing to the model file, not the training state file, so the loaded checkpoint had no 'iter' key.
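For anyone hitting the same error, here is a minimal sketch of a check that tells the two kinds of file apart before launching training. It relies only on what the traceback shows: the trainer reads `resume_state['iter']`, so a usable resume state has to be a mapping with an `iter` key. The helper names `missing_resume_keys` and `load_checkpoint` are hypothetical and not part of ai-voice-cloning or DLAS, and the real trainer may require other keys that this does not check.

```python
from collections.abc import Mapping

# The only key the traceback shows being read from the resume state.
# Assumption: the trainer may need more keys; this list is not exhaustive.
REQUIRED_KEYS = ("iter",)


def missing_resume_keys(checkpoint):
    """Return the required keys that are absent from a loaded checkpoint."""
    if not isinstance(checkpoint, Mapping):
        # Not a dict-like object at all, so every required key is missing.
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if key not in checkpoint]


def load_checkpoint(path):
    """Load a checkpoint onto the CPU (hypothetical helper)."""
    import torch  # imported lazily so the key check works without PyTorch

    return torch.load(path, map_location="cpu")


# Hypothetical usage, with whatever path is in the Resume State Path field:
#   missing = missing_resume_keys(load_checkpoint(resume_state_path))
#   if missing:
#       print(f"Not a resume state, missing keys: {missing}")
```

A model weights file is a dictionary of parameter names, so it would fail this check with `['iter']` reported as missing, which matches the KeyError above.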

toast closed this issue 2023-10-25 21:10:05 +07:00
Reference: mrq/ai-voice-cloning#429