'list index out of range' error #159
Reference: mrq/ai-voice-cloning#159
I was trying to use a custom LR schedule based on another training program (figured it was worth experimenting with): [25. 50, 100, 200]
But it throws an error and won't train:
[Training] [2023-03-20T06:32:55.052116] Using BitsAndBytes optimizations
[Training] [2023-03-20T06:32:55.052116] Disabled distributed training.
[Training] [2023-03-20T06:32:55.052116] Path already exists. Rename it to [./training\patrick\finetune_archived_230320-063221]
[Training] [2023-03-20T06:32:55.052116] Loading from ./models/tortoise/dvae.pth
[Training] [2023-03-20T06:32:55.052116] Traceback (most recent call last):
[Training] [2023-03-20T06:32:55.052116] File "C:\Users\nirin\Desktop\AIVoice\ai-voice-cloning\src\train.py", line 68, in
[Training] [2023-03-20T06:32:55.052116] train(config_path, args.launcher)
[Training] [2023-03-20T06:32:55.052116] File "C:\Users\nirin\Desktop\AIVoice\ai-voice-cloning\src\train.py", line 35, in train
[Training] [2023-03-20T06:32:55.052116] trainer.do_training()
[Training] [2023-03-20T06:32:55.052116] File "C:\Users\nirin\Desktop\AIVoice\ai-voice-cloning./modules/dlas\codes\train.py", line 374, in do_training
[Training] [2023-03-20T06:32:55.053116] metric = self.do_step(train_data)
[Training] [2023-03-20T06:32:55.053116] File "C:\Users\nirin\Desktop\AIVoice\ai-voice-cloning./modules/dlas\codes\train.py", line 242, in do_step
[Training] [2023-03-20T06:32:55.053116] gradient_norms_dict = self.model.optimize_parameters(self.current_step, return_grad_norms=will_log)
[Training] [2023-03-20T06:32:55.053116] File "C:\Users\nirin\Desktop\AIVoice\ai-voice-cloning./modules/dlas/codes\trainer\ExtensibleTrainer.py", line 303, in optimize_parameters
[Training] [2023-03-20T06:32:55.053116] ns = step.do_forward_backward(state, m, step_num, train=train_step, no_ddp_sync=(m+1 < self.batch_factor))
[Training] [2023-03-20T06:32:55.053116] File "C:\Users\nirin\Desktop\AIVoice\ai-voice-cloning./modules/dlas/codes\trainer\steps.py", line 220, in do_forward_backward
[Training] [2023-03-20T06:32:55.053116] local_state[k] = v[grad_accum_step]
[Training] [2023-03-20T06:32:55.053116] IndexError: list index out of range
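For context, the failing line in the traceback (`local_state[k] = v[grad_accum_step]`) indexes into a list of per-accumulation-step chunks. The following is a minimal, hypothetical sketch (not the DLAS source; the tensor sizes are made up) of how this kind of IndexError appears when the batch splits into fewer chunks than there are accumulation steps:

```python
import torch

# Hypothetical values for illustration only.
batch_size = 6        # items actually in the batch
grad_accum_steps = 4  # gradient accumulation factor the trainer expects

batch = {"mel": torch.randn(batch_size, 80, 100)}

# Split every batched tensor into one chunk per accumulation step.
# torch.chunk(6 items, 4) returns only 3 chunks of size 2, not 4.
chunks = {k: list(torch.chunk(v, grad_accum_steps, dim=0)) for k, v in batch.items()}

for grad_accum_step in range(grad_accum_steps):
    local_state = {}
    for k, v in chunks.items():
        # At grad_accum_step == 3 there is no fourth chunk, so this raises
        # "IndexError: list index out of range", matching the traceback above.
        local_state[k] = v[grad_accum_step]
```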
Issue title changed from "custom LR schedule causing 'list index out of range' error" to "'list index out of range' error".

Did a forced reinstall but no change. Still won't work with a custom-entered schedule for some reason. I know it used to work. Not sure what's going on.
Your gradient accumulation size is either too large or not divisible enough by your batch size.
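In other words, each batch is sliced into batch_size / gradient_accumulation_size microbatches, so the two values need to divide evenly. A quick, hypothetical sanity check (not part of the repo; the names and numbers are placeholders) might look like:

```python
def check_accumulation(batch_size: int, grad_accum_size: int) -> None:
    """Hypothetical helper: validate the batch/accumulation values before training."""
    if grad_accum_size > batch_size:
        raise ValueError(
            f"gradient accumulation size {grad_accum_size} exceeds batch size {batch_size}"
        )
    if batch_size % grad_accum_size != 0:
        raise ValueError(
            f"batch size {batch_size} is not divisible by "
            f"gradient accumulation size {grad_accum_size}"
        )

check_accumulation(batch_size=128, grad_accum_size=16)   # fine: 8 even microbatches
# check_accumulation(batch_size=100, grad_accum_size=24) # would raise: 100 % 24 != 0
```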
Really? I'm using the same gradient accumulation size for both attempts, and the default schedule trains fine. It only fails to train when I use a custom schedule. Unless it's somehow borderline and the different scheduling uses more RAM or something?
Seems it's working now, so that must have been it! Thanks.