training has bunch of errors and warnings not sure new system #304

Open
opened 2023-07-10 08:27:26 +00:00 by aphixe2000 · 2 comments

I tried this on my laptop; since it has hardly anything installed, I figured it would be a good test case. It runs Windows 11. This is the error I see when I hit the Train button.

image

Owner

All of the warnings are fine; they didn't cause me any issues when I last used it months ago.

Your main error is that, because of the way DLAS is written, your gradient accumulation size is too large for your batch size: https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Issues#local_state-k-v-grad_accum_step-indexerror-list-index-out-of-range

Your Gradiant Accumulation Size is too large for your given Batch Size. Please reduce it to, at most, half your batch size, or use the validation button to correct this.
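
To illustrate the failure mode, here is a minimal sketch in plain Python. It assumes the trainer splits each batch into one chunk per accumulation step and then indexes the chunks by step number, and that the split can yield fewer chunks than requested (PyTorch's `torch.chunk` behaves this way). The function names `split_into_chunks`, `run_accumulation_steps`, and `clamp_grad_accum` are hypothetical and are not DLAS code; the clamp only mirrors the "at most half your batch size" advice above, not the actual logic behind the validation button.

```python
import math


def split_into_chunks(batch, num_chunks):
    """Split a non-empty batch into chunks of ceil(len / num_chunks) items.

    Assumption: like torch.chunk, this can return FEWER chunks than requested.
    """
    size = math.ceil(len(batch) / num_chunks)
    return [batch[i:i + size] for i in range(0, len(batch), size)]


def run_accumulation_steps(batch, grad_accum):
    """Index one chunk per accumulation step, as the traceback line suggests."""
    chunks = split_into_chunks(batch, grad_accum)
    # Raises IndexError when fewer than grad_accum chunks were produced.
    return [chunks[step] for step in range(grad_accum)]


def clamp_grad_accum(batch_size, grad_accum):
    """Hypothetical helper: cap accumulation at half the batch size (min 1)."""
    return max(1, min(grad_accum, batch_size // 2))


batch = list(range(6))  # stand-in for a batch of 6 samples

try:
    run_accumulation_steps(batch, 4)  # 6 samples only yield 3 chunks of 2
except IndexError as err:
    print("IndexError:", err)

safe = clamp_grad_accum(len(batch), 4)
print(safe, run_accumulation_steps(batch, safe))
```

With a batch size of 6 and a gradient accumulation size of 4, the split only produces three chunks, so the fourth step has nothing to index. Lowering the accumulation size to 3 makes the chunk count match the step count in this example.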


l#how-to-adjust-learning-rate
[Training] [2023-08-14T18:08:43.970009] warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "
[Training] [2023-08-14T18:09:16.715182] Disabled distributed training.
[Training] [2023-08-14T18:09:16.715182] Loading from ./models/tortoise/dvae.pth
[Training] [2023-08-14T18:09:16.716181] Traceback (most recent call last):
[Training] [2023-08-14T18:09:16.716181] File "E:\KL2.0\CODEZ\tts-web\ai-voice-cloning\src\train.py", line 64, in <module>
[Training] [2023-08-14T18:09:16.716181] train(config_path, args.launcher)
[Training] [2023-08-14T18:09:16.716181] File "E:\KL2.0\CODEZ\tts-web\ai-voice-cloning\src\train.py", line 31, in train
[Training] [2023-08-14T18:09:16.716181] trainer.do_training()
[Training] [2023-08-14T18:09:16.716181] File "e:\kl2.0\codez\tts-web\ai-voice-cloning\modules\dlas\dlas\train.py", line 408, in do_training
[Training] [2023-08-14T18:09:16.716181] metric = self.do_step(train_data)
[Training] [2023-08-14T18:09:16.716181] File "e:\kl2.0\codez\tts-web\ai-voice-cloning\modules\dlas\dlas\train.py", line 271, in do_step
[Training] [2023-08-14T18:09:16.717182] gradient_norms_dict = self.model.optimize_parameters(
[Training] [2023-08-14T18:09:16.717182] File "e:\kl2.0\codez\tts-web\ai-voice-cloning\modules\dlas\dlas\trainer\ExtensibleTrainer.py", line 321, in optimize_parameters
[Training] [2023-08-14T18:09:16.717182] ns = step.do_forward_backward(
[Training] [2023-08-14T18:09:16.717182] File "e:\kl2.0\codez\tts-web\ai-voice-cloning\modules\dlas\dlas\trainer\steps.py", line 242, in do_forward_backward
[Training] [2023-08-14T18:09:16.717182] local_state[k] = v[grad_accum_step]
[Training] [2023-08-14T18:09:16.717182] IndexError: list index out of range

I am also getting the same issue! How do I solve this?

Reference: mrq/ai-voice-cloning#304