Getting a strange error when trying to train #371

Open
opened 2023-09-06 04:09:32 +07:00 by dillonbishop · 0 comments

This is the output I'm currently getting when trying to train. I'm not sure what else to share to help solve this. I've already tried reinstalling and using different CUDA versions. I'm on Windows 10 with a GTX 980 Ti:
Spawning process: train.bat ./training/markiplier/train.yaml
[Training] [2023-09-05T20:27:18.993858]
[Training] [2023-09-05T20:27:18.998854] (venv) D:\TorToise\ai-voice-cloning>call .\venv\Scripts\activate.bat
[Training] [2023-09-05T20:27:22.547885] NOTE: Redirects are currently not supported in Windows or MacOs.
[Training] [2023-09-05T20:27:26.897653] 23-09-05 20:27:26.897 - INFO: name: markiplier
[Training] [2023-09-05T20:27:26.902648] model: extensibletrainer
[Training] [2023-09-05T20:27:26.907642] scale: 1
[Training] [2023-09-05T20:27:26.912638] gpu_ids: [0]
[Training] [2023-09-05T20:27:26.919633] start_step: 0
[Training] [2023-09-05T20:27:26.924628] checkpointing_enabled: True
[Training] [2023-09-05T20:27:26.930623] fp16: False
[Training] [2023-09-05T20:27:26.935616] bitsandbytes: True
[Training] [2023-09-05T20:27:26.939614] gpus: 1
[Training] [2023-09-05T20:27:26.943609] datasets:[
[Training] [2023-09-05T20:27:26.948606] train:[
[Training] [2023-09-05T20:27:26.953601] name: training
[Training] [2023-09-05T20:27:26.957596] n_workers: 2
[Training] [2023-09-05T20:27:26.961593] batch_size: 128
[Training] [2023-09-05T20:27:26.966589] mode: paired_voice_audio
[Training] [2023-09-05T20:27:26.971584] path: ./training/markiplier/train.txt
[Training] [2023-09-05T20:27:26.976578] fetcher_mode: ['lj']
[Training] [2023-09-05T20:27:26.981575] phase: train
[Training] [2023-09-05T20:27:26.986569] max_wav_length: 255995
[Training] [2023-09-05T20:27:26.990565] max_text_length: 200
[Training] [2023-09-05T20:27:26.995561] sample_rate: 22050
[Training] [2023-09-05T20:27:26.999558] load_conditioning: True
[Training] [2023-09-05T20:27:27.004604] num_conditioning_candidates: 2
[Training] [2023-09-05T20:27:27.008550] conditioning_length: 44000
[Training] [2023-09-05T20:27:27.016542] use_bpe_tokenizer: True
[Training] [2023-09-05T20:27:27.020537] tokenizer_vocab: ./modules/tortoise-tts/tortoise/data/tokenizer.json
[Training] [2023-09-05T20:27:27.027533] load_aligned_codes: False
[Training] [2023-09-05T20:27:27.035524] data_type: img
[Training] [2023-09-05T20:27:27.041519] ]
[Training] [2023-09-05T20:27:27.047516] val:[
[Training] [2023-09-05T20:27:27.052509] name: validation
[Training] [2023-09-05T20:27:27.058503] n_workers: 2
[Training] [2023-09-05T20:27:27.064496] batch_size: 2
[Training] [2023-09-05T20:27:27.069493] mode: paired_voice_audio
[Training] [2023-09-05T20:27:27.073489] path: ./training/markiplier/validation.txt
[Training] [2023-09-05T20:27:27.078484] fetcher_mode: ['lj']
[Training] [2023-09-05T20:27:27.083480] phase: val
[Training] [2023-09-05T20:27:27.087476] max_wav_length: 255995
[Training] [2023-09-05T20:27:27.092471] max_text_length: 200
[Training] [2023-09-05T20:27:27.097467] sample_rate: 22050
[Training] [2023-09-05T20:27:27.102462] load_conditioning: True
[Training] [2023-09-05T20:27:27.106456] num_conditioning_candidates: 2
[Training] [2023-09-05T20:27:27.111452] conditioning_length: 44000
[Training] [2023-09-05T20:27:27.116447] use_bpe_tokenizer: True
[Training] [2023-09-05T20:27:27.120443] tokenizer_vocab: ./modules/tortoise-tts/tortoise/data/tokenizer.json
[Training] [2023-09-05T20:27:27.125440] load_aligned_codes: False
[Training] [2023-09-05T20:27:27.131434] data_type: img
[Training] [2023-09-05T20:27:27.136430] ]
[Training] [2023-09-05T20:27:27.141423] ]
[Training] [2023-09-05T20:27:27.146421] steps:[
[Training] [2023-09-05T20:27:27.150416] gpt_train:[
[Training] [2023-09-05T20:27:27.155413] training: gpt
[Training] [2023-09-05T20:27:27.161408] loss_log_buffer: 500
[Training] [2023-09-05T20:27:27.166402] optimizer: adamw
[Training] [2023-09-05T20:27:27.170399] optimizer_params:[
[Training] [2023-09-05T20:27:27.175393] lr: 1e-05
[Training] [2023-09-05T20:27:27.180388] weight_decay: 0.01
[Training] [2023-09-05T20:27:27.185384] beta1: 0.9
[Training] [2023-09-05T20:27:27.190381] beta2: 0.96
[Training] [2023-09-05T20:27:27.195374] ]
[Training] [2023-09-05T20:27:27.200371] clip_grad_eps: 4
[Training] [2023-09-05T20:27:27.205365] injectors:[
[Training] [2023-09-05T20:27:27.209360] paired_to_mel:[
[Training] [2023-09-05T20:27:27.214358] type: torch_mel_spectrogram
[Training] [2023-09-05T20:27:27.219353] mel_norm_file: ./modules/tortoise-tts/tortoise/data/mel_norms.pth
[Training] [2023-09-05T20:27:27.223348] in: wav
[Training] [2023-09-05T20:27:27.229343] out: paired_mel
[Training] [2023-09-05T20:27:27.234339] ]
[Training] [2023-09-05T20:27:27.238335] paired_cond_to_mel:[
[Training] [2023-09-05T20:27:27.243329] type: for_each
[Training] [2023-09-05T20:27:27.248326] subtype: torch_mel_spectrogram
[Training] [2023-09-05T20:27:27.253320] mel_norm_file: ./modules/tortoise-tts/tortoise/data/mel_norms.pth
[Training] [2023-09-05T20:27:27.258315] in: conditioning
[Training] [2023-09-05T20:27:27.263310] out: paired_conditioning_mel
[Training] [2023-09-05T20:27:27.269307] ]
[Training] [2023-09-05T20:27:27.274302] to_codes:[
[Training] [2023-09-05T20:27:27.280296] type: discrete_token
[Training] [2023-09-05T20:27:27.284291] in: paired_mel
[Training] [2023-09-05T20:27:27.289286] out: paired_mel_codes
[Training] [2023-09-05T20:27:27.295282] dvae_config: ./models/tortoise/train_diffusion_vocoder_22k_level.yml
[Training] [2023-09-05T20:27:27.300277] ]
[Training] [2023-09-05T20:27:27.304272] paired_fwd_text:[
[Training] [2023-09-05T20:27:27.309268] type: generator
[Training] [2023-09-05T20:27:27.314263] generator: gpt
[Training] [2023-09-05T20:27:27.319258] in: ['paired_conditioning_mel', 'padded_text', 'text_lengths', 'paired_mel_codes', 'wav_lengths']
[Training] [2023-09-05T20:27:27.325253] out: ['loss_text_ce', 'loss_mel_ce', 'logits']
[Training] [2023-09-05T20:27:27.330248] ]
[Training] [2023-09-05T20:27:27.335244] ]
[Training] [2023-09-05T20:27:27.340238] losses:[
[Training] [2023-09-05T20:27:27.345236] text_ce:[
[Training] [2023-09-05T20:27:27.350230] type: direct
[Training] [2023-09-05T20:27:27.354226] weight: 0.01
[Training] [2023-09-05T20:27:27.358222] key: loss_text_ce
[Training] [2023-09-05T20:27:27.362220] ]
[Training] [2023-09-05T20:27:27.367215] mel_ce:[
[Training] [2023-09-05T20:27:27.373208] type: direct
[Training] [2023-09-05T20:27:27.378204] weight: 1
[Training] [2023-09-05T20:27:27.383200] key: loss_mel_ce
[Training] [2023-09-05T20:27:27.388194] ]
[Training] [2023-09-05T20:27:27.392191] ]
[Training] [2023-09-05T20:27:27.397187] ]
[Training] [2023-09-05T20:27:27.402182] ]
[Training] [2023-09-05T20:27:27.406178] networks:[
[Training] [2023-09-05T20:27:27.411173] gpt:[
[Training] [2023-09-05T20:27:27.416167] type: generator
[Training] [2023-09-05T20:27:27.421163] which_model_G: unified_voice2
[Training] [2023-09-05T20:27:27.425160] kwargs:[
[Training] [2023-09-05T20:27:27.431154] layers: 30
[Training] [2023-09-05T20:27:27.435151] model_dim: 1024
[Training] [2023-09-05T20:27:27.440147] heads: 16
[Training] [2023-09-05T20:27:27.444144] max_text_tokens: 402
[Training] [2023-09-05T20:27:27.449137] max_mel_tokens: 604
[Training] [2023-09-05T20:27:27.453133] max_conditioning_inputs: 2
[Training] [2023-09-05T20:27:27.458130] mel_length_compression: 1024
[Training] [2023-09-05T20:27:27.464124] number_text_tokens: 256
[Training] [2023-09-05T20:27:27.469119] number_mel_codes: 8194
[Training] [2023-09-05T20:27:27.474115] start_mel_token: 8192
[Training] [2023-09-05T20:27:27.479111] stop_mel_token: 8193
[Training] [2023-09-05T20:27:27.484104] start_text_token: 255
[Training] [2023-09-05T20:27:27.489101] train_solo_embeddings: False
[Training] [2023-09-05T20:27:27.495094] use_mel_codes_as_input: True
[Training] [2023-09-05T20:27:27.500091] checkpointing: True
[Training] [2023-09-05T20:27:27.505084] tortoise_compat: True
[Training] [2023-09-05T20:27:27.510082] ]
[Training] [2023-09-05T20:27:27.515075] ]
[Training] [2023-09-05T20:27:27.520071] ]
[Training] [2023-09-05T20:27:27.525066] path:[
[Training] [2023-09-05T20:27:27.530062] strict_load: True
[Training] [2023-09-05T20:27:27.535057] pretrain_model_gpt: ./models/tortoise/autoregressive.pth
[Training] [2023-09-05T20:27:27.540052] root: ./
[Training] [2023-09-05T20:27:27.545049] experiments_root: ./training\markiplier\finetune
[Training] [2023-09-05T20:27:27.550044] models: ./training\markiplier\finetune\models
[Training] [2023-09-05T20:27:27.555039] training_state: ./training\markiplier\finetune\training_state
[Training] [2023-09-05T20:27:27.560034] log: ./training\markiplier\finetune
[Training] [2023-09-05T20:27:27.565028] val_images: ./training\markiplier\finetune\val_images
[Training] [2023-09-05T20:27:27.569025] ]
[Training] [2023-09-05T20:27:27.574021] train:[
[Training] [2023-09-05T20:27:27.579016] niter: 500
[Training] [2023-09-05T20:27:27.584012] warmup_iter: -1
[Training] [2023-09-05T20:27:27.588008] mega_batch_factor: 64
[Training] [2023-09-05T20:27:27.593005] val_freq: 25
[Training] [2023-09-05T20:27:27.599996] ema_enabled: False
[Training] [2023-09-05T20:27:27.604992] default_lr_scheme: MultiStepLR
[Training] [2023-09-05T20:27:27.609986] gen_lr_steps: [10, 20, 45, 90, 125, 165, 250]
[Training] [2023-09-05T20:27:27.614983] lr_gamma: 0.5
[Training] [2023-09-05T20:27:27.619979] ]
[Training] [2023-09-05T20:27:27.623975] eval:[
[Training] [2023-09-05T20:27:27.629970] pure: False
[Training] [2023-09-05T20:27:27.633966] output_state: gen
[Training] [2023-09-05T20:27:27.638961] ]
[Training] [2023-09-05T20:27:27.642957] logger:[
[Training] [2023-09-05T20:27:27.647952] save_checkpoint_freq: 25
[Training] [2023-09-05T20:27:27.652946] visuals: ['gen', 'mel']
[Training] [2023-09-05T20:27:27.658942] visual_debug_rate: 25
[Training] [2023-09-05T20:27:27.663936] is_mel_spectrogram: True
[Training] [2023-09-05T20:27:27.668931] ]
[Training] [2023-09-05T20:27:27.672928] is_train: True
[Training] [2023-09-05T20:27:27.676924] dist: False
[Training] [2023-09-05T20:27:27.682919]
[Training] [2023-09-05T20:27:27.686916] 23-09-05 20:27:26.897 - INFO: Random seed: 8574
[Training] [2023-09-05T20:27:29.059635] 23-09-05 20:27:29.059 - INFO: Number of training data elements: 541, iters: 5
[Training] [2023-09-05T20:27:29.065629] 23-09-05 20:27:29.059 - INFO: Total epochs needed: 100 for iters 500
[Training] [2023-09-05T20:27:31.678695] D:\TorToise\ai-voice-cloning\venv\lib\site-packages\transformers\configuration_utils.py:363: UserWarning: Passing gradient_checkpointing to a config initialization is deprecated and will be removed in v5 Transformers. Using model.gradient_checkpointing_enable() instead, or if you are using the Trainer API, pass gradient_checkpointing=True in your TrainingArguments.
[Training] [2023-09-05T20:27:31.686686] warnings.warn(
[Training] [2023-09-05T20:28:12.404770] 23-09-05 20:28:12.403 - INFO: Loading model for [./models/tortoise/autoregressive.pth]
[Training] [2023-09-05T20:28:13.941333] 23-09-05 20:28:13.933 - INFO: Start training from epoch: 0, iter: 0
[Training] [2023-09-05T20:28:17.386937] NOTE: Redirects are currently not supported in Windows or MacOs.
[Training] [2023-09-05T20:28:21.151689] NOTE: Redirects are currently not supported in Windows or MacOs.
[Training] [2023-09-05T20:28:22.950009] D:\TorToise\ai-voice-cloning\venv\lib\site-packages\torch\optim\lr_scheduler.py:139: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of
the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
[Training] [2023-09-05T20:28:22.950009] warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "
[Training] [2023-09-05T20:55:58.183015] Error no kernel image is available for execution on the device at line 167 in file D:\ai\tool\bitsandbytes\csrc\ops.cu
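The final line is the actual failure: the bitsandbytes CUDA kernels were not compiled for the GTX 980 Ti's Maxwell architecture (compute capability 5.2), so its 8-bit optimizer aborts with `no kernel image is available for execution on the device`. Since the config dump shows `bitsandbytes: True`, one likely workaround (an assumption based on the error, not a confirmed fix) is to disable it in the generated `train.yaml` so training falls back to the standard AdamW optimizer:

```yaml
# ./training/markiplier/train.yaml
# Assumption: setting this to False makes the trainer skip the bitsandbytes
# 8-bit optimizer, whose prebuilt kernels do not cover sm_52 (Maxwell).
# Expect higher VRAM usage without the 8-bit optimizer.
bitsandbytes: False
```

If the setting is exposed in the web UI's training settings, toggling it there before regenerating the config should have the same effect.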

Reference: mrq/ai-voice-cloning#371