Error when running training #136

Closed
opened 2023-03-15 00:50:35 +00:00 by ThrowawayAccount01 · 1 comment

I get this error when trying to run training:

C:\Users\LXC PC\Desktop\mrqtts\ai-voice-cloning>call .\venv\Scripts\activate.bat
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Spawning process:  train.bat ./training/21/train.yaml
[Training] [2023-03-15T08:47:10.664831]
[Training] [2023-03-15T08:47:10.667833] (venv) C:\Users\LXC PC\Desktop\mrqtts\ai-voice-cloning>call .\venv\Scripts\activate.bat
[Training] [2023-03-15T08:47:11.617364] NOTE: Redirects are currently not supported in Windows or MacOs.
[Training] [2023-03-15T08:47:13.161874] 23-03-15 08:47:13.161 - INFO:   name: 21
[Training] [2023-03-15T08:47:13.164876]   model: extensibletrainer
[Training] [2023-03-15T08:47:13.168878]   scale: 1
[Training] [2023-03-15T08:47:13.170876]   gpu_ids: [0]
[Training] [2023-03-15T08:47:13.176181]   start_step: 0
[Training] [2023-03-15T08:47:13.179181]   checkpointing_enabled: True
[Training] [2023-03-15T08:47:13.181693]   fp16: False
[Training] [2023-03-15T08:47:13.184695]   bitsandbytes: True
[Training] [2023-03-15T08:47:13.186697]   gpus: 1
[Training] [2023-03-15T08:47:13.189695]   datasets:[
[Training] [2023-03-15T08:47:13.193208]     train:[
[Training] [2023-03-15T08:47:13.196209]       name: training
[Training] [2023-03-15T08:47:13.199211]       n_workers: 2
[Training] [2023-03-15T08:47:13.202733]       batch_size: 25
[Training] [2023-03-15T08:47:13.204732]       mode: paired_voice_audio
[Training] [2023-03-15T08:47:13.207738]       path: ./training/21/train.txt
[Training] [2023-03-15T08:47:13.211240]       fetcher_mode: ['lj']
[Training] [2023-03-15T08:47:13.215246]       phase: train
[Training] [2023-03-15T08:47:13.218248]       max_wav_length: 255995
[Training] [2023-03-15T08:47:13.220763]       max_text_length: 200
[Training] [2023-03-15T08:47:13.223775]       sample_rate: 22050
[Training] [2023-03-15T08:47:13.225776]       load_conditioning: True
[Training] [2023-03-15T08:47:13.229776]       num_conditioning_candidates: 2
[Training] [2023-03-15T08:47:13.232300]       conditioning_length: 44000
[Training] [2023-03-15T08:47:13.234303]       use_bpe_tokenizer: True
[Training] [2023-03-15T08:47:13.237299]       tokenizer_vocab: ${tokenizer_json}
[Training] [2023-03-15T08:47:13.239300]       load_aligned_codes: False
[Training] [2023-03-15T08:47:13.242822]       data_type: img
[Training] [2023-03-15T08:47:13.246819]     ]
[Training] [2023-03-15T08:47:13.250819]     val:[
[Training] [2023-03-15T08:47:13.254264]       name: validation
[Training] [2023-03-15T08:47:13.256779]       n_workers: 2
[Training] [2023-03-15T08:47:13.259777]       batch_size: 6
[Training] [2023-03-15T08:47:13.263291]       mode: paired_voice_audio
[Training] [2023-03-15T08:47:13.266292]       path: ./training/21/validation.txt
[Training] [2023-03-15T08:47:13.268294]       fetcher_mode: ['lj']
[Training] [2023-03-15T08:47:13.270795]       phase: val
[Training] [2023-03-15T08:47:13.273802]       max_wav_length: 255995
[Training] [2023-03-15T08:47:13.276801]       max_text_length: 200
[Training] [2023-03-15T08:47:13.279802]       sample_rate: 22050
[Training] [2023-03-15T08:47:13.283318]       load_conditioning: True
[Training] [2023-03-15T08:47:13.285316]       num_conditioning_candidates: 2
[Training] [2023-03-15T08:47:13.288315]       conditioning_length: 44000
[Training] [2023-03-15T08:47:13.290821]       use_bpe_tokenizer: True
[Training] [2023-03-15T08:47:13.293831]       tokenizer_vocab: ${tokenizer_json}
[Training] [2023-03-15T08:47:13.296827]       load_aligned_codes: False
[Training] [2023-03-15T08:47:13.298829]       data_type: img
[Training] [2023-03-15T08:47:13.302341]     ]
[Training] [2023-03-15T08:47:13.304340]   ]
[Training] [2023-03-15T08:47:13.306340]   steps:[
[Training] [2023-03-15T08:47:13.309340]     gpt_train:[
[Training] [2023-03-15T08:47:13.313858]       training: gpt
[Training] [2023-03-15T08:47:13.316858]       loss_log_buffer: 500
[Training] [2023-03-15T08:47:13.318859]       optimizer: adamw
[Training] [2023-03-15T08:47:13.322371]       optimizer_params:[
[Training] [2023-03-15T08:47:13.325374]         lr: 1e-05
[Training] [2023-03-15T08:47:13.330878]         weight_decay: 0.01
[Training] [2023-03-15T08:47:13.334883]         beta1: 0.9
[Training] [2023-03-15T08:47:13.337884]         beta2: 0.96
[Training] [2023-03-15T08:47:13.340884]       ]
[Training] [2023-03-15T08:47:13.344202]       clip_grad_eps: 4
[Training] [2023-03-15T08:47:13.347208]       injectors:[
[Training] [2023-03-15T08:47:13.349205]         paired_to_mel:[
[Training] [2023-03-15T08:47:13.352721]           type: torch_mel_spectrogram
[Training] [2023-03-15T08:47:13.354721]           mel_norm_file: ./modules/tortoise-tts/tortoise/data/mel_norms.pth
[Training] [2023-03-15T08:47:13.357233]           in: wav
[Training] [2023-03-15T08:47:13.360739]           out: paired_mel
[Training] [2023-03-15T08:47:13.363751]         ]
[Training] [2023-03-15T08:47:13.365751]         paired_cond_to_mel:[
[Training] [2023-03-15T08:47:13.368752]           type: for_each
[Training] [2023-03-15T08:47:13.371257]           subtype: torch_mel_spectrogram
[Training] [2023-03-15T08:47:13.374271]           mel_norm_file: ./modules/tortoise-tts/tortoise/data/mel_norms.pth
[Training] [2023-03-15T08:47:13.376270]           in: conditioning
[Training] [2023-03-15T08:47:13.379271]           out: paired_conditioning_mel
[Training] [2023-03-15T08:47:13.383788]         ]
[Training] [2023-03-15T08:47:13.385788]         to_codes:[
[Training] [2023-03-15T08:47:13.389789]           type: discrete_token
[Training] [2023-03-15T08:47:13.392300]           in: paired_mel
[Training] [2023-03-15T08:47:13.395301]           out: paired_mel_codes
[Training] [2023-03-15T08:47:13.398302]           dvae_config: ./models/tortoise/train_diffusion_vocoder_22k_level.yml
[Training] [2023-03-15T08:47:13.400809]         ]
[Training] [2023-03-15T08:47:13.403820]         paired_fwd_text:[
[Training] [2023-03-15T08:47:13.406827]           type: generator
[Training] [2023-03-15T08:47:13.408821]           generator: gpt
[Training] [2023-03-15T08:47:13.412054]           in: ['paired_conditioning_mel', 'padded_text', 'text_lengths', 'paired_mel_codes', 'wav_lengths']
[Training] [2023-03-15T08:47:13.414054]           out: ['loss_text_ce', 'loss_mel_ce', 'logits']
[Training] [2023-03-15T08:47:13.417054]         ]
[Training] [2023-03-15T08:47:13.419052]       ]
[Training] [2023-03-15T08:47:13.421563]       losses:[
[Training] [2023-03-15T08:47:13.423564]         text_ce:[
[Training] [2023-03-15T08:47:13.426562]           type: direct
[Training] [2023-03-15T08:47:13.429566]           weight: 0.01
[Training] [2023-03-15T08:47:13.432086]           key: loss_text_ce
[Training] [2023-03-15T08:47:13.435084]         ]
[Training] [2023-03-15T08:47:13.437085]         mel_ce:[
[Training] [2023-03-15T08:47:13.439085]           type: direct
[Training] [2023-03-15T08:47:13.441598]           weight: 1
[Training] [2023-03-15T08:47:13.445598]           key: loss_mel_ce
[Training] [2023-03-15T08:47:13.447598]         ]
[Training] [2023-03-15T08:47:13.451103]       ]
[Training] [2023-03-15T08:47:13.453108]     ]
[Training] [2023-03-15T08:47:13.455615]   ]
[Training] [2023-03-15T08:47:13.458622]   networks:[
[Training] [2023-03-15T08:47:13.461126]     gpt:[
[Training] [2023-03-15T08:47:13.463137]       type: generator
[Training] [2023-03-15T08:47:13.466133]       which_model_G: unified_voice2
[Training] [2023-03-15T08:47:13.469133]       kwargs:[
[Training] [2023-03-15T08:47:13.471650]         layers: 30
[Training] [2023-03-15T08:47:13.473654]         model_dim: 1024
[Training] [2023-03-15T08:47:13.476656]         heads: 16
[Training] [2023-03-15T08:47:13.479651]         max_text_tokens: 402
[Training] [2023-03-15T08:47:13.483710]         max_mel_tokens: 604
[Training] [2023-03-15T08:47:13.486711]         max_conditioning_inputs: 2
[Training] [2023-03-15T08:47:13.488711]         mel_length_compression: 1024
[Training] [2023-03-15T08:47:13.491216]         number_text_tokens: 256
[Training] [2023-03-15T08:47:13.495222]         number_mel_codes: 8194
[Training] [2023-03-15T08:47:13.499223]         start_mel_token: 8192
[Training] [2023-03-15T08:47:13.501734]         stop_mel_token: 8193
[Training] [2023-03-15T08:47:13.503734]         start_text_token: 255
[Training] [2023-03-15T08:47:13.506734]         train_solo_embeddings: False
[Training] [2023-03-15T08:47:13.508736]         use_mel_codes_as_input: True
[Training] [2023-03-15T08:47:13.512247]         checkpointing: True
[Training] [2023-03-15T08:47:13.515254]         tortoise_compat: True
[Training] [2023-03-15T08:47:13.518253]       ]
[Training] [2023-03-15T08:47:13.520758]     ]
[Training] [2023-03-15T08:47:13.522771]   ]
[Training] [2023-03-15T08:47:13.525770]   path:[
[Training] [2023-03-15T08:47:13.528771]     strict_load: True
[Training] [2023-03-15T08:47:13.531801]     resume_state: ./training/21/finetune/training_state//3030.state
[Training] [2023-03-15T08:47:13.533801]     root: ./
[Training] [2023-03-15T08:47:13.536804]     experiments_root: ./training\21\finetune
[Training] [2023-03-15T08:47:13.538801]     models: ./training\21\finetune\models
[Training] [2023-03-15T08:47:13.541968]     training_state: ./training\21\finetune\training_state
[Training] [2023-03-15T08:47:13.545968]     log: ./training\21\finetune
[Training] [2023-03-15T08:47:13.548971]     val_images: ./training\21\finetune\val_images
[Training] [2023-03-15T08:47:13.551481]   ]
[Training] [2023-03-15T08:47:13.553482]   train:[
[Training] [2023-03-15T08:47:13.555989]     niter: 8080
[Training] [2023-03-15T08:47:13.558997]     warmup_iter: -1
[Training] [2023-03-15T08:47:13.562509]     mega_batch_factor: 4
[Training] [2023-03-15T08:47:13.564510]     val_freq: 20
[Training] [2023-03-15T08:47:13.567513]     ema_enabled: False
[Training] [2023-03-15T08:47:13.569511]     default_lr_scheme: MultiStepLR
[Training] [2023-03-15T08:47:13.572021]     gen_lr_steps: [8, 16, 36, 72, 100, 132, 200]
[Training] [2023-03-15T08:47:13.575023]     lr_gamma: 0.5
[Training] [2023-03-15T08:47:13.578022]   ]
[Training] [2023-03-15T08:47:13.580526]   eval:[
[Training] [2023-03-15T08:47:13.583538]     pure: True
[Training] [2023-03-15T08:47:13.585540]     output_state: gen
[Training] [2023-03-15T08:47:13.588540]   ]
[Training] [2023-03-15T08:47:13.590540]   logger:[
[Training] [2023-03-15T08:47:13.593701]     save_checkpoint_freq: 202
[Training] [2023-03-15T08:47:13.596698]     visuals: ['gen', 'mel']
[Training] [2023-03-15T08:47:13.598699]     visual_debug_rate: 202
[Training] [2023-03-15T08:47:13.601205]     is_mel_spectrogram: True
[Training] [2023-03-15T08:47:13.604212]   ]
[Training] [2023-03-15T08:47:13.607211]   is_train: True
[Training] [2023-03-15T08:47:13.609211]   dist: False
[Training] [2023-03-15T08:47:13.612726]
[Training] [2023-03-15T08:47:13.616732] 23-03-15 08:47:13.161 - INFO: Set model [gpt] to ./training\21\finetune\models\3030_gpt.pth
[Training] [2023-03-15T08:47:13.619731] 23-03-15 08:47:13.161 - INFO: Random seed: 6466
[Training] [2023-03-15T08:47:13.847229] Using BitsAndBytes optimizations
[Training] [2023-03-15T08:47:13.850736] Disabled distributed training.
[Training] [2023-03-15T08:47:13.854742] Traceback (most recent call last):
[Training] [2023-03-15T08:47:13.858258]   File "C:\Users\LXC PC\Desktop\mrqtts\ai-voice-cloning\src\train.py", line 68, in <module>
[Training] [2023-03-15T08:47:13.862782]     train(config_path, args.launcher)
[Training] [2023-03-15T08:47:13.864778]   File "C:\Users\LXC PC\Desktop\mrqtts\ai-voice-cloning\src\train.py", line 34, in train
[Training] [2023-03-15T08:47:13.868779]     trainer.init(config_path, opt, launcher, '')
[Training] [2023-03-15T08:47:13.871290]   File "C:\Users\LXC PC\Desktop\mrqtts\ai-voice-cloning\./modules/dlas\codes\train.py", line 128, in init
[Training] [2023-03-15T08:47:13.874302]     self.train_set, collate_fn = create_dataset(dataset_opt, return_collate=True)
[Training] [2023-03-15T08:47:13.879297]   File "C:\Users\LXC PC\Desktop\mrqtts\ai-voice-cloning\./modules/dlas/codes\data\__init__.py", line 107, in create_dataset
[Training] [2023-03-15T08:47:13.881808]     dataset = D(dataset_opt)
[Training] [2023-03-15T08:47:13.884812]   File "C:\Users\LXC PC\Desktop\mrqtts\ai-voice-cloning\./modules/dlas/codes\data\audio\paired_voice_audio_dataset.py", line 169, in __init__
[Training] [2023-03-15T08:47:13.886811]     self.tokenizer = VoiceBpeTokenizer(opt_get(hparams, ['tokenizer_vocab'], '../experiments/bpe_lowercase_asr_256.json'))
[Training] [2023-03-15T08:47:13.889811]   File "C:\Users\LXC PC\Desktop\mrqtts\ai-voice-cloning\./modules/dlas/codes\data\audio\voice_tokenizer.py", line 34, in __init__
[Training] [2023-03-15T08:47:13.892949]     self.tokenizer = Tokenizer.from_file(vocab_file)
[Training] [2023-03-15T08:47:13.894941] Exception: The system cannot find the file specified. (os error 2)
Owner

Commit 5e4f6808cee03732b9610521d734d063a802f731 should fix it. I tested it after manually setting the tokenizer setting, but not the case where it's left undefined in the settings (defaulting to null).
mrq closed this issue 2023-03-15 00:56:02 +00:00
Reference: mrq/ai-voice-cloning#136