202 Commits

Author SHA1 Message Date
mrq
8740cdefc6 added initial support for languages (still testing, marked as model version 3), added experimental 'context extend by limiting the resp context' (untested) 2023-10-11 20:38:40 -05:00
mrq
6045cbce94 added experimental option to append utterances for training target (emphasis on experimental) 2023-10-11 17:32:45 -05:00
mrq
893a610fad cleanup, use deepspeed inferencing pathway if requested 2023-10-09 15:24:04 -05:00
mrq
63cc9cf37a added compat flags for torchscale because the maintainer for torchscale broke compat for existing models 2023-10-05 16:39:46 -05:00
mrq
153f8b293c added min-x and min-y arguments to plot.py, helper script to download from my existing checkpoint 2023-10-04 19:41:37 -05:00
mrq
d12877ee09 added option to set probability of selecting the AR during training under a monolithic AR+NAR, added some more to-dos while I have them in mind 2023-10-02 16:52:42 -05:00
mrq
c0b25541e3 restructured some things with the model to remove dead weights 2023-09-20 19:10:59 -05:00
mrq
d07c63b9d8 unified more things with training the AR+NAR monolithic model 2023-09-12 15:54:41 -05:00
mrq
40ef34e1ca this embedding class definitely works, and migrating from the previous embedding weights seems to work. 2023-09-11 14:13:42 -05:00
mrq
671dca88ee throw error when no reference audio is provided in the web UI because someone keeps doing that in the HF space 2023-09-10 15:50:50 -05:00
mrq
c74fe2f718 tweaks to web UI 2023-09-09 22:27:20 -05:00
mrq
f69aad9c65 some day I'll get it right 2023-09-08 15:36:26 -05:00
mrq
8837bc34d7 added option to specify parameters to freeze per-model in YAML (because I need to see about committing atrocities with converting an AR into an AR+NAR) 2023-09-07 18:19:51 -05:00
mrq
c47fc3274e added backwards compat flag 2023-09-07 17:12:17 -05:00
mrq
e7a67410d1 oops 2023-09-07 09:14:03 -05:00
mrq
100ca6b7d0 added option to use SGD optimizer through the YAML, added option to pass in additional optimizer parameters through the YAML, added experimental unified AR+NAR model (does not seem fruitful in testing) 2023-09-06 18:58:35 -05:00
mrq
451726fdd5 added ability to disable activation checkpointing through the YAML (it is very VRAM intensive at double layer size) 2023-09-05 15:38:21 -05:00
mrq
2f9cd0842f merged dedicated interleaved AR code with the normal AR code 2023-09-03 22:46:08 -05:00
mrq
8a6c203277 added per-speaker samplers 2023-09-03 21:27:13 -05:00
mrq
57db3ccfa8 shuffled VALL-E continuous as a task tts-c instead, logic fixes for it 2023-09-02 12:23:40 -05:00
mrq
2f06166ddd cleanups 2023-09-01 21:33:51 -05:00
mrq
e40c0d34a0 somewhat got recurrent forward working (it's as accurate as chunkwise forward: it's not accurate at all), added option to use AMP instead of blanket setting the weights' dtype 2023-09-01 20:58:29 -05:00
mrq
2bc2d08b09 (need to verify) added modifying model size and config bool to align with VALL-E continuous' methodology 2023-09-01 17:19:34 -05:00
mrq
87c4bfedba added ability to mark models as disabled for training, and hotloading them for eval/validation (useful if training only one model, or training a model per GPU) 2023-08-27 12:26:12 -05:00
mrq
165a1154e0 Undo naive=False test flag, this shouldn't have made its way in 2023-08-26 22:00:43 -05:00
mrq
78378ed1ce overhauled dataloading code to be marginally faster, mostly cleaned up, and can leverage a metadata json to help things out 2023-08-26 19:53:23 -05:00
mrq
00ad4af651 updated draconian requirement for espeak-ng to be installed and the env var set to the dll for Windows 2023-08-24 14:57:01 -05:00
mrq
4585824cd3 tweaks, including exporting on save/quit 2023-08-23 16:43:03 -05:00
mrq
d106598403 do not utilize diskcache if a config yaml is not loaded 2023-08-23 11:02:15 -05:00
mrq
7b1b82e0e5 inferencing cleanup 2023-08-20 21:36:02 -05:00
mrq
736c077282 ops 2023-08-20 13:42:18 -05:00
mrq
2d1a9f10c0 nightmare of spaghetti that might break compat; mechanism to increase RVQ bins of an existing model without retraining, keeps sampled proms/resps at max RVQ level and trims off excess levels according to what model receives them, some other things I already forgot (I really hope no one else has weights being baked right now) 2023-08-19 15:06:33 -05:00
mrq
f7f6d3bf6d validated that SpeechX tasks cse and nse work, added a method to test each task by invoking python3 -m vall_e.data --action=tasks --tasks='sr,se,cse,nse' 2023-08-19 09:50:07 -05:00
mrq
8f42c578c9 setting up for allowing training for a partial amount of the speechx tasks (do NOT try this at home yet without a proper model, as performance is predicated on having a solid base vall-e model for the tasks) 2023-08-19 00:16:08 -05:00
mrq
ae9d38aa31 forgot to have it pull from specified noise to the hdf5 dataset 2023-08-18 23:57:07 -05:00
mrq
77292c42f9 tested the training preparation for tasks ns, sr, and tse (I don't expect it to go well with only 2 RVQ bins) 2023-08-18 23:55:40 -05:00
mrq
bbb0563b3d pseudocode polyfill stub some other flavor of working on adding the tasks 2023-08-18 22:22:13 -05:00
mrq
fb4e816823 oops 2023-08-18 21:11:19 -05:00
mrq
2a71486cb6 preparing for SpeechX extensions 2023-08-18 20:58:07 -05:00
mrq
ced31fd9b7 removed the sampler as it's very misleading 2023-08-18 14:47:48 -05:00
mrq
ee58db746f actually make the evaluation dataset shuffled for sample_type=speaker 2023-08-17 15:04:45 -05:00
mrq
d7152fc7b9 added pruning of old checkpoints if specified (cfg.trainer.keep_last_checkpoints) 2023-08-16 20:12:12 -05:00
mrq
44c08d828e added sample_type that samples from speakers to truly balance an epoch by speakers rather than the entire dataset and a sampler that tries to balance by speakers 2023-08-16 19:39:21 -05:00
mrq
1e3e1d9315 tweaks 2023-08-15 21:58:16 -05:00
mrq
13571380be made exporter make more sense 2023-08-13 22:56:28 -05:00
mrq
d7deaf6def distributed training works now (hopefully) 2023-08-13 22:07:45 -05:00
mrq
d89568a96e some fixes for the local framework 2023-08-05 03:22:15 +00:00
mrq
5970f254e3 some fixes for the local framework 2023-08-05 02:17:30 +00:00
mrq
608c1970eb ops 2023-08-03 20:36:19 -05:00
mrq
c85101403f big cleanup 2023-08-03 20:26:36 -05:00
mrq
f6597e2dfe adjustments 2023-08-02 18:36:26 -05:00
mrq
bf8cedc9dd Rewrite init 2023-08-02 21:53:35 +00:00