382 Commits (master)

Author SHA1 Message Date
mrq ef75dba995 I hate commas make tuples 2023-03-09 02:43:05 +07:00
mrq f795dd5c20 you might be wondering why so many small commits instead of rolling the HEAD back one to just combine them, i don't want to force push and roll back the paperspace i'm testing in 2023-03-09 02:31:32 +07:00
mrq 51339671ec typo 2023-03-09 02:29:08 +07:00
mrq 1b18b3e335 forgot to save the simplified training input json first before touching any of the settings that dump to the yaml 2023-03-09 02:27:20 +07:00
mrq 221ac38b32 forgot to update to finetune subdir 2023-03-09 02:25:32 +07:00
mrq 0e80e311b0 added VRAM validation for a given batch:gradient accumulation size ratio (based empirically off of 6GiB, 16GiB, and 16x2GiB, would be nice to have more data on what's safe) 2023-03-09 02:08:06 +07:00
mrq ef7b957fff oops 2023-03-09 00:53:00 +07:00
mrq b0baa1909a forgot template 2023-03-09 00:32:35 +07:00
mrq 3f321fe664 big cleanup to make my life easier when i add more parameters 2023-03-09 00:26:47 +07:00
mrq 0ab091e7ff oops 2023-03-08 16:09:29 +07:00
mrq 40e8d0774e share if you 2023-03-08 15:59:16 +07:00
mrq d58b67004a colab notebook uses venv and normal scripts to keep it on parity with a local install (and it literally just works, stop creating issues for something inconsistent with known solutions) 2023-03-08 15:51:13 +07:00
mrq 34dcb845b5 actually make using adamw_zero optimizer for multi-gpus work 2023-03-08 15:31:33 +07:00
mrq 8494628f3c normalize validation batch size because i oom'd without it getting scaled 2023-03-08 05:27:20 +07:00
mrq d7e75a51cf I forgot about the changelog and never kept up with it, so I'll just not use a changelog 2023-03-08 05:14:50 +07:00
mrq ff07f707cb disable validation if validation dataset not found, clamp validation batch size to validation dataset size instead of simply reusing batch size, switch to adamw_zero optimizer when training with multi-gpus (because the yaml comment said to and I think it might be why I'm absolutely having garbage luck training this japanese dataset) 2023-03-08 04:47:05 +07:00
mrq f1788a5639 lazy wrap around the voicefixer block because sometimes it just an heros itself despite having a specific block to load it beforehand 2023-03-08 04:12:22 +07:00
mrq 83b5125854 fixed notebooks, provided paperspace notebook 2023-03-08 03:29:12 +07:00
mrq b4098dca73 made validation working (will document later) 2023-03-08 02:58:00 +07:00
mrq a7e0dc9127 oops 2023-03-08 00:51:51 +07:00
mrq e862169e7f set validation to save rate and validation file if exists (need to test later) 2023-03-07 20:38:31 +07:00
mrq fe8bf7a9d1 added helper script to cull short enough lines from training set as a validation set (if it yields good results doing validation during training, i'll add it to the web ui) 2023-03-07 20:16:49 +07:00
mrq 7f89e8058a fixed update checker for dlas+tortoise-tts 2023-03-07 19:33:56 +07:00
mrq 6d7e143f53 added override for large training plots 2023-03-07 19:29:09 +07:00
mrq 3718e9d0fb set NaN alarm to show the iteration it happened in 2023-03-07 19:22:11 +07:00
mrq c27ee3ce95 added update checking for dlas and tortoise-tts, caching voices (for a given model and voice name) so random latents will remain the same 2023-03-07 17:04:45 +07:00
mrq 166d491a98 fixes 2023-03-07 13:40:41 +07:00
mrq df5ba634c0 brain dead 2023-03-07 05:43:26 +07:00
mrq 2726d98ee1 fried my brain trying to nail out bugs involving using solely ar model=auto 2023-03-07 05:35:21 +07:00
mrq d7a5ad9fd9 cleaned up some model loading logic, added 'auto' mode for AR model (deduced by current voice) 2023-03-07 04:34:39 +07:00
mrq 3899f9b4e3 added (yet another) experimental voice latent calculation mode (when chunk size is 0 and theres a dataset generated, itll leverage it by padding to a common size then computing them, should help avoid splitting mid-phoneme) 2023-03-07 03:55:35 +07:00
mrq 5063728bb0 brain worms and headaches 2023-03-07 03:01:02 +07:00
mrq 0f31c34120 download dvae.pth for the people who managed to somehow put the web UI into a state where it never initializes TTS at all somehow 2023-03-07 02:47:10 +07:00
mrq 0f0b394445 moved (actually not working) setting to use BigVGAN to a dropdown to select between vocoders (for when slotting in future ones), and ability to load a new vocoder while TTS is loaded 2023-03-07 02:45:22 +07:00
mrq e731b9ba84 reworked generating metadata to embed, should now store overridden settings 2023-03-06 23:07:16 +07:00
mrq 7798767fc6 added settings editing (will add a guide on what to do later, and an example) 2023-03-06 21:48:34 +07:00
mrq 119ac50c58 forgot to re-append the existing transcription when skipping existing (have to go back again and do the first 10% of my giant dataset) 2023-03-06 16:50:55 +07:00
mrq da0af4c498 one more 2023-03-06 16:47:34 +07:00
mrq 11a1f6a00e forgot to reorder the dependency install because whisperx needs to be installed before DLAS 2023-03-06 16:43:17 +07:00
mrq 12c51b6057 I'm not too sure if manually invoking gc actually closes all the open files from whisperx (or ROCm), but it seems to have gone away alongside setting 'ulimit -Sn' to half the output of 'ulimit -Hn' 2023-03-06 16:39:37 +07:00
mrq 999878d9c6 and it turned out I wasn't even using the aligned segments, kmsing now that I have to *redo* my dataset again 2023-03-06 11:01:33 +07:00
mrq 14779a5020 Added option to skip transcribing if it exists in the output text file, because apparently whisperx will throw a "max files opened" error when using ROCm because it does not close some file descriptors if you're batch-transcribing or something, so poor little me, who's retranscribing his japanese dataset for the 305823042th time woke up to it partially done i am so mad I have to wait another few hours for it to continue when I was hoping to wake up to it done 2023-03-06 10:47:06 +07:00
mrq 0e3bbc55f8 added api_name for generation, added whisperx backend, relocated use whispercpp option to whisper backend list 2023-03-06 05:21:33 +07:00
mrq 788a957f79 stretch loss plot to target iteration just so it's not so misleading with the scale 2023-03-06 00:44:29 +07:00
mrq 5be14abc21 UI cleanup, actually fix syncing the epoch counter (i hope), setting auto-suggest voice chunk size whatever to 0 will just split based on the average duration length, signal when a NaN info value is detected (there's some safeties in the training, but it will inevitably fuck the model) 2023-03-05 23:55:27 +07:00
mrq 287738a338 (should) fix reported epoch metric desyncing from de facto metric, fixed finding next milestone from wrong sign because of 2AM brain 2023-03-05 20:42:45 +07:00
mrq 206a14fdbe brainworms 2023-03-05 20:30:27 +07:00
mrq b82961ba8a typo 2023-03-05 20:13:39 +07:00
mrq b2e89d8da3 oops 2023-03-05 19:58:15 +07:00
mrq 8094401a6d print in e-notation for LR 2023-03-05 19:48:24 +07:00