Commit Graph

372 Commits (master)

Author SHA1 Message Date
mrq 40e8d0774e share if you 2023-03-08 15:59:16 +07:00
mrq d58b67004a colab notebook uses venv and normal scripts to keep it on parity with a local install (and it literally just works, stop creating issues for something inconsistent with known solutions) 2023-03-08 15:51:13 +07:00
mrq 34dcb845b5 actually make using adamw_zero optimizer for multi-gpus work 2023-03-08 15:31:33 +07:00
mrq 8494628f3c normalize validation batch size because i oom'd without it getting scaled 2023-03-08 05:27:20 +07:00
mrq d7e75a51cf I forgot about the changelog and never kept up with it, so I'll just not use a changelog 2023-03-08 05:14:50 +07:00
mrq ff07f707cb disable validation if validation dataset not found, clamp validation batch size to validation dataset size instead of simply reusing batch size, switch to adamw_zero optimizer when training with multi-gpus (because the yaml comment said to and I think it might be why I'm absolutely having garbage luck training this Japanese dataset) 2023-03-08 04:47:05 +07:00
mrq f1788a5639 lazy wrap around the voicefixer block because sometimes it just an heroes itself despite having a specific block to load it beforehand 2023-03-08 04:12:22 +07:00
mrq 83b5125854 fixed notebooks, provided paperspace notebook 2023-03-08 03:29:12 +07:00
mrq b4098dca73 made validation work (will document later) 2023-03-08 02:58:00 +07:00
mrq a7e0dc9127 oops 2023-03-08 00:51:51 +07:00
mrq e862169e7f set validation to save rate and validation file if exists (need to test later) 2023-03-07 20:38:31 +07:00
mrq fe8bf7a9d1 added helper script to cull short enough lines from training set as a validation set (if it yields good results doing validation during training, i'll add it to the web ui) 2023-03-07 20:16:49 +07:00
mrq 7f89e8058a fixed update checker for dlas+tortoise-tts 2023-03-07 19:33:56 +07:00
mrq 6d7e143f53 added override for large training plots 2023-03-07 19:29:09 +07:00
mrq 3718e9d0fb set NaN alarm to show the iteration it happened at 2023-03-07 19:22:11 +07:00
mrq c27ee3ce95 added update checking for dlas and tortoise-tts, caching voices (for a given model and voice name) so random latents will remain the same 2023-03-07 17:04:45 +07:00
mrq 166d491a98 fixes 2023-03-07 13:40:41 +07:00
mrq df5ba634c0 brain dead 2023-03-07 05:43:26 +07:00
mrq 2726d98ee1 fried my brain trying to nail out bugs involving using solely ar model=auto 2023-03-07 05:35:21 +07:00
mrq d7a5ad9fd9 cleaned up some model loading logic, added 'auto' mode for AR model (deduced by current voice) 2023-03-07 04:34:39 +07:00
mrq 3899f9b4e3 added (yet another) experimental voice latent calculation mode (when chunk size is 0 and there's a dataset generated, it'll leverage it by padding to a common size then computing them; should help avoid splitting mid-phoneme) 2023-03-07 03:55:35 +07:00
mrq 5063728bb0 brain worms and headaches 2023-03-07 03:01:02 +07:00
mrq 0f31c34120 download dvae.pth for the people who somehow managed to put the web UI into a state where it never initializes TTS at all 2023-03-07 02:47:10 +07:00
mrq 0f0b394445 moved (actually not working) setting to use BigVGAN to a dropdown to select between vocoders (for when slotting in future ones), and ability to load a new vocoder while TTS is loaded 2023-03-07 02:45:22 +07:00
mrq e731b9ba84 reworked generating metadata to embed, should now store overridden settings 2023-03-06 23:07:16 +07:00
mrq 7798767fc6 added settings editing (will add a guide on what to do later, and an example) 2023-03-06 21:48:34 +07:00
mrq 119ac50c58 forgot to re-append the existing transcription when skipping existing (have to go back again and do the first 10% of my giant dataset) 2023-03-06 16:50:55 +07:00
mrq da0af4c498 one more 2023-03-06 16:47:34 +07:00
mrq 11a1f6a00e forgot to reorder the dependency install because whisperx needs to be installed before DLAS 2023-03-06 16:43:17 +07:00
mrq 12c51b6057 I'm not too sure if manually invoking gc actually closes all the open files from whisperx (or ROCm), but it seems to have gone away alongside setting 'ulimit -Sn' to half the output of 'ulimit -Hn' 2023-03-06 16:39:37 +07:00
mrq 999878d9c6 and it turned out I wasn't even using the aligned segments, kmsing now that I have to *redo* my dataset again 2023-03-06 11:01:33 +07:00
mrq 14779a5020 Added option to skip transcribing if it exists in the output text file, because apparently whisperx will throw a "max files opened" error when using ROCm because it does not close some file descriptors if you're batch-transcribing or something, so poor little me, who's retranscribing his Japanese dataset for the 305823042th time, woke up to it partially done; I am so mad I have to wait another few hours for it to continue when I was hoping to wake up to it done 2023-03-06 10:47:06 +07:00
mrq 0e3bbc55f8 added api_name for generation, added whisperx backend, relocated use whispercpp option to whisper backend list 2023-03-06 05:21:33 +07:00
mrq 788a957f79 stretch loss plot to target iteration just so it's not so misleading with the scale 2023-03-06 00:44:29 +07:00
mrq 5be14abc21 UI cleanup, actually fix syncing the epoch counter (I hope), setting auto-suggest voice chunk size whatever to 0 will just split based on the average duration length, signal when a NaN info value is detected (there are some safeties in the training, but it will inevitably fuck the model) 2023-03-05 23:55:27 +07:00
mrq 287738a338 (should) fix reported epoch metric desyncing from de facto metric, fixed finding next milestone from wrong sign because of 2AM brain 2023-03-05 20:42:45 +07:00
mrq 206a14fdbe brainworms 2023-03-05 20:30:27 +07:00
mrq b82961ba8a typo 2023-03-05 20:13:39 +07:00
mrq b2e89d8da3 oops 2023-03-05 19:58:15 +07:00
mrq 8094401a6d print in e-notation for LR 2023-03-05 19:48:24 +07:00
mrq 8b9c9e1bbf remove redundant stats, add showing LR 2023-03-05 18:53:12 +07:00
mrq 0231550287 forgot to remove a debug print 2023-03-05 18:27:16 +07:00
mrq d97639e138 whispercpp actually works now (language loading was weird, slicing needed to divide time by 100), transcribing audio checks for silence and discards them 2023-03-05 17:54:36 +07:00
mrq b8a620e8d7 actually accumulate derivatives when estimating milestones and final loss by using half of the log 2023-03-05 14:39:24 +07:00
mrq 35225a35da oops v2 2023-03-05 14:19:41 +07:00
mrq b5e9899bbf 5 hour sleep brained 2023-03-05 13:37:05 +07:00
mrq cd8702ab0d oops 2023-03-05 13:24:07 +07:00
mrq d312019d05 reordered things so it uses fresh data and not last-updated data 2023-03-05 07:37:27 +07:00
mrq ce3866d0cd added '''estimating''' iterations until milestones (lr=[1, 0.5, 0.1]) and final lr (very, very inaccurate because it uses instantaneous delta lr; I'll need to do a Riemann sum later) 2023-03-05 06:45:07 +07:00
mrq 1316331be3 forgot to have it try and auto-detect for openai/whisper when no language is specified 2023-03-05 05:22:35 +07:00
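The 'ulimit' workaround mentioned in commit 12c51b6057 (whisperx or ROCm leaking file descriptors during batch transcription) can be sketched as a shell snippet; this is an illustration of the commit message's idea, not code from the repo:

```shell
# Halve the soft open-file limit relative to the hard limit before
# launching transcription, as the commit message describes, to delay
# hitting the "max files opened" error from leaked file descriptors.
hard=$(ulimit -Hn)
if [ "$hard" != "unlimited" ]; then
    ulimit -Sn "$((hard / 2))"
fi
echo "soft open-file limit now: $(ulimit -Sn)"
```

Run it in the same shell session that launches the transcription job, since ulimit only affects the current shell and its children.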
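Commits ff07f707cb and 8494628f3c describe clamping the validation batch size to the validation dataset size instead of reusing the training batch size. A minimal sketch of that logic (hypothetical function name, not the repo's actual code):

```python
def clamp_validation_batch_size(batch_size: int, val_dataset_size: int) -> int:
    """Clamp the validation batch size to the validation dataset size,
    rather than blindly reusing the training batch size; avoids OOM from
    oversized batches and handles a missing validation dataset."""
    if val_dataset_size <= 0:
        # no validation dataset found: caller should disable validation
        return 0
    return max(1, min(batch_size, val_dataset_size))
```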
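The helper script from commit fe8bf7a9d1 culls "short enough" lines from the training set to use as a validation set. A sketch of the idea, with an assumed length threshold (the actual script's criterion may differ):

```python
def split_out_validation(lines: list[str], max_chars: int = 80) -> tuple[list[str], list[str]]:
    """Move lines whose text is short enough into a validation set,
    keeping the longer lines for training."""
    training = [line for line in lines if len(line) > max_chars]
    validation = [line for line in lines if len(line) <= max_chars]
    return training, validation
</imports>
```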
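Printing the learning rate in e-notation (commit 8094401a6d) is a one-liner with Python's format spec; shown here as an illustration:

```python
# Small learning rates are unreadable in fixed-point; scientific
# notation keeps the magnitude obvious.
lr = 3e-05
print(f"lr: {lr:.1e}")  # prints "lr: 3.0e-05"
```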
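The milestone estimator from commit ce3866d0cd extrapolates from the instantaneous per-iteration LR delta, which the message itself calls very inaccurate. A sketch with hypothetical names:

```python
def iterations_until(target_lr: float, current_lr: float, delta_lr: float) -> float:
    """Estimate iterations until the LR decays to target_lr using only
    the instantaneous per-iteration delta. A Riemann sum over the actual
    schedule would be more accurate, as the commit notes."""
    if delta_lr >= 0 or current_lr <= target_lr:
        return 0.0  # not decaying, or milestone already reached
    return (current_lr - target_lr) / -delta_lr
```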
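The experimental latent mode in commit 3899f9b4e3 pads all clips to a common length before computing latents, to avoid splitting mid-phoneme. The padding step alone might look like this NumPy sketch (assumes mono float arrays; not the repo's actual implementation):

```python
import numpy as np

def pad_to_common_length(clips: list[np.ndarray]) -> np.ndarray:
    """Zero-pad every clip to the longest clip's length so the whole
    set can be stacked into one batch for latent computation."""
    longest = max(clip.shape[0] for clip in clips)
    return np.stack([np.pad(clip, (0, longest - clip.shape[0])) for clip in clips])
```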