Commit Graph

246 Commits (fe03ae5839d838121f3a8a6aaca14f8a9ed2ed0a)
 

Author SHA1 Message Date
mrq fe03ae5839 fixes 2023-03-14 17:42:42 +07:00
mrq 9d2c7fb942 cleanup 2023-03-14 16:23:29 +07:00
mrq 65fe304267 fixed broken graph displaying 2023-03-14 16:04:56 +07:00
mrq 7b16b3e88a ;) 2023-03-14 15:48:09 +07:00
mrq c85e32ff53 (: 2023-03-14 14:08:35 +07:00
mrq 54036fd780 :) 2023-03-14 05:02:14 +07:00
mrq 92a05d3c4c added PYTHONUTF8 to start/train bats 2023-03-14 02:29:11 +07:00
mrq dadb1fca6b multichannel audio now report correct duration (surprised it took this long for me to source multichannel audio) 2023-03-13 21:24:51 +07:00
mrq 32d968a8cd (disabled by default until i validate it working) added additional transcription text normalization (something else I'm experimenting with requires it) 2023-03-13 19:07:23 +07:00
mrq 66ac8ba766 added mel LR weight (as I finally understand when to adjust the text), added text validation on dataset creation 2023-03-13 18:51:53 +07:00
mrq ee1b048d07 when creating the train/validation datasets, use segments if the main audio's duration is too long, and slice to make the segments if they don't exist 2023-03-13 04:26:00 +07:00
mrq 0cf9db5e69 oops 2023-03-13 01:33:45 +07:00
mrq 050bcefd73 resample to 22.5K when creating training inputs (to avoid redundant downsampling when loaded for training, even though most of my inputs are already at 22.5K), generalized resampler function to cache and reuse them, do not unload whisper when done transcribing since it gets unloaded anyways for any other non-transcription task 2023-03-13 01:20:55 +07:00
mrq 7c9c0dc584 forgot to clean up debug prints 2023-03-13 00:44:37 +07:00
mrq 239c984850 move validating audio to creating the text files instead, consider audio longer than 11 seconds invalid, consider text lengths over 200 invalid 2023-03-12 23:39:00 +07:00
mrq 51ddc205cd update submodules 2023-03-12 18:14:36 +07:00
mrq ccbf2e6aff blame mrq/ai-voice-cloning#122 2023-03-12 17:51:52 +07:00
mrq 9238df0b03 fixed last generation settings not actually loading because brain worms 2023-03-12 15:49:50 +07:00
mrq 9594a960b0 Disable loss ETA for now until I fix it 2023-03-12 15:39:54 +07:00
mrq 51f6c347fe Merge pull request 'updated several default configurations to not cause null/empty errors. also default samples/iterations to 16-30 ultra fast which is typically suggested.' (#122) from zim33/ai-voice-cloning:save_more_user_config into master
Reviewed-on: mrq/ai-voice-cloning#122
2023-03-12 15:38:34 +07:00
mrq be8b290a1a Merge branch 'master' into save_more_user_config 2023-03-12 15:38:08 +07:00
mrq 296129ba9c output fixes, I'm not sure why ETA wasn't working but it works in testing 2023-03-12 15:17:07 +07:00
mrq 098d7ad635 uh I don't remember, small things 2023-03-12 14:47:48 +07:00
tigi6346 233baa4e45 updated several default configurations to not cause null/empty errors. also default samples/iterations to 16-30 ultra fast which is typically suggested. 2023-03-12 16:08:02 +07:00
mrq 1ac278e885 Merge pull request 'keep_training' (#118) from zim33/ai-voice-cloning:keep_training into master
Reviewed-on: mrq/ai-voice-cloning#118
2023-03-12 06:47:01 +07:00
tigi6346 29b3d1ae1d Fixed Keep X Previous States 2023-03-12 08:01:08 +07:00
tigi6346 9e320a34c8 Fixed Keep X Previous States 2023-03-12 08:00:03 +07:00
mrq 8ed09f9b87 Merge pull request 'Catch OOM and run whisper on cpu automatically.' (#117) from zim33/ai-voice-cloning:vram into master
Reviewed-on: mrq/ai-voice-cloning#117
2023-03-12 05:09:53 +07:00
tigi6346 61500107ab Catch OOM and run whisper on cpu automatically. 2023-03-12 06:48:28 +07:00
mrq ede9804b76 added option to trim silence using torchaudio's VAD 2023-03-11 21:41:35 +07:00
mrq dea2fa9caf added fields to offset start/end slices to apply in bulk when slicing 2023-03-11 21:34:29 +07:00
mrq 89bb3d4419 rename transcribe button since it does more than transcribe 2023-03-11 21:18:04 +07:00
mrq 382a3e4104 rely on the whisper.json for handling a lot more things 2023-03-11 21:17:11 +07:00
mrq 9b376c381f brain worm 2023-03-11 18:14:32 +07:00
mrq 94551fb9ac split slicing dataset routine so it can be done after the fact 2023-03-11 17:27:01 +07:00
mrq e3fdb79b49 rocm5.2 works for me desu so I bumped it back up 2023-03-11 17:02:56 +07:00
mrq e680d84a13 removed the hotfix pip installs that whisperx requires now that whisperx is gone 2023-03-11 16:55:19 +07:00
mrq cf41492f76 fall back to normal behavior if theres actually no audiofiles loaded from the dataset when using it for computing latents 2023-03-11 16:46:03 +07:00
mrq b90c164778 Farewell, parasite 2023-03-11 16:40:34 +07:00
mrq 2424c455cb added option to not slice audio when transcribing, added option to prepare validation dataset on audio duration, added a warning if youre using whisperx and you're slicing audio 2023-03-11 16:32:35 +07:00
tigi6346 dcdcf8516c master (#112)
Fixes Gradio bugging out when attempting to load a missing train.json.

Reviewed-on: mrq/ai-voice-cloning#112
Co-authored-by: tigi6346 <tigi6346@noreply.localhost>
Co-committed-by: tigi6346 <tigi6346@noreply.localhost>
2023-03-11 03:28:04 +07:00
mrq 008a1f5f8f simplified spawning the training process by having it spawn the distributed training processes in the train.py script, so it should work on Windows too 2023-03-11 01:37:00 +07:00
mrq 2feb6da0c0 cleanups and fixes, fix DLAS throwing errors from '''too short of sound files''' by just culling them during transcription 2023-03-11 01:19:49 +07:00
mrq 7f2da0f5fb rewrote how AIVC gets training metrics (need to clean up later) 2023-03-10 22:35:32 +07:00
mrq df0edacc60 fix the cleanup actually only doing 2 despite requesting more than 2, surprised no one has pointed it out 2023-03-10 14:04:07 +07:00
mrq 8e890d3023 forgot to fix reset settings to use the new arg-agnostic way 2023-03-10 13:49:39 +07:00
mrq d250e0ec17 brain fried 2023-03-10 04:27:34 +07:00
mrq 0b364b590e maybe don't --force-reinstall to try and force downgrading, it just forces everything to uninstall then reinstall 2023-03-10 04:22:47 +07:00
mrq c231d842aa make dependencies after the one in this repo force reinstall to downgrade, i hope, I have other things to do than validate this works 2023-03-10 03:53:21 +07:00
mrq c92b006129 I really hate YAML 2023-03-10 03:48:46 +07:00