Commit Graph

271 Commits (1b72d0bba0622e0ba81befb1484116ba15e95589)

Author SHA1 Message Date
mrq 1b72d0bba0 forgot to separate phonemes by spaces for [redacted] 2023-03-17 02:08:07 +07:00
mrq d4c50967a6 cleaned up some prepare dataset code 2023-03-17 01:24:02 +07:00
mrq 0b62ccc112 setup bnb on windows as needed 2023-03-16 20:48:48 +07:00
mrq c4edfb7d5e unbump rocm5.4.2 because it does not work for me desu 2023-03-16 15:33:23 +07:00
mrq 520fbcd163 bumped torch up (CUDA: 11.8, ROCm, 5.4.2) 2023-03-16 15:09:11 +07:00
mrq 1a8c5de517 unk hunting 2023-03-16 14:59:12 +07:00
mrq 46ff3c476a fixes v2 2023-03-16 14:41:40 +07:00
mrq 0408d44602 fixed reload tts being broken due to being as untouched as I am 2023-03-16 14:24:44 +07:00
mrq aeb904a800 yammed 2023-03-16 14:23:47 +07:00
mrq f9154c4db1 fixes 2023-03-16 14:19:56 +07:00
mrq 54f2fc792a ops 2023-03-16 05:14:15 +07:00
mrq 0a7d6f02a7 ops 2023-03-16 04:54:17 +07:00
mrq 4ac43fa3a3 I forgot I undid the thing in DLAS 2023-03-16 04:51:35 +07:00
mrq da4f92681e oops 2023-03-16 04:35:12 +07:00
mrq ee8270bdfb preparations for training an IPA-based finetune 2023-03-16 04:25:33 +07:00
mrq 7b80f7a42f fixed not cleaning up states while training (oops) 2023-03-15 02:48:05 +07:00
mrq b31bf1206e oops 2023-03-15 01:51:04 +07:00
mrq d752a22331 print a warning if automatically deduced batch size returns 1 2023-03-15 01:20:15 +07:00
mrq f6d34e1dd3 and maybe I should have actually tested with ./models/tokenizers/ made 2023-03-15 01:09:20 +07:00
mrq 5e4f6808ce I guess I didn't test on a blank-ish slate 2023-03-15 00:54:27 +07:00
mrq 363d0b09b1 added options to pick tokenizer json and diffusion model (so I don't have to add it in later when I get bored and add in diffusion training) 2023-03-15 00:37:38 +07:00
mrq 07b684c4e7 removed redundant training data (they exist within tortoise itself anyways), added utility: view tokenized text 2023-03-14 21:51:27 +07:00
mrq 469dd47a44 fixes #131 2023-03-14 18:58:03 +07:00
mrq 84b7383428 fixes #134 2023-03-14 18:52:56 +07:00
mrq 4b952ea52a fixes #132 2023-03-14 18:46:20 +07:00
mrq fe03ae5839 fixes 2023-03-14 17:42:42 +07:00
mrq 9d2c7fb942 cleanup 2023-03-14 16:23:29 +07:00
mrq 65fe304267 fixed broken graph displaying 2023-03-14 16:04:56 +07:00
mrq 7b16b3e88a ;) 2023-03-14 15:48:09 +07:00
mrq c85e32ff53 (: 2023-03-14 14:08:35 +07:00
mrq 54036fd780 :) 2023-03-14 05:02:14 +07:00
mrq 92a05d3c4c added PYTHONUTF8 to start/train bats 2023-03-14 02:29:11 +07:00
mrq dadb1fca6b multichannel audio now report correct duration (surprised it took this long for me to source multichannel audio) 2023-03-13 21:24:51 +07:00
mrq 32d968a8cd (disabled by default until i validate it working) added additional transcription text normalization (something else I'm experimenting with requires it) 2023-03-13 19:07:23 +07:00
mrq 66ac8ba766 added mel LR weight (as I finally understand when to adjust the text), added text validation on dataset creation 2023-03-13 18:51:53 +07:00
mrq ee1b048d07 when creating the train/validation datasets, use segments if the main audio's duration is too long, and slice to make the segments if they don't exist 2023-03-13 04:26:00 +07:00
mrq 0cf9db5e69 oops 2023-03-13 01:33:45 +07:00
mrq 050bcefd73 resample to 22.5K when creating training inputs (to avoid redundant downsampling when loaded for training, even though most of my inputs are already at 22.5K), generalized resampler function to cache and reuse them, do not unload whisper when done transcribing since it gets unloaded anyways for any other non-transcription task 2023-03-13 01:20:55 +07:00
mrq 7c9c0dc584 forgot to clean up debug prints 2023-03-13 00:44:37 +07:00
mrq 239c984850 move validating audio to creating the text files instead, consider audio longer than 11 seconds invalid, consider text lengths over 200 invalid 2023-03-12 23:39:00 +07:00
mrq 51ddc205cd update submodules 2023-03-12 18:14:36 +07:00
mrq ccbf2e6aff blame mrq/ai-voice-cloning#122 2023-03-12 17:51:52 +07:00
mrq 9238df0b03 fixed last generation settings not actually load because brain worms 2023-03-12 15:49:50 +07:00
mrq 9594a960b0 Disable loss ETA for now until I fix it 2023-03-12 15:39:54 +07:00
mrq 51f6c347fe Merge pull request 'updated several default configurations to not cause null/empty errors. also default samples/iterations to 16-30 ultra fast which is typically suggested.' (#122) from zim33/ai-voice-cloning:save_more_user_config into master (Reviewed-on: mrq/ai-voice-cloning#122) 2023-03-12 15:38:34 +07:00
mrq be8b290a1a Merge branch 'master' into save_more_user_config 2023-03-12 15:38:08 +07:00
mrq 296129ba9c output fixes, I'm not sure why ETA wasn't working but it works in testing 2023-03-12 15:17:07 +07:00
mrq 098d7ad635 uh I don't remember, small things 2023-03-12 14:47:48 +07:00
tigi6346 233baa4e45 updated several default configurations to not cause null/empty errors. also default samples/iterations to 16-30 ultra fast which is typically suggested. 2023-03-12 16:08:02 +07:00
mrq 1ac278e885 Merge pull request 'keep_training' (#118) from zim33/ai-voice-cloning:keep_training into master (Reviewed-on: mrq/ai-voice-cloning#118) 2023-03-12 06:47:01 +07:00