d752a22331 | print a warning if automatically deduced batch size returns 1 | mrq | 2023-03-15 01:20:15 +0000
f6d34e1dd3 | and maybe I should have actually tested with ./models/tokenizers/ made | mrq | 2023-03-15 01:09:20 +0000
5e4f6808ce | I guess I didn't test on a blank-ish slate | mrq | 2023-03-15 00:54:27 +0000
363d0b09b1 | added options to pick tokenizer json and diffusion model (so I don't have to add it in later when I get bored and add in diffusion training) | mrq | 2023-03-15 00:37:38 +0000
07b684c4e7 | removed redundant training data (they exist within tortoise itself anyways), added utility: view tokenized text | mrq | 2023-03-14 21:51:27 +0000
92a05d3c4c | added PYTHONUTF8 to start/train bats | mrq | 2023-03-14 02:29:11 +0000
dadb1fca6b | multichannel audio now report correct duration (surprised it took this long for me to source multichannel audio) | mrq | 2023-03-13 21:24:51 +0000
32d968a8cd | (disabled by default until i validate it working) added additional transcription text normalization (something else I'm experimenting with requires it) | mrq | 2023-03-13 19:07:23 +0000
66ac8ba766 | added mel LR weight (as I finally understand when to adjust the text), added text validation on dataset creation | mrq | 2023-03-13 18:51:53 +0000
ee1b048d07 | when creating the train/validation datasets, use segments if the main audio's duration is too long, and slice to make the segments if they don't exist | mrq | 2023-03-13 04:26:00 +0000
050bcefd73 | resample to 22.5K when creating training inputs (to avoid redundant downsampling when loaded for training, even though most of my inputs are already at 22.5K), generalized resampler function to cache and reuse them, do not unload whisper when done transcribing since it gets unloaded anyways for any other non-transcription task | mrq | 2023-03-13 01:20:55 +0000
7c9c0dc584 | forgot to clean up debug prints | mrq | 2023-03-13 00:44:37 +0000
239c984850 | move validating audio to creating the text files instead, consider audio longer than 11 seconds invalid, consider text lengths over 200 invalid | mrq | 2023-03-12 23:39:00 +0000
9238df0b03 | fixed last generation settings not actually loading because brain worms | mrq | 2023-03-12 15:49:50 +0000
9594a960b0 | Disable loss ETA for now until I fix it | mrq | 2023-03-12 15:39:54 +0000
51f6c347fe | Merge pull request 'updated several default configurations to not cause null/empty errors. also default samples/iterations to 16-30 ultra fast which is typically suggested.' (#122) from zim33/ai-voice-cloning:save_more_user_config into master | mrq | 2023-03-12 15:38:34 +0000
be8b290a1a | Merge branch 'master' into save_more_user_config | mrq | 2023-03-12 15:38:08 +0000
296129ba9c | output fixes, I'm not sure why ETA wasn't working but it works in testing | mrq | 2023-03-12 15:17:07 +0000
098d7ad635 | uh I don't remember, small things | mrq | 2023-03-12 14:47:48 +0000
233baa4e45 | updated several default configurations to not cause null/empty errors. also default samples/iterations to 16-30 ultra fast which is typically suggested. | tigi6346 | 2023-03-12 16:08:02 +0200
1ac278e885 | Merge pull request 'keep_training' (#118) from zim33/ai-voice-cloning:keep_training into master | mrq | 2023-03-12 06:47:01 +0000
8ed09f9b87 | Merge pull request 'Catch OOM and run whisper on cpu automatically.' (#117) from zim33/ai-voice-cloning:vram into master | mrq | 2023-03-12 05:09:53 +0000
61500107ab | Catch OOM and run whisper on cpu automatically. | tigi6346 | 2023-03-12 06:48:28 +0200
ede9804b76 | added option to trim silence using torchaudio's VAD | mrq | 2023-03-11 21:41:35 +0000
dea2fa9caf | added fields to offset start/end slices to apply in bulk when slicing | mrq | 2023-03-11 21:34:29 +0000
89bb3d4419 | rename transcribe button since it does more than transcribe | mrq | 2023-03-11 21:18:04 +0000
382a3e4104 | rely on the whisper.json for handling a lot more things | mrq | 2023-03-11 21:17:11 +0000
94551fb9ac | split slicing dataset routine so it can be done after the fact | mrq | 2023-03-11 17:27:01 +0000
e3fdb79b49 | rocm5.2 works for me desu so I bumped it back up | mrq | 2023-03-11 17:02:56 +0000
e680d84a13 | removed the hotfix pip installs that whisperx requires now that whisperx is gone | mrq | 2023-03-11 16:55:19 +0000
cf41492f76 | fall back to normal behavior if there's actually no audio files loaded from the dataset when using it for computing latents | mrq | 2023-03-11 16:46:03 +0000
2424c455cb | added option to not slice audio when transcribing, added option to prepare validation dataset on audio duration, added a warning if you're using whisperx and you're slicing audio | mrq | 2023-03-11 16:32:35 +0000
008a1f5f8f | simplified spawning the training process by having it spawn the distributed training processes in the train.py script, so it should work on Windows too | mrq | 2023-03-11 01:37:00 +0000
2feb6da0c0 | cleanups and fixes, fix DLAS throwing errors from '''too short of sound files''' by just culling them during transcription | mrq | 2023-03-11 01:19:49 +0000
7f2da0f5fb | rewrote how AIVC gets training metrics (need to clean up later) | mrq | 2023-03-10 22:35:32 +0000
df0edacc60 | fix the cleanup actually only doing 2 despite requesting more than 2, surprised no one has pointed it out | mrq | 2023-03-10 14:04:07 +0000
8e890d3023 | forgot to fix reset settings to use the new arg-agnostic way | mrq | 2023-03-10 13:49:39 +0000
0b364b590e | maybe don't --force-reinstall to try and force downgrading, it just forces everything to uninstall then reinstall | mrq | 2023-03-10 04:22:47 +0000
c231d842aa | make dependencies after the one in this repo force reinstall to downgrade, I hope, I have other things to do than validate this works | mrq | 2023-03-10 03:53:21 +0000
c92b006129 | I really hate YAML | mrq | 2023-03-10 03:48:46 +0000
d3184004fd | only God knows why the YAML spec lets you specify string values without quotes | mrq | 2023-03-10 01:58:30 +0000
eb1551ee92 | what I thought was an override and not a ternary | mrq | 2023-03-09 23:04:02 +0000
c3b43d2429 | today I learned adamw_zero actually negates ANY LR schemes | mrq | 2023-03-09 19:42:31 +0000
a01eb10960 | (try to) unload voicefixer if it raises an error during loading voicefixer | mrq | 2023-03-09 04:28:14 +0000
dc1902b91c | cleanup block that makes embedding latents for random/microphone happen, remove builtin voice options from voice list to avoid duplicates | mrq | 2023-03-09 04:23:36 +0000
797882336b | maybe remedy an issue that crops up if you have a non-wav and non-json file in a results folder (assuming) | mrq | 2023-03-09 04:06:07 +0000
b64948d966 | while I'm breaking things, migrating dependencies to modules folder for tidiness | mrq | 2023-03-09 04:03:57 +0000
b8867a5fb0 | added the mysterious tortoise_compat flag mentioned in DLAS repo | mrq | 2023-03-09 03:41:40 +0000