|
32d968a8cd
|
(disabled by default until I validate it works) added additional transcription text normalization (something else I'm experimenting with requires it)
|
2023-03-13 19:07:23 +00:00 |
|
|
66ac8ba766
|
added mel LR weight (as I finally understand when to adjust the text one), added text validation on dataset creation
|
2023-03-13 18:51:53 +00:00 |
|
|
ee1b048d07
|
when creating the train/validation datasets, use segments if the main audio's duration is too long, and slice to create the segments if they don't exist
|
2023-03-13 04:26:00 +00:00 |
|
|
0cf9db5e69
|
oops
|
2023-03-13 01:33:45 +00:00 |
|
|
050bcefd73
|
resample to 22.5K when creating training inputs (to avoid redundant downsampling when loaded for training, even though most of my inputs are already at 22.5K), generalized the resampler function to cache and reuse resamplers, do not unload whisper when done transcribing since it gets unloaded anyway for any other non-transcription task
|
2023-03-13 01:20:55 +00:00 |
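A hedged sketch of the cached-resampler idea described above, assuming a torchaudio-based pipeline; the function name, cache layout, and the 22,050 Hz default are illustrative assumptions rather than the repo's actual code:

```python
import torch
import torchaudio

# cache one Resample transform per (source rate, target rate) pair so repeated
# conversions reuse the same kernel instead of rebuilding it for every clip
_RESAMPLERS = {}

def resample(waveform: torch.Tensor, orig_freq: int, new_freq: int = 22050) -> torch.Tensor:
    # hypothetical helper; 22050 Hz stands in for the "22.5K" target in the commit
    if orig_freq == new_freq:
        return waveform
    key = (orig_freq, new_freq)
    if key not in _RESAMPLERS:
        _RESAMPLERS[key] = torchaudio.transforms.Resample(orig_freq, new_freq)
    return _RESAMPLERS[key](waveform)
```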
|
|
7c9c0dc584
|
forgot to clean up debug prints
|
2023-03-13 00:44:37 +00:00 |
|
|
239c984850
|
move audio validation to when the text files are created instead, consider audio longer than 11 seconds invalid, consider text lengths over 200 invalid
|
2023-03-12 23:39:00 +00:00 |
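A minimal sketch of the validity checks named in the commit above; the 11-second and 200-length cutoffs come from the message itself, while the names and the assumption that text length is measured in characters are illustrative:

```python
# thresholds taken from the commit message; text length assumed to be characters
MAX_AUDIO_SECONDS = 11.0
MAX_TEXT_LENGTH = 200

def is_valid_pair(duration_seconds: float, text: str) -> bool:
    """Hypothetical filter applied while writing the dataset text files."""
    if duration_seconds > MAX_AUDIO_SECONDS:
        return False  # clip too long for training
    if len(text) > MAX_TEXT_LENGTH:
        return False  # transcript too long
    return True
```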
|
|
51ddc205cd
|
update submodules
|
2023-03-12 18:14:36 +00:00 |
|
|
ccbf2e6aff
|
blame mrq/ai-voice-cloning#122
|
2023-03-12 17:51:52 +00:00 |
|
|
9238df0b03
|
fixed last generation settings not actually loading because brain worms
|
2023-03-12 15:49:50 +00:00 |
|
|
9594a960b0
|
Disable loss ETA for now until I fix it
|
2023-03-12 15:39:54 +00:00 |
|
|
296129ba9c
|
output fixes; I'm not sure why the ETA wasn't working, but it works in testing
|
2023-03-12 15:17:07 +00:00 |
|
|
098d7ad635
|
uh I don't remember, small things
|
2023-03-12 14:47:48 +00:00 |
|
|
29b3d1ae1d
|
Fixed Keep X Previous States
|
2023-03-12 08:01:08 +02:00 |
|
|
61500107ab
|
Catch OOM and run whisper on CPU automatically.
|
2023-03-12 06:48:28 +02:00 |
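A hedged sketch of the OOM fallback described above, using openai-whisper's public load_model/transcribe API; the helper name and retry structure are assumptions, not the repo's actual code:

```python
import torch
import whisper

def transcribe_with_fallback(audio_path: str, model_name: str = "base") -> dict:
    """Try whisper on the GPU, and retry on the CPU if CUDA runs out of memory."""
    try:
        model = whisper.load_model(model_name, device="cuda")
        return model.transcribe(audio_path)
    except RuntimeError as e:
        if "out of memory" not in str(e).lower():
            raise
        torch.cuda.empty_cache()  # release whatever was partially allocated
        model = whisper.load_model(model_name, device="cpu")
        return model.transcribe(audio_path)
```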
|
|
ede9804b76
|
added option to trim silence using torchaudio's VAD
|
2023-03-11 21:41:35 +00:00 |
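A small sketch of silence trimming with torchaudio's VAD; torchaudio.functional.vad only trims leading silence, so flipping and trimming again handles the tail. That flip trick is a common pattern but an assumption about how the option works in the repo:

```python
import torch
import torchaudio

def trim_silence(waveform: torch.Tensor, sample_rate: int) -> torch.Tensor:
    trimmed = torchaudio.functional.vad(waveform, sample_rate)           # trim leading silence
    trimmed = torchaudio.functional.vad(trimmed.flip(-1), sample_rate)   # flip and trim the tail
    return trimmed.flip(-1)                                              # restore original order
```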
|
|
dea2fa9caf
|
added fields to offset start/end slices to apply in bulk when slicing
|
2023-03-11 21:34:29 +00:00 |
|
|
382a3e4104
|
rely on the whisper.json for handling a lot more things
|
2023-03-11 21:17:11 +00:00 |
|
|
9b376c381f
|
brain worm
|
2023-03-11 18:14:32 +00:00 |
|
|
94551fb9ac
|
split slicing dataset routine so it can be done after the fact
|
2023-03-11 17:27:01 +00:00 |
|
|
e3fdb79b49
|
rocm5.2 works for me desu so I bumped it back up
|
2023-03-11 17:02:56 +00:00 |
|
|
cf41492f76
|
fall back to normal behavior if there are actually no audio files loaded from the dataset when using it for computing latents
|
2023-03-11 16:46:03 +00:00 |
|
|
b90c164778
|
Farewell, parasite
|
2023-03-11 16:40:34 +00:00 |
|
|
2424c455cb
|
added option to not slice audio when transcribing, added option to prepare the validation dataset based on audio duration, added a warning if you're using whisperx and you're slicing audio
|
2023-03-11 16:32:35 +00:00 |
|
|
008a1f5f8f
|
simplified spawning the training process by having it spawn the distributed training processes in the train.py script, so it should work on Windows too
|
2023-03-11 01:37:00 +00:00 |
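A rough illustration of spawning per-GPU workers from inside the training script with torch.multiprocessing, which avoids platform-specific launchers and so also works on Windows; the worker body and environment defaults are illustrative assumptions, not the repo's actual train.py:

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank: int, world_size: int):
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    # "gloo" is used here because NCCL is unavailable on Windows
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    # ... build the model and dataloader, run the training loop ...
    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = max(torch.cuda.device_count(), 1)
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```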
|
|
2feb6da0c0
|
cleanups and fixes; fix DLAS throwing errors from '''too short''' sound files by just culling them during transcription
|
2023-03-11 01:19:49 +00:00 |
|
|
7f2da0f5fb
|
rewrote how AIVC gets training metrics (need to clean up later)
|
2023-03-10 22:35:32 +00:00 |
|
|
df0edacc60
|
fix the cleanup only doing 2 despite requesting more than 2; surprised no one has pointed it out
|
2023-03-10 14:04:07 +00:00 |
|
|
8e890d3023
|
forgot to fix reset settings to use the new arg-agnostic way
|
2023-03-10 13:49:39 +00:00 |
|
|
c92b006129
|
I really hate YAML
|
2023-03-10 03:48:46 +00:00 |
|
|
eb1551ee92
|
what I thought was an override and not a ternary
|
2023-03-09 23:04:02 +00:00 |
|
|
c3b43d2429
|
today I learned adamw_zero actually negates ANY LR scheme
|
2023-03-09 19:42:31 +00:00 |
|
|
cb273b8428
|
cleanup
|
2023-03-09 18:34:52 +00:00 |
|
|
7c71f7239c
|
expose options for CosineAnnealingLR_Restart (seems to be able to train very quickly due to the restarts)
|
2023-03-09 14:17:01 +00:00 |
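CosineAnnealingLR_Restart is DLAS's own scheduler; as a rough stand-in for the restart behaviour, here is PyTorch's built-in CosineAnnealingWarmRestarts, which periodically resets the learning rate back to its peak (the step counts and rates below are placeholders, not the exposed defaults):

```python
import torch

model = torch.nn.Linear(10, 10)  # stand-in model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer,
    T_0=500,       # steps before the first restart
    T_mult=2,      # each subsequent cycle lasts twice as long
    eta_min=1e-6,  # floor the LR decays toward within a cycle
)

for step in range(2000):
    # ... forward/backward would go here ...
    optimizer.step()
    scheduler.step()
```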
|
|
5460e191b0
|
added loss graph, because I'm going to experiment with cosine annealing LR and I need to view my loss
|
2023-03-09 05:54:08 +00:00 |
|
|
a182df8f4e
|
is
|
2023-03-09 04:33:12 +00:00 |
|
|
a01eb10960
|
(try to) unload voicefixer if it raises an error while loading
|
2023-03-09 04:28:14 +00:00 |
|
|
dc1902b91c
|
clean up the block that generates embedding latents for random/microphone, remove builtin voice options from the voice list to avoid duplicates
|
2023-03-09 04:23:36 +00:00 |
|
|
797882336b
|
maybe remedy an issue that crops up if you have a non-wav and non-json file in a results folder (assuming)
|
2023-03-09 04:06:07 +00:00 |
|
|
3b4f4500d1
|
when you have three separate machines running and you test on one, but you accidentally revert changes because you then test on another
|
2023-03-09 03:26:18 +00:00 |
|
|
ef75dba995
|
I hate that commas make tuples
|
2023-03-09 02:43:05 +00:00 |
|
|
f795dd5c20
|
you might be wondering why so many small commits instead of rolling the HEAD back one to just combine them: I don't want to force push and roll back the paperspace I'm testing in
|
2023-03-09 02:31:32 +00:00 |
|
|
51339671ec
|
typo
|
2023-03-09 02:29:08 +00:00 |
|
|
1b18b3e335
|
forgot to save the simplified training input JSON first before touching any of the settings that dump to the YAML
|
2023-03-09 02:27:20 +00:00 |
|
|
0e80e311b0
|
added VRAM validation for a given batch:gradient accumulation size ratio (based empirically off of 6GiB, 16GiB, and 16x2GiB; would be nice to have more data on what's safe)
|
2023-03-09 02:08:06 +00:00 |
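The real thresholds were tuned empirically per the commit above; the sketch below only illustrates the shape of such a check (warn when the per-step batch, i.e. batch size divided by gradient accumulation, looks too large for the detected VRAM), with a placeholder heuristic rather than the repo's actual numbers:

```python
import torch

def validate_batch_settings(batch_size: int, gradient_accumulation: int) -> list[str]:
    """Hypothetical VRAM sanity check; the 2 GiB-per-sample heuristic is a placeholder."""
    warnings = []
    if gradient_accumulation and batch_size % gradient_accumulation != 0:
        warnings.append("batch size should be divisible by the gradient accumulation size")
    per_step = batch_size // max(gradient_accumulation, 1)

    if torch.cuda.is_available():
        vram_gib = torch.cuda.get_device_properties(0).total_memory / (1024 ** 3)
        cap = max(int(vram_gib // 2), 1)  # placeholder: ~1 sample per step per 2 GiB
        if per_step > cap:
            warnings.append(
                f"batch:gradient accumulation ratio of {per_step} may not fit in {vram_gib:.0f}GiB of VRAM"
            )
    return warnings
```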
|
|
ef7b957fff
|
oops
|
2023-03-09 00:53:00 +00:00 |
|
|
b0baa1909a
|
forgot template
|
2023-03-09 00:32:35 +00:00 |
|
|
3f321fe664
|
big cleanup to make my life easier when I add more parameters
|
2023-03-09 00:26:47 +00:00 |
|
|
0ab091e7ff
|
oops
|
2023-03-08 16:09:29 +00:00 |
|
|
34dcb845b5
|
actually make using the adamw_zero optimizer for multi-GPU work
|
2023-03-08 15:31:33 +00:00 |
|