3978921e71
forgot to make the transcription tab visible with the bark backend (god the code is a mess now, I'll suck you off if you clean this up for me (not really))
2023-04-26 04:55:10 +00:00
b6440091fb
Very, very, VERY barebones integration with Bark (documentation soon)
2023-04-26 04:48:09 +00:00
faa8da12d7
modified logic to determine valid voice folders, also allows subdirs within the folder (for example: ./voices/SH/james/ will be named SH/james)
2023-04-13 21:10:38 +00:00
02beb1dd8e
should fix #203
2023-04-13 03:14:06 +00:00
8f3e9447ba
disable diarize button
2023-04-12 20:03:54 +00:00
d8b996911c
a bunch of shit I had uncommitted over the past while pertaining to VALL-E
2023-04-12 20:02:46 +00:00
9f64153a28
fixes #185
2023-03-31 06:03:56 +00:00
4744120be2
added VALL-E inference support (very rudimentary, gimped, but it will load a model trained on a config generated through the web UI)
2023-03-31 03:26:00 +00:00
9b01377667
only include auto in the list of models under setting, nothing else
2023-03-29 19:53:23 +00:00
f66281f10c
added mixing models (shamelessly inspired from voldy's web ui)
2023-03-29 19:29:13 +00:00
fd9b2e082c
x_lim and y_lim for graph
2023-03-25 02:34:14 +00:00
9856db5900
actually make parsing VALL-E metrics work
2023-03-23 15:42:51 +00:00
19c0854e6a
do not write current whisper.json if there's no changes
2023-03-22 22:24:07 +00:00
5a5fd9ca87
Added option to unsqueeze sample batches after sampling
2023-03-21 21:34:26 +00:00
9657c1d4ce
oops
2023-03-21 20:31:01 +00:00
0c2a9168f8
DLAS is PIPified (but I'm still cloning it as a submodule to make updating it easier)
2023-03-21 15:46:53 +00:00
34ef0467b9
VALL-E config edits
2023-03-20 01:22:53 +00:00
2e33bf071a
forgot to not require it to be relative
2023-03-19 22:05:33 +00:00
5cb86106ce
option to set results folder location
2023-03-19 22:03:41 +00:00
249c6019af
cleanup, metrics are grabbed for vall-e trainer
2023-03-17 05:33:49 +00:00
0408d44602
fixed reload tts being broken due to being as untouched as I am
2023-03-16 14:24:44 +00:00
f9154c4db1
fixes
2023-03-16 14:19:56 +00:00
ee8270bdfb
preparations for training an IPA-based finetune
2023-03-16 04:25:33 +00:00
363d0b09b1
added options to pick tokenizer json and diffusion model (so I don't have to add it in later when I get bored and add in diffusion training)
2023-03-15 00:37:38 +00:00
07b684c4e7
removed redundant training data (they exist within tortoise itself anyways), added utility: view tokenized text
2023-03-14 21:51:27 +00:00
469dd47a44
fixes #131
2023-03-14 18:58:03 +00:00
4b952ea52a
fixes #132
2023-03-14 18:46:20 +00:00
fe03ae5839
fixes
2023-03-14 17:42:42 +00:00
54036fd780
:)
2023-03-14 05:02:14 +00:00
66ac8ba766
added mel LR weight (as I finally understand when to adjust the text), added text validation on dataset creation
2023-03-13 18:51:53 +00:00
ccbf2e6aff
blame mrq/ai-voice-cloning#122
2023-03-12 17:51:52 +00:00
9238df0b03
fixed last generation settings not actually loading because brain worms
2023-03-12 15:49:50 +00:00
9594a960b0
Disable loss ETA for now until I fix it
2023-03-12 15:39:54 +00:00
mrq
be8b290a1a
Merge branch 'master' into save_more_user_config
2023-03-12 15:38:08 +00:00
098d7ad635
uh I don't remember, small things
2023-03-12 14:47:48 +00:00
233baa4e45
updated several default configurations to not cause null/empty errors. also default samples/iterations to 16-30 ultra fast which is typically suggested.
2023-03-12 16:08:02 +02:00
9e320a34c8
Fixed Keep X Previous States
2023-03-12 08:00:03 +02:00
ede9804b76
added option to trim silence using torchaudio's VAD
2023-03-11 21:41:35 +00:00
dea2fa9caf
added fields to offset start/end slices to apply in bulk when slicing
2023-03-11 21:34:29 +00:00
89bb3d4419
rename transcribe button since it does more than transcribe
2023-03-11 21:18:04 +00:00
382a3e4104
rely on the whisper.json for handling a lot more things
2023-03-11 21:17:11 +00:00
94551fb9ac
split slicing dataset routine so it can be done after the fact
2023-03-11 17:27:01 +00:00
2424c455cb
added option to not slice audio when transcribing, added option to prepare validation dataset on audio duration, added a warning if you're using whisperx and you're slicing audio
2023-03-11 16:32:35 +00:00
tigi6346
dcdcf8516c
master (#112)
Fixes Gradio bugging out when attempting to load a missing train.json.
Reviewed-on: mrq/ai-voice-cloning#112
Co-authored-by: tigi6346 <tigi6346@noreply.localhost>
Co-committed-by: tigi6346 <tigi6346@noreply.localhost>
2023-03-11 03:28:04 +00:00
7f2da0f5fb
rewrote how AIVC gets training metrics (need to clean up later)
2023-03-10 22:35:32 +00:00
8e890d3023
forgot to fix reset settings to use the new arg-agnostic way
2023-03-10 13:49:39 +00:00
cb273b8428
cleanup
2023-03-09 18:34:52 +00:00
7c71f7239c
expose options for CosineAnnealingLR_Restart (seems to be able to train very quickly due to the restarts)
2023-03-09 14:17:01 +00:00
2f6dd9c076
some cleanup
2023-03-09 06:20:05 +00:00
5460e191b0
added loss graph, because I'm going to experiment with cosine annealing LR and I need to view my loss
2023-03-09 05:54:08 +00:00