a657623cbc (2023-08-24 21:45:50 +00:00): updated the VALL-E training template to use path-based speakers, because it would otherwise just have a batch/epoch size of 1; reverted the hardcoded "spit processed dataset to this path" from my training rig so the dataset is written to a sane spot
f5fab33e9c (2023-08-24 19:44:52 +00:00): fixed defaults for the VALL-E backend
6c3f48efba (2023-07-03 02:46:10 +00:00): use gitmylo/bark-voice-cloning-HuBERT-quantizer for creating custom voices (it works slightly better than the base method, but still not very well)
baa6b76b85 (2023-05-21 23:20:39 +00:00): added a gradio API for changing the AR model
31da215c5f (2023-05-21 01:47:48 +00:00): added checkboxes to use the original method for calculating latents (ignores the voice chunk field)
5003bc89d3 (2023-05-04 23:40:33 +00:00): cleaned up the convoluted wrapping around gradio progress by just using tqdm directly (slight regression: some messages are not pushed)
853c7fdccf (2023-05-03 21:31:37 +00:00): forgot to uncomment the transcribe-and-slice block used by "transcribe all"; it was commented out while piece-processing a huge batch of LibriTTS and that change leaked into the repo
c5e9b407fa (2023-04-27 14:40:22 +00:00): fixed a boolean mix-up
3978921e71 (2023-04-26 04:55:10 +00:00): forgot to make the transcription tab visible with the bark backend (the code is a mess now and badly needs a cleanup)
b6440091fb (2023-04-26 04:48:09 +00:00): very, very, VERY barebones integration with Bark (documentation soon)
faa8da12d7 (2023-04-13 21:10:38 +00:00): modified the logic that determines valid voice folders; subdirectories within the folder are now allowed (for example, ./voices/SH/james/ will be named SH/james)
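The subdirectory naming described above can be sketched with pathlib; the `voices_dir` layout and the helper name are illustrative assumptions, not the repo's actual code:

```python
from pathlib import Path

def list_voice_folders(voices_dir):
    """Return a name for every folder under voices_dir that directly
    contains at least one file, using its path relative to the root
    (e.g. ./voices/SH/james -> "SH/james")."""
    root = Path(voices_dir)
    names = []
    for folder in sorted(p for p in root.rglob("*") if p.is_dir()):
        # A folder counts as a voice only if it directly holds files;
        # intermediate folders like ./voices/SH/ are skipped.
        if any(child.is_file() for child in folder.iterdir()):
            names.append(folder.relative_to(root).as_posix())
    return names
```

Using the relative path as the voice name keeps nested voices unambiguous without any extra bookkeeping.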
02beb1dd8e (2023-04-13 03:14:06 +00:00): should fix #203
8f3e9447ba (2023-04-12 20:03:54 +00:00): disabled the diarize button
d8b996911c (2023-04-12 20:02:46 +00:00): a batch of VALL-E changes I had left uncommitted for a while
9f64153a28 (2023-03-31 06:03:56 +00:00): fixes #185
4744120be2 (2023-03-31 03:26:00 +00:00): added VALL-E inference support (very rudimentary and gimped, but it will load a model trained on a config generated through the web UI)
9b01377667 (2023-03-29 19:53:23 +00:00): only include "auto" in the list of models under Settings, nothing else
f66281f10c (2023-03-29 19:29:13 +00:00): added model mixing (shamelessly inspired by voldy's web UI)
fd9b2e082c (2023-03-25 02:34:14 +00:00): added x_lim and y_lim for the graph
9856db5900 (2023-03-23 15:42:51 +00:00): actually make parsing VALL-E metrics work
19c0854e6a (2023-03-22 22:24:07 +00:00): do not rewrite the current whisper.json if there are no changes
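A minimal sketch of that no-change guard; the helper name and the serialize-and-compare strategy are assumptions, not the repo's actual implementation:

```python
import json
from pathlib import Path

def save_if_changed(path, data):
    """Serialize `data` to JSON and only rewrite `path` when the
    on-disk contents differ, avoiding pointless writes and mtime
    churn. Returns True if a write happened."""
    path = Path(path)
    new_text = json.dumps(data, indent=4, sort_keys=True)
    if path.exists() and path.read_text() == new_text:
        return False  # unchanged: skip the write
    path.write_text(new_text)
    return True
```

Sorting the keys makes the comparison stable even when the in-memory dict is built in a different order than the one on disk.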
5a5fd9ca87 (2023-03-21 21:34:26 +00:00): added an option to unsqueeze sample batches after sampling
9657c1d4ce (2023-03-21 20:31:01 +00:00): oops
0c2a9168f8 (2023-03-21 15:46:53 +00:00): DLAS is PIPified (but I'm still cloning it as a submodule to make updating it easier)
34ef0467b9 (2023-03-20 01:22:53 +00:00): VALL-E config edits
2e33bf071a (2023-03-19 22:05:33 +00:00): forgot to stop requiring the path to be relative
5cb86106ce (2023-03-19 22:03:41 +00:00): added an option to set the results folder location
249c6019af (2023-03-17 05:33:49 +00:00): cleanup; metrics are now grabbed for the VALL-E trainer
0408d44602 (2023-03-16 14:24:44 +00:00): fixed reload TTS being broken after going untouched for too long
f9154c4db1 (2023-03-16 14:19:56 +00:00): fixes
ee8270bdfb (2023-03-16 04:25:33 +00:00): preparations for training an IPA-based finetune
363d0b09b1 (2023-03-15 00:37:38 +00:00): added options to pick the tokenizer JSON and the diffusion model (so I don't have to add them later when I get bored and add diffusion training)
07b684c4e7 (2023-03-14 21:51:27 +00:00): removed redundant training data (it already exists within tortoise itself); added a utility to view tokenized text
469dd47a44 (2023-03-14 18:58:03 +00:00): fixes #131
4b952ea52a (2023-03-14 18:46:20 +00:00): fixes #132
fe03ae5839 (2023-03-14 17:42:42 +00:00): fixes
54036fd780 (2023-03-14 05:02:14 +00:00): :)
66ac8ba766 (2023-03-13 18:51:53 +00:00): added a mel LR weight (now that I finally understand when to adjust the text one); added text validation on dataset creation
ccbf2e6aff (2023-03-12 17:51:52 +00:00): blame mrq/ai-voice-cloning#122
9238df0b03 (2023-03-12 15:49:50 +00:00): fixed the last generation settings not actually loading due to a logic slip
9594a960b0 (2023-03-12 15:39:54 +00:00): disable the loss ETA for now until I fix it
be8b290a1a (mrq, 2023-03-12 15:38:08 +00:00): Merge branch 'master' into save_more_user_config
098d7ad635 (2023-03-12 14:47:48 +00:00): small miscellaneous changes (I don't remember the specifics)
233baa4e45 (2023-03-12 16:08:02 +02:00): updated several default configurations to not cause null/empty errors; also defaulted samples/iterations to 16/30 "Ultra Fast", which is what's typically suggested
9e320a34c8 (2023-03-12 08:00:03 +02:00): fixed "Keep X Previous States"
ede9804b76 (2023-03-11 21:41:35 +00:00): added an option to trim silence using torchaudio's VAD
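The idea behind that silence trim can be sketched with a plain amplitude threshold; the repo uses torchaudio's VAD (which, as I understand it, trims from the front, so it is typically applied forward and then again on the reversed waveform), and this pure-Python stand-in with its threshold value is only illustrative:

```python
def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples whose absolute amplitude
    stays below `threshold` -- a crude stand-in for a real VAD."""
    start = 0
    end = len(samples)
    # Advance past quiet samples from the front...
    while start < end and abs(samples[start]) < threshold:
        start += 1
    # ...and back past quiet samples from the tail.
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]
```

A real VAD also tolerates brief dips below the threshold inside speech; this sketch deliberately ignores that.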
dea2fa9caf (2023-03-11 21:34:29 +00:00): added fields to offset the start/end of slices, applied in bulk when slicing
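Applying those start/end offsets in bulk can be sketched as follows; the function name, tuple representation, and clamping rules are illustrative assumptions, not the repo's actual code:

```python
def offset_slices(slices, start_offset=0.0, end_offset=0.0):
    """Shift every (start, end) slice by the given offsets in seconds,
    clamping so start never goes negative and each slice keeps a
    non-negative duration."""
    adjusted = []
    for start, end in slices:
        new_start = max(0.0, start + start_offset)
        new_end = max(new_start, end + end_offset)
        adjusted.append((new_start, new_end))
    return adjusted
```

A small negative start offset plus a small positive end offset pads each slice, which helps when transcription timestamps clip the edges of words.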
89bb3d4419 (2023-03-11 21:18:04 +00:00): renamed the transcribe button, since it does more than transcribe
382a3e4104 (2023-03-11 21:17:11 +00:00): rely on the whisper.json for handling a lot more things
94551fb9ac (2023-03-11 17:27:01 +00:00): split out the dataset slicing routine so it can be run after the fact