Commit Graph

31 Commits

Author SHA1 Message Date
mrq
ad024f400f actually pass language into dataset process script, fix coercing japanese into hiragana because espeak does not like kanji 2024-07-21 23:21:37 -05:00
mrq
7b210d9738 sanity cleanup 2024-07-04 15:58:08 -05:00
mrq
db62e55a38 oops, I forgot to use the new thing for audio_backend 2024-07-04 14:54:11 -05:00
mrq
7feeb944a0 probably insane with even entertaining going this route 2024-06-03 20:26:27 -05:00
mrq
ddbacde0d1 DAC just doesn't work well enough...... 2024-05-25 11:07:52 -05:00
mrq
74e531d391 ugh 2024-05-18 12:02:56 -05:00
mrq
59ef9461f8 ugh 2024-05-18 10:13:58 -05:00
mrq
d9aabfa3ae final tweaks, hopefully, again 2024-05-15 23:04:19 -05:00
mrq
2437a86efa ugh 2024-05-12 13:02:15 -05:00
mrq
4f1593c8db a bunch of shit to salvage my old encodec-quantized audio because dac-encoded audio just does not want to converge 2024-05-12 10:17:29 -05:00
mrq
c6e0f905b5 final tweaks (again) before training restarts 2024-05-08 02:11:38 -05:00
mrq
8aa1b2dabf documentation update 2024-05-04 21:03:46 -05:00
mrq
caad7ee3c9 final tweaks, hopefully 2024-04-28 22:28:29 -05:00
mrq
ffc334cf58 added dataset transcription helper script (now I don't ever have to touch ai-voice-cloning) (to-do: unify scripts into the module) 2024-04-21 17:43:20 -05:00
mrq
071fb97777 dataset preparation script updates, caved and am using HF tokenizer now 2024-04-21 14:49:18 -05:00
mrq
a8ffa88844 it slipped my mind that technically DAC can be used at any sample rate, since it models waveforms; make it a config YAML option to allow this behavior 2024-04-19 18:36:54 -05:00
mrq
00804a47e9 Forgot to copy intermediary dataset conversion script 2024-04-18 21:34:28 -05:00
mrq
4f5c9e518a actually use the passed-through sample rate from encode for DAC because it does its own resampling I guess 2024-04-18 13:32:41 -05:00
mrq
09cda7d3f9 added sampling by speaker group name (might be better to de-emphasize the LibriVox/Audiobooks that are in large numbers, and emphasize the smaller pools), log cleanup 2023-10-16 19:30:38 -05:00
mrq
2deb995cc9 updated setup script 2023-10-06 20:08:28 -05:00
mrq
1fd91b6437 cleanup 2023-10-06 10:13:54 -05:00
mrq
3db7e7dea1 implicitly load checkpoint if deepspeed checkpoint not found, updated setup script to grab the diskcached dataloader things 2023-10-06 10:02:45 -05:00
mrq
2f2505b12f updated setup script 2023-10-06 08:08:28 -05:00
mrq
153f8b293c added min-x and min-y arguments to plot.py, helper script to download from my existing checkpoint 2023-10-04 19:41:37 -05:00
mrq
5ac119a6e7 added light web UI (need to port the telemetry disabling bandaids from aivc) 2023-09-09 16:17:20 -05:00
mrq
4613781e23 integrated plot script, added tts-c task token to help the model be able to mix between normal VALL-E and VALL-E continuous 2023-09-02 16:29:53 -05:00
mrq
f7e942ec99 modified plotting script to be more agnostic to X 2023-09-02 13:59:43 -05:00
mrq
21e5d250cc fixed up plot script that I forgot about 2023-09-02 13:31:04 -05:00
mrq
5c8694db8e nasty bandaid if there's no validation dataset specified during training (for example, during finetunes) 2023-08-30 18:23:05 -05:00
mrq
7b3be3d7bf added helper scripts to process LibriTTS/LibriLight, detect duplicate speaker+books between them, and script to directly phonemize and quantize LibriTTS 2023-08-26 10:21:12 -05:00
mrq
bf8cedc9dd Rewrite init 2023-08-02 21:53:35 +00:00