vall-e/scripts
2024-04-28 22:28:29 -05:00
| File | Last commit message | Last commit date |
| --- | --- | --- |
| cleanup_dataset.py | final tweaks, hopefully | 2024-04-28 22:28:29 -05:00 |
| deduplicate_librilight_libritts.py | added helper scripts to process LibriTTS/LibriLight, detect duplicate speaker+books between them, and script to directly phonemize and quantize LibriTTS | 2023-08-26 10:21:12 -05:00 |
| parse_ppp.py | added sampling by speaker group name (might be better to de-emphasize the LibriVox/Audiobooks that are in large numbers, and emphasize the smaller pools), log cleanup | 2023-10-16 19:30:38 -05:00 |
| prepare_librilight.py | dataset preparation script updates, caved and am using HF tokenizer now | 2024-04-21 14:49:18 -05:00 |
| prepare_libritts.py | added helper scripts to process LibriTTS/LibriLight, detect duplicate speaker+books between them, and script to directly phonemize and quantize LibriTTS | 2023-08-26 10:21:12 -05:00 |
| process_dataset.py | final tweaks, hopefully | 2024-04-28 22:28:29 -05:00 |
| process_libritts.py | actually use the passed-through sample rate from encode for DAC because it does its own resampling I guess | 2024-04-18 13:32:41 -05:00 |
| run.sh | nasty bandaid if there's no validation dataset specified during training (for example, during finetunes) | 2023-08-30 18:23:05 -05:00 |
| setup-training.sh | cleanup | 2023-10-06 10:13:54 -05:00 |
| setup.sh | updated setup script | 2023-10-06 20:08:28 -05:00 |
| train_tokenizer.py | dataset preparation script updates, caved and am using HF tokenizer now | 2024-04-21 14:49:18 -05:00 |
| transcribe_dataset.py | final tweaks, hopefully | 2024-04-28 22:28:29 -05:00 |