vall-e/vall_e
2024-12-17 22:47:12 -06:00
emb actually do speaker verification 2024-12-17 10:11:14 -06:00
engines remove NaN checks because they cause problems in distributed training, since I'm not syncing between GPUs (and NaN losses get ignored anyway with loss scaling) 2024-12-15 09:42:54 -06:00
models actually do speaker verification 2024-12-17 10:11:14 -06:00
utils actually do speaker verification 2024-12-17 10:11:14 -06:00
__init__.py Rewrite init 2023-08-02 21:53:35 +00:00
__main__.py doc update, added automatically deducing language from a given text, also checks if the input is already phonemized text to allow direct control without being cringe (procrastinating adding WER/SIM-O) 2024-12-07 22:34:25 -06:00
config.py APOLLO tweaks to make it work with deepspeed 2024-12-13 23:03:52 -06:00
data.py tweaks to prompt duration to allow me to divorce how I use it for training from how I'm using it for the demo page, and demo page tweaks to make my life easier 2024-12-17 19:33:04 -06:00
demo.py cringe script to process seed-tts-eval's eval dataset into something i can easily use 2024-12-17 22:47:12 -06:00
export.py cringe code to convert to LlamaForCausalLM-happy weights + tokenizer dict (still need to write logic to actually use these weights for proper inferencing) 2024-12-03 10:18:58 -06:00
inference.py tweaks to prompt duration to allow me to divorce how I use it for training from how I'm using it for the demo page, and demo page tweaks to make my life easier 2024-12-17 19:33:04 -06:00
metrics.py actually do proper wer/cer calculation by un-normalizing the scores 2024-12-17 14:22:30 -06:00
plot.py very, very naive layerskip speculative sampling (it just checks if the current layer's state is good enough) 2024-11-02 11:49:05 -05:00
samplers.py sort batches to try and reduce number of padded tokens in batched inference (also commented out F5 samples getting added to the demo page because I would have to regenerate them) 2024-12-11 22:45:38 -06:00
train.py remove NaN checks because they cause problems in distributed training, since I'm not syncing between GPUs (and NaN losses get ignored anyway with loss scaling) 2024-12-15 09:42:54 -06:00
webui.py really shoddy voice conversion implementation (it sort of works...) 2024-12-16 22:54:53 -06:00