vall-e/vall_e
2025-03-25 18:53:06 -05:00
emb fixed dac 2025-03-12 23:17:27 -05:00
engines added more notes (although I could have sworn I have had more notes that I can't recall) 2025-03-25 18:53:06 -05:00
models fixed errant index error (although it makes me wonder if my segmented masking is still flawed) 2025-03-21 23:41:34 -05:00
utils stuff for interfacing with the loss scaler value (because I want to cap it) 2025-03-06 17:07:29 -06:00
__init__.py Rewrite init 2023-08-02 21:53:35 +00:00
__main__.py added option to playback audio directly, removed no-phonemize option since I swear it worked in testing but it doesn't actually work 2025-01-12 21:52:49 -06:00
config.py add segmented sliding attention, also found a bug with prom-less segments in the attention mask generation......... 2025-03-21 19:05:49 -05:00
data.py another dataloader optimization 2025-03-15 20:18:58 -05:00
demo.py ugh 2025-02-28 01:06:38 -06:00
export.py 2024-12-26 21:42:17 -06:00
inference.py actually do duration prediction 2025-03-11 22:14:54 -05:00
metrics.py instead just compute a bunch of stuff on the transcriptions to store later in different names so I can just retrieve what I want, also added tongue twisters for nefarious reasons 2024-12-18 23:43:11 -06:00
plot.py very, very naive layerskip speculative sampling (it just checks if the current layer's state is good enough) 2024-11-02 11:49:05 -05:00
samplers.py agony 2025-02-12 00:18:24 -06:00
train.py ugh 2025-03-15 16:50:21 -05:00
webui.py more tweaks to the new implementation (properly trim the len stuff to save some params, decoder to d_ffn expansion to 2 to maybe also make it faster, etc.) 2025-03-18 19:34:37 -05:00