vall-e

mrq b39aaacd77 oops	2025-02-23 11:55:43 -06:00
emb	fixes fixes fixes (a quarter of my recently processed audio returned zero'd tensors......)	2025-02-22 09:07:33 -06:00
engines	separate mask token and stop token because this might cause issues	2025-02-23 11:36:32 -06:00
models	oops	2025-02-23 11:55:43 -06:00
utils	separate mask token and stop token because this might cause issues	2025-02-23 11:36:32 -06:00
__init__.py	Rewrite init	2023-08-02 21:53:35 +00:00
__main__.py	added option to playback audio directly, removed no-phonemize option since I swear it worked in testing but it doesn't actually work	2025-01-12 21:52:49 -06:00
config.py	fixes fixes fixes (a quarter of my recently processed audio returned zero'd tensors......)	2025-02-22 09:07:33 -06:00
data.py	(finally) added parallel AR for cfg.model.version >= 7 (nvidia/audio-codec-44khz is being a pain and it might require training purely AR first......)	2025-02-23 08:31:03 -06:00
demo.py	sanity checks (and I realized that the model actually had langs set to 4 in the yaml for KO/ZH so................	2024-12-19 19:08:57 -06:00
export.py		2024-12-26 21:42:17 -06:00
inference.py	added muon optimizer through kludge hacks because it necessitates a second optimizer in tandem that seems to only sometimes work with deepspeed	2025-02-23 11:22:13 -06:00
metrics.py	instead just compute a bunch of stuff on the transcriptions to store later in different names so I can just retrieve what I want, also added tongue twisters for nefarious reasons	2024-12-18 23:43:11 -06:00
plot.py	very, very naive layerskip speculative sampling (it just checks if the current layer's state is good enough)	2024-11-02 11:49:05 -05:00
samplers.py	agony	2025-02-12 00:18:24 -06:00
train.py	oops	2025-01-05 23:53:17 -06:00
webui.py	added option to playback audio directly, removed no-phonemize option since I swear it worked in testing but it doesn't actually work	2025-01-12 21:52:49 -06:00