vall-e

mrq/vall-e

mrq e46d7ef2cb warn and ignore export when lora training because the state dict exported during training is wrong		2025-05-20 23:38:10 -05:00
ext	made muon actually work by actually utilizing param groups (thanks APOLLO for reminding me this is the sane way to handle this split)	2025-02-26 10:39:13 -06:00
__init__.py	tweaks	2025-03-02 22:36:25 -06:00
distributed.py	moved prints to use logger, edited readme (fused_attn doesnt seem stable for training)	2024-08-29 13:27:16 -05:00
io.py	added option to explicitly load a lora without having to lobotomize yourself with creating a yaml just to do so	2025-05-20 23:28:29 -05:00
ml.py	a birdie tells me i should probably use a different optimizer (also preliminary support for native sparse attention but I don't know if I'll use it)	2025-03-04 14:53:02 -06:00
pattern.py	oops, kept forgetting to actually pass in lang/tone tokens (despite not really using these at the moment)	2024-07-18 14:18:34 -05:00
sampler.py	tweaks to bucket sampling	2024-11-13 11:09:24 -06:00
trainer.py	warn and ignore export when lora training because the state dict exported during training is wrong	2025-05-20 23:38:10 -05:00
utils.py	cannot get segmented mask to actually work without gradients exploding (need to find a different way to do duration prediction...)	2025-03-27 00:51:41 -05:00