vall-e/vall_e/utils
2025-02-28 01:06:38 -06:00
ext made muon actually work by actually utilizing param groups (thanks APOLLO for reminding me this is the sane way to handle this split) 2025-02-26 10:39:13 -06:00
__init__.py ugh 2025-02-28 01:06:38 -06:00
distributed.py
io.py agony 2024-12-21 22:52:10 -06:00
ml.py borrowed muon since it might better work under deepspeed and not require cruft (even though it really does not like the masked-NAR), also make the masked-NAR faux-causal since it might better help out for cfg.model.version >= 7 2025-02-23 17:23:24 -06:00
pattern.py
sampler.py tweaks to bucket sampling 2024-11-13 11:09:24 -06:00
trainer.py when you already had these ideas to stabilize training but you just ignored them 2025-02-27 23:39:20 -06:00
utils.py ugh 2025-02-28 01:06:38 -06:00