vall-e

mrq/vall-e

mrq cbf6b84e27 fixed grad norm and loss scale not reporting for local trainer		2025-02-23 19:08:26 -06:00
ext	fixed grad norm and loss scale not reporting for local trainer	2025-02-23 19:08:26 -06:00
__init__.py
distributed.py
io.py
ml.py	borrowed muon since it might better work under deepspeed and not require cruft (even though it really does not like the masked-NAR), also make the masked-NAR faux-causal since it might better help out for cfg.model.version >= 7	2025-02-23 17:23:24 -06:00
pattern.py
sampler.py
trainer.py
utils.py