vall-e

mrq/vall-e

mrq ceecac6ffe I think I made resp_parallel_training=True faster with loss factoring?		2025-02-26 23:13:32 -06:00
__init__.py	lol	2025-02-26 10:49:06 -06:00
base.py	I think I made resp_parallel_training=True faster with loss factoring?	2025-02-26 23:13:32 -06:00
deepspeed.py	added muon optimizer through kludge hacks because it necessitates a second optimizer in tandem that seems to only sometimes work with deepspeed	2025-02-23 11:22:13 -06:00