vall-e/vall_e/models
2025-02-26 10:39:13 -06:00
..
arch added muon optimizer through kludge hacks because it necessitates a second optimizer in tandum that seems to only sometimes work with deepspeed 2025-02-23 11:22:13 -06:00
__init__.py 2024-12-26 21:42:17 -06:00
ar_nar.py made muon actually work by actually utilizing param groups (thanks APOLLO for reminding me this is the sane way to handle this split) 2025-02-26 10:39:13 -06:00
base.py made muon actually work by actually utilizing param groups (thanks APOLLO for reminding me this is the sane way to handle this split) 2025-02-26 10:39:13 -06:00
lora.py