mrq/vall-e
b640fabab5
vall-e/vall_e/utils
mrq
b640fabab5
borrowed muon since it might better work under deepspeed and not require cruft (even though it really does not like the masked-NAR, also make the masked-NAR faux-causal since it might better help out for cfg.model.version >= 7
2025-02-23 17:23:24 -06:00
ext
borrowed muon since it might better work under deepspeed and not require cruft (even though it really does not like the masked-NAR, also make the masked-NAR faux-causal since it might better help out for cfg.model.version >= 7
2025-02-23 17:23:24 -06:00
__init__.py
added WER/SIM-O metrics, added APOLLO but I need to test it
2024-12-10 20:13:21 -06:00
distributed.py
io.py
agony
2024-12-21 22:52:10 -06:00
ml.py
borrowed muon since it might better work under deepspeed and not require cruft (even though it really does not like the masked-NAR, also make the masked-NAR faux-causal since it might better help out for cfg.model.version >= 7
2025-02-23 17:23:24 -06:00
pattern.py
sampler.py
tweaks to bucket sampling
2024-11-13 11:09:24 -06:00
trainer.py
added muon optimizer through kludge hacks because it necessitates a second optimizer in tandem that seems to only sometimes work with deepspeed
2025-02-23 11:22:13 -06:00
utils.py
added muon optimizer through kludge hacks because it necessitates a second optimizer in tandem that seems to only sometimes work with deepspeed
2025-02-23 11:22:13 -06:00