vall-e/vall_e/models/arch
2025-02-23 11:22:13 -06:00
attention updated mixtral backend (need this for something else) 2025-01-20 21:50:56 -06:00
__init__.py ugh 2025-02-09 13:02:51 -06:00
bitnet.py
llama.py agony 2025-02-12 00:18:24 -06:00
mamba.py touch ups in docs 2024-12-02 19:10:42 -06:00
mixtral.py oops 2025-01-21 11:59:24 -06:00
retnet.py
transformer.py added muon optimizer through kludge hacks because it necessitates a second optimizer in tandem that seems to only sometimes work with deepspeed 2025-02-23 11:22:13 -06:00