mrq / vall-e
vall-e / vall_e / models

Latest commit e513d2ef19 by mrq (2023-12-23 16:08:17 -06:00):
    experts weren't forwarded into constructer (wasted a few days of training garbage)
..
__init__.py
    experts weren't forwarded into constructer (wasted a few days of training garbage) (2023-12-23 16:08:17 -06:00)
adaln.py
ar_nar.py
    added LLaMA/Mixtral (if experts>1) model arches, utilize XMoE's loss as well, set MoE frequency to 1 to make every layer MoE'd for RetNet, etc. (going to do tests without burning out again to see how things go) (2023-12-22 19:27:36 -06:00)
ar.py
    actually use langs from the dataloader (2023-10-11 21:21:50 -05:00)
base.py
    added LLaMA/Mixtral (if experts>1) model arches, utilize XMoE's loss as well, set MoE frequency to 1 to make every layer MoE'd for RetNet, etc. (going to do tests without burning out again to see how things go) (2023-12-22 19:27:36 -06:00)
nar.py
    actually use langs from the dataloader (2023-10-11 21:21:50 -05:00)
retnet.py
    restructured some things with the model to remove dead weights (2023-09-20 19:10:59 -05:00)
transformer.py
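The ar_nar.py and base.py commit messages mention gating MoE on an expert count (experts>1) and setting a "MoE frequency" of 1 so every layer is MoE'd. A minimal sketch of how such a frequency flag typically selects MoE layers; all names here (ModelConfig, moe_layers, the field names) are hypothetical illustrations, not vall-e's actual API:

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    # hypothetical field names; vall-e's real config may differ
    n_layers: int = 12
    experts: int = 8   # >1 enables the MoE model arches
    moe_freq: int = 1  # every moe_freq-th layer gets a MoE block

def moe_layers(cfg: ModelConfig) -> list[int]:
    """Return indices of layers whose feed-forward block is MoE'd."""
    if cfg.experts <= 1 or cfg.moe_freq <= 0:
        # single-expert model: no MoE layers at all
        return []
    # moe_freq == 1 selects every layer, matching the commit's intent
    return [i for i in range(cfg.n_layers) if i % cfg.moe_freq == 0]
```

With the defaults above, every one of the 12 layers is MoE'd; setting experts=1 disables MoE entirely, and moe_freq=2 would alternate dense and MoE layers.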