mrq / vall-e
vall-e / vall_e / models (History)
Latest commit: 3da1518ace by mrq, 2024-01-31 21:48:36 -06:00
    added Mistral (non-Mixtral) backend, useless optimization when not training, proper adjustment of the LR for Prodigyopt through d_coeff (maybe), recurrent sampling for LLaMA/Mistral/Mixtral backends (again, doesn't actually work)
__init__.py (2024-01-31 21:48:36 -06:00)
    added Mistral (non-Mixtral) backend, useless optimization when not training, proper adjustment of the LR for Prodigyopt through d_coeff (maybe), recurrent sampling for LLaMA/Mistral/Mixtral backends (again, doesn't actually work)

adaln.py (2023-08-02 22:06:39 +00:00)
    Tweaks

ar_nar.py (2024-01-31 21:48:36 -06:00)
    added Mistral (non-Mixtral) backend, useless optimization when not training, proper adjustment of the LR for Prodigyopt through d_coeff (maybe), recurrent sampling for LLaMA/Mistral/Mixtral backends (again, doesn't actually work)

ar.py (2023-10-11 21:21:50 -05:00)
    actually use langs from the dataloader

base.py (2024-01-31 21:48:36 -06:00)
    added Mistral (non-Mixtral) backend, useless optimization when not training, proper adjustment of the LR for Prodigyopt through d_coeff (maybe), recurrent sampling for LLaMA/Mistral/Mixtral backends (again, doesn't actually work)

nar.py (2023-10-11 21:21:50 -05:00)
    actually use langs from the dataloader

retnet.py (2023-09-20 19:10:59 -05:00)
    restructured some things with the model to remove dead weights

transformer.py (2023-09-05 15:38:21 -05:00)
    added ability to disable activation checkpointing through the YAML (it is very VRAM intensive at double layer size)