vall-e

mrq/vall-e

mrq 9c198eb75a added torchscale XMOE integration (because Mixtral 8x7B seems very promising and I want to see if it works)	2023-12-20 18:45:58 -06:00
__init__.py	tweaks to try and get deepspeed quantized inferencing, validating bitsandbytes and deepspeed quantization, nothing seems to work	2023-10-12 22:21:43 -05:00
adaln.py	Tweaks	2023-08-02 22:06:39 +00:00
ar_nar.py	added torchscale XMOE integration (because Mixtral 8x7B seems very promising and I want to see if it works)	2023-12-20 18:45:58 -06:00
ar.py	actually use langs from the dataloader	2023-10-11 21:21:50 -05:00
base.py	added torchscale XMOE integration (because Mixtral 8x7B seems very promising and I want to see if it works)	2023-12-20 18:45:58 -06:00
nar.py	actually use langs from the dataloader	2023-10-11 21:21:50 -05:00
retnet.py	restructured some things with the model to remove dead weights	2023-09-20 19:10:59 -05:00
transformer.py	added ability to disable activation checkpointing through the YAML (it is very VRAM intensive at double layer size)	2023-09-05 15:38:21 -05:00