mrq/vall-e

mrq ec25f56bd9 used torch.max fixes things, somehow, for dynamic temp sampling		2023-10-10 16:42:24 -05:00
__init__.py	added option to use SGD optimizer through the YAML, added option to pass in additional optimizer parameters through the YAML, added experimental unified AR+NAR model (does not seem fruitful in testing)	2023-09-06 18:58:35 -05:00
adaln.py	Tweaks	2023-08-02 22:06:39 +00:00
ar_nar.py	trim the input prompt to 3 seconds when training NAR tasks (marked as experimental; the paper mentions doing so, but I don't know how much this would harm the retention heads)	2023-10-09 22:03:58 -05:00
ar.py	restructured some things with the model to remove dead weights	2023-09-20 19:10:59 -05:00
base.py	used torch.max fixes things, somehow, for dynamic temp sampling	2023-10-10 16:42:24 -05:00
nar.py	trim the input prompt to 3 seconds when training NAR tasks (marked as experimental; the paper mentions doing so, but I don't know how much this would harm the retention heads)	2023-10-09 22:03:58 -05:00
retnet.py	restructured some things with the model to remove dead weights	2023-09-20 19:10:59 -05:00
transformer.py	added ability to disable activation checkpointing through the YAML (it is very VRAM intensive at double layer size)	2023-09-05 15:38:21 -05:00