vall-e

mrq/vall-e

mrq 387358bc8a fixes for the NAR-len model, and documentation some config options, and a better way to handle resizing modules on state_dict load		2024-07-31 20:35:09 -05:00
arch	mamba2-hf using `vasqu/mamba2-torch` because it lets me use mamba2 without triton ops (training with my 4xV100s are not happy with mamba2 because of triton)	2024-06-14 19:42:17 -05:00
__init__.py	sanity cleanup: moved experimental features under its own thing	2024-06-30 10:37:33 -05:00
ar_nar.py	added option to set the causal size (how many tokens to sample per AR step), but requires the model to be trained for this (which explains why recurrent chunk sampling just doesn't work for the retnet tests, obvious in hindsight)	2024-07-30 20:53:51 -05:00
base.py	fixes for the NAR-len model, and documentation some config options, and a better way to handle resizing modules on state_dict load	2024-07-31 20:35:09 -05:00
experimental.py	added what I think is DRY sampling	2024-07-29 19:15:07 -05:00
lora.py	some weird fixes for an equally weird regression with LoRA loading	2024-07-22 20:47:24 -05:00
nar.py	fixes for the NAR-len model, and documentation some config options, and a better way to handle resizing modules on state_dict load	2024-07-31 20:35:09 -05:00