vall-e

mrq ab673e0426 add cap for NAR-len training, to avoid any weird cases in early training where it'll just mess up and generate long lengths		2024-08-03 21:00:32 -05:00
emb
engines	tweaks for the NAR-len model, maybe	2024-08-03 08:40:39 -05:00
ext
models	more coping with the NAR len	2024-08-03 20:23:36 -05:00
utils	add cap for NAR-len training, to avoid any weird cases in early training where it'll just mess up and generate long lengths	2024-08-03 21:00:32 -05:00
__init__.py
__main__.py	added option to set the causal size (how many tokens to sample per AR step), but requires the model to be trained for this (which explains why recurrent chunk sampling just doesn't work for the retnet tests, obvious in hindsight)	2024-07-30 20:53:51 -05:00
config.py	some cleanup, fixed the wrapper attention to explicitly use other sdpa backends	2024-08-03 19:51:00 -05:00
data.py	fixes, throw an exception when using NAR only model with non-unified position IDs, since for some reason it outputs garbage for the NAR	2024-08-02 22:25:49 -05:00
demo.py
export.py	fix weird regression in handling checkpoints when backend is local, but deepspeed checkpoints are in (it was handled with LoRA loading but not real loading...)	2024-07-30 22:15:56 -05:00
inference.py	fix weird regression in handling checkpoints when backend is local, but deepspeed checkpoints are in (it was handled with LoRA loading but not real loading...)	2024-07-30 22:15:56 -05:00
plot.py
samplers.py
train.py	add cap for NAR-len training, to avoid any weird cases in early training where it'll just mess up and generate long lengths	2024-08-03 21:00:32 -05:00
webui.py	added option to set the causal size (how many tokens to sample per AR step), but requires the model to be trained for this (which explains why recurrent chunk sampling just doesn't work for the retnet tests, obvious in hindsight)	2024-07-30 20:53:51 -05:00