vall-e

mrq/vall-e

mrq 4456d3172b that's what I get for testing without hdf5 on my previous machine....		2024-08-02 20:44:01 -05:00
emb	added prom-less training / inferencing, some other things	2024-07-22 19:36:07 -05:00
engines	oversight with using resize_modules	2024-08-02 20:28:49 -05:00
ext
models	ugh, finally got some form of offloading working (need to test if it works on different GPUs, but GPU and CPU offloading seems to work in the test trainer)	2024-08-01 22:43:39 -05:00
utils	oversight with using resize_modules	2024-08-02 20:28:49 -05:00
__init__.py
__main__.py	added option to set the causal size (how many tokens to sample per AR step), but requires the model to be trained for this (which explains why recurrent chunk sampling just doesn't work for the retnet tests, obvious in hindsight)	2024-07-30 20:53:51 -05:00
config.py	it actually wasn't working because Engines.__init__() automatically moves the entire module to the requested device, which was being called after offloading the model in the test trainer (and it seems I cant do it without injecting a bunch of shit in modeling_llama.py)	2024-08-01 20:56:28 -05:00
data.py	that's what I get for testing without hdf5 on my previous machine....	2024-08-02 20:44:01 -05:00
demo.py
export.py	fix weird regression in handling checkpoints when backend is local, but deepspeed checkpoints are in (it was handled with LoRA loading but not real loading...)	2024-07-30 22:15:56 -05:00
inference.py	fix weird regression in handling checkpoints when backend is local, but deepspeed checkpoints are in (it was handled with LoRA loading but not real loading...)	2024-07-30 22:15:56 -05:00
plot.py
samplers.py	possible speedup for samplers that require a list of previous tokens (the DRY sampler made me realize that I should copy the tolist() thing from the rep pen sampler for everything else)	2024-07-29 20:23:26 -05:00
train.py	sanity cleanups with weird off-by-one-ness, cleaned up and validated vall_e.models.experimental works again	2024-07-27 15:36:05 -05:00
webui.py	added option to set the causal size (how many tokens to sample per AR step), but requires the model to be trained for this (which explains why recurrent chunk sampling just doesn't work for the retnet tests, obvious in hindsight)	2024-07-30 20:53:51 -05:00