vall-e

mrq/vall-e

mrq 789bb5d11b add an optional label override for model loading (used for easy testing between 12/16/20/24 layered model)	2024-04-13 12:43:35 -05:00
emb	added sampling by speaker group name (might be better to de-emphasize the LibriVox/Audiobooks that are in large numbers, and emphasize the smaller pools), log cleanup	2023-10-16 19:30:38 -05:00
engines	added Adagrad (experimenting with it), added 'extended' model size (16 layers instead of 12, experimenting with it)	2024-04-09 22:04:01 -05:00
ext	added FP8 support through `NVIDIA/TransformerEngine`, added RetNet_HF through `syncdoth/RetNet` (as an alternative to branch away from torchscale)	2024-04-08 20:14:51 -05:00
models	added Adagrad (experimenting with it), added 'extended' model size (16 layers instead of 12, experimenting with it)	2024-04-09 22:04:01 -05:00
utils	added Adagrad (experimenting with it), added 'extended' model size (16 layers instead of 12, experimenting with it)	2024-04-09 22:04:01 -05:00
__init__.py	Rewrite init	2023-08-02 21:53:35 +00:00
__main__.py	exposed rolling resp context to the web UI, added passing in language to inferencing command line	2023-10-12 23:21:01 -05:00
config.py	add an optional label override for model loading (used for easy testing between 12/16/20/24 layered model)	2024-04-13 12:43:35 -05:00
data.py	added torchscale XMOE integration (because Mixtral 8x7B seems very promising and I want to see if it works)	2023-12-20 18:45:58 -06:00
export.py	cleanup, use deepspeed inferencing pathway if requested	2023-10-09 15:24:04 -05:00
inference.py	added Mistral (non-Mixtral) backend, useless optimization when not training, proper adjustment of the LR for Prodigyopt through d_coeff (maybe), recurrent sampling for LLaMA/Mistral/Mixtral backends (again, doesn't actually work)	2024-01-31 21:48:36 -06:00
plot.py	added min-x and min-y arguments to plot.py, helper script to download from my existing checkpoint	2023-10-04 19:41:37 -05:00
samplers.py	separated samplers into its own file, don't bother copying the logits back to the GPU after sampling, it's not necessary	2023-10-11 12:25:31 -05:00
train.py	logger broke for some reason, added flag to just tqdm.write instead, make cfg.bitsandbytes.bitnet==True yamls denoted since I'm sure they're not interoperable	2024-03-01 10:32:35 -06:00
webui.py	added torchscale XMOE integration (because Mixtral 8x7B seems very promising and I want to see if it works)	2023-12-20 18:45:58 -06:00