Contents of `vall-e/vall_e`:
| Name | Last commit message | Date |
| --- | --- | --- |
| `emb/` | | |
| `engines/` | added Adagrad (experimenting with it); added 'extended' model size (16 layers instead of 12, experimenting with it) | 2024-04-09 22:04:01 -05:00 |
| `ext/` | properly pass retention_mask for retnet-HF; attempt to fix the recurrent forward for retnet (still doesn't work) | 2024-04-14 13:12:50 -05:00 |
| `models/` | properly pass retention_mask for retnet-HF; attempt to fix the recurrent forward for retnet (still doesn't work) | 2024-04-14 13:12:50 -05:00 |
| `utils/` | added Adagrad (experimenting with it); added 'extended' model size (16 layers instead of 12, experimenting with it) | 2024-04-09 22:04:01 -05:00 |
| `__init__.py` | | |
| `__main__.py` | | |
| `config.py` | added an optional label override for model loading (used for easy testing between the 12/16/20/24-layer models) | 2024-04-13 12:43:35 -05:00 |
| `data.py` | added torchscale XMOE integration (because Mixtral 8x7B seems very promising and I want to see if it works) | 2023-12-20 18:45:58 -06:00 |
| `export.py` | | |
| `inference.py` | added a Mistral (non-Mixtral) backend; a (useless) optimization when not training; proper adjustment of the LR for Prodigyopt through d_coeff (maybe); recurrent sampling for the LLaMA/Mistral/Mixtral backends (again, doesn't actually work) | 2024-01-31 21:48:36 -06:00 |
| `plot.py` | | |
| `samplers.py` | | |
| `train.py` | the logger broke for some reason, so added a flag to just use tqdm.write instead; marked cfg.bitsandbytes.bitnet==True YAMLs as such, since I'm sure they're not interoperable | 2024-03-01 10:32:35 -06:00 |
| `webui.py` | added torchscale XMOE integration (because Mixtral 8x7B seems very promising and I want to see if it works) | 2023-12-20 18:45:58 -06:00 |
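For context on the optimizer experiment noted for `engines/` and `utils/` above, here is a minimal, generic PyTorch sketch of swapping in Adagrad. This is not vall_e's actual engine code; the model, learning rates, and data below are placeholders chosen only to make the snippet self-contained.

```python
# Generic sketch of trying Adagrad in place of a baseline optimizer.
# NOT vall_e's engine code; model/lr/data here are placeholders.
import torch

model = torch.nn.Linear(16, 16)  # stand-in for the real model

# Baseline choice, e.g. AdamW:
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# The experiment from the commit message: swap in Adagrad instead.
optimizer = torch.optim.Adagrad(model.parameters(), lr=1e-2)

# One dummy training step to show the optimizer in use.
x = torch.randn(8, 16)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```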