vall-e

mrq/vall-e

mrq c2a436d368 somehow between training sessions grad_norm = None even though it worked before		2024-06-02 08:29:27 -05:00
emb	DAC just doesn't work well enough......	2024-05-25 11:07:52 -05:00
engines	somehow between training sessions grad_norm = None even though it worked before	2024-06-02 08:29:27 -05:00
ext	backwards compat for old YAMLs with `models`, option to set flash attention 2 for Llama (and derivatives), included `syncdoth/RetNet`s torchscale retnet for shits and grins, etc.	2024-04-16 10:02:31 -05:00
models	added model config option to set KV head count for MQA/GQA instead of MHA for llama-based models (i think its very negligible both ways on such a small model size)	2024-05-31 19:32:37 -05:00
utils	reverted automatically disabling split loss calc, since it seems that it's actually calculating loss on prom that causes the oddities, maybe	2024-06-01 12:34:59 -05:00
__init__.py	Rewrite init	2023-08-02 21:53:35 +00:00
__main__.py	deprecate sole AR/NAR model by only keeping the AR+NAR (the beauty of no one using this is that I can break compat as much as I want), add tone token for when I classify my dataset with tone/emotion in the future, some other things	2024-04-15 19:54:32 -05:00
config.py	reverted automatically disabling split loss calc, since it seems that it's actually calculating loss on prom that causes the oddities, maybe	2024-06-01 12:34:59 -05:00
data.py	ugh	2024-06-01 10:46:42 -05:00
export.py	cleanup, use deepspeed inferencing pathway if requested	2023-10-09 15:24:04 -05:00
inference.py	DAC just doesn't work well enough......	2024-05-25 11:07:52 -05:00
plot.py	deprecate sole AR/NAR model by only keeping the AR+NAR (the beauty of no one using this is that I can break compat as much as I want), add tone token for when I classify my dataset with tone/emotion in the future, some other things	2024-04-15 19:54:32 -05:00
samplers.py	separated samplers into its own file, don't bother copying the logits back to the GPU after sampling, it's not necessary	2023-10-11 12:25:31 -05:00
train.py	nevermind it breaks training	2024-05-25 18:03:43 -05:00
webui.py	backwards compat for old YAMLs with `models`, option to set flash attention 2 for Llama (and derivatives), included `syncdoth/RetNet`s torchscale retnet for shits and grins, etc.	2024-04-16 10:02:31 -05:00