vall-e/vall_e
2024-06-01 09:29:49 -05:00
emb DAC just doesn't work well enough...... 2024-05-25 11:07:52 -05:00
engines nan loss detection (should have added it earlier), loss scaling for local backend + fp16 2024-05-11 22:23:29 -05:00
ext backwards compat for old YAMLs with models, option to set flash attention 2 for Llama (and derivatives), included syncdoth/RetNets torchscale retnet for shits and grins, etc. 2024-04-16 10:02:31 -05:00
models added model config option to set KV head count for MQA/GQA instead of MHA for llama-based models (I think the difference is negligible either way at such a small model size) 2024-05-31 19:32:37 -05:00
utils some cleanup 2024-05-25 17:46:52 -05:00
__init__.py Rewrite init 2023-08-02 21:53:35 +00:00
__main__.py deprecate sole AR/NAR model by only keeping the AR+NAR (the beauty of no one using this is that I can break compat as much as I want), add tone token for when I classify my dataset with tone/emotion in the future, some other things 2024-04-15 19:54:32 -05:00
config.py actually don't default to computing split losses, test bitnet model doesn't seem to be doing things right (despite debug printouts showing they're roughly the same logit/loss sequences, could just be bitnet linears not being up to par on actual models) 2024-06-01 09:12:51 -05:00
data.py split sampler dict by global_rank, also handle splitting dataset paths by global_rank if sampler_type == path (because I do not trust DistributedSampler) (need to test) 2024-06-01 09:29:49 -05:00
export.py cleanup, use deepspeed inferencing pathway if requested 2023-10-09 15:24:04 -05:00
inference.py DAC just doesn't work well enough...... 2024-05-25 11:07:52 -05:00
plot.py deprecate sole AR/NAR model by only keeping the AR+NAR (the beauty of no one using this is that I can break compat as much as I want), add tone token for when I classify my dataset with tone/emotion in the future, some other things 2024-04-15 19:54:32 -05:00
samplers.py separated samplers into its own file, don't bother copying the logits back to the GPU after sampling, it's not necessary 2023-10-11 12:25:31 -05:00
train.py nevermind it breaks training 2024-05-25 18:03:43 -05:00
webui.py backwards compat for old YAMLs with models, option to set flash attention 2 for Llama (and derivatives), included syncdoth/RetNets torchscale retnet for shits and grins, etc. 2024-04-16 10:02:31 -05:00