vall-e/vall_e/models
2024-11-01 18:36:44 -05:00
..
arch actually float16(+AMP) and layerskip is bad and will kill the model...... 2024-11-01 18:36:44 -05:00
__init__.py added option to load from a model state dict directly instead of a yaml (to-do: do this for LoRAs too), automatically download the default model if none is provided 2024-10-25 22:15:15 -05:00
ar_nar.py third time's the charm (for some reason it escaped me that I should treat early exit loss as an aux_loss to be used with the normal loss, as if I was training a MoE's router) 2024-11-01 12:50:37 -05:00
ar.py added prefixing with silence (was to test something, currently hidden under cfg.experimental=True) 2024-10-18 17:19:52 -05:00
base.py third time's the charm (for some reason it escaped me that I should treat early exit loss as an aux_loss to be used with the normal loss, as if I was training a MoE's router) 2024-11-01 12:50:37 -05:00
experimental.py moved prints to use logger, edited readme (fused_attn doesnt seem stable for training) 2024-08-29 13:27:16 -05:00
lora.py
nar.py layer skip training implemented (need to gut the inferencing from the repo, and to actually see if the model can benefit from this) 2024-10-30 20:05:45 -05:00