vall-e/vall_e/models
2024-11-01 18:36:44 -05:00
arch actually float16(+AMP) and layerskip is bad and will kill the model...... 2024-11-01 18:36:44 -05:00
__init__.py
ar_nar.py third time's the charm (for some reason it escaped me that I should treat early exit loss as an aux_loss to be used with the normal loss, as if I was training a MoE's router) 2024-11-01 12:50:37 -05:00
ar.py
base.py third time's the charm (for some reason it escaped me that I should treat early exit loss as an aux_loss to be used with the normal loss, as if I was training a MoE's router) 2024-11-01 12:50:37 -05:00
experimental.py
lora.py
nar.py