ecker / vall-e
vall-e / vall_e / models

Latest commit fb8faa295b by mrq: actually float16(+AMP) and layerskip is bad and will kill the model...... (2024-11-01 18:36:44 -05:00)
..
arch/
    actually float16(+AMP) and layerskip is bad and will kill the model...... (2024-11-01 18:36:44 -05:00)
__init__.py
    added option to load from a model state dict directly instead of a yaml (to-do: do this for LoRAs too), automatically download the default model if none is provided (2024-10-25 22:15:15 -05:00)
ar_nar.py
    third time's the charm (for some reason it escaped me that I should treat early exit loss as an aux_loss to be used with the normal loss, as if I was training a MoE's router) (2024-11-01 12:50:37 -05:00)
ar.py
    added prefixing with silence (was to test something, currently hidden under cfg.experimental=True) (2024-10-18 17:19:52 -05:00)
base.py
    third time's the charm (for some reason it escaped me that I should treat early exit loss as an aux_loss to be used with the normal loss, as if I was training a MoE's router) (2024-11-01 12:50:37 -05:00)
experimental.py
    moved prints to use logger, edited readme (fused_attn doesn't seem stable for training) (2024-08-29 13:27:16 -05:00)
lora.py
nar.py
    layer skip training implemented (need to gut the inferencing from the repo, and to actually see if the model can benefit from this) (2024-10-30 20:05:45 -05:00)
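The __init__.py commit above adds loading weights from a model state dict directly, instead of reconstructing the model from a yaml config. A minimal sketch of what that flow generally looks like in PyTorch; the function name and path are hypothetical, not the repo's actual API:

```python
import torch
import torch.nn as nn

def load_from_state_dict(model: nn.Module, path: str) -> nn.Module:
    # A checkpoint saved with torch.save(model.state_dict(), path) is a
    # plain mapping of parameter names to tensors; load it straight in,
    # with no yaml/config round trip.
    state_dict = torch.load(path, map_location="cpu")
    model.load_state_dict(state_dict)
    model.eval()
    return model
```

Loading a bare state dict like this assumes the model architecture is already constructed with matching parameter shapes; the config-free path only skips how the weights are located, not how the module tree is built.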
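The ar_nar.py/base.py commit above describes treating each layer's early-exit loss as an aux_loss combined with the normal loss, the same way a MoE router's auxiliary loss is added, and the nar.py commit adds the layer-skip training it supports. A minimal sketch of that loss structure, with entirely hypothetical names (TinyEarlyExitLM and aux_weight are illustrative, not the repo's actual API):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEarlyExitLM(nn.Module):
    """Toy stack where every layer can exit through a shared head."""

    def __init__(self, vocab=32, dim=16, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layers))
        self.head = nn.Linear(dim, vocab)  # shared exit head at every depth

    def forward(self, tokens, targets, aux_weight=0.1):
        h = self.embed(tokens)
        exit_losses = []
        for layer in self.layers:
            h = torch.relu(layer(h))
            # Early-exit loss at this depth, collected as a candidate aux term.
            logits = self.head(h)
            exit_losses.append(
                F.cross_entropy(logits.flatten(0, 1), targets.flatten())
            )
        # The last layer's loss is the "normal" loss; every earlier exit is
        # folded in as a weighted aux_loss, like a MoE router's balance loss.
        loss = exit_losses.pop()
        return loss + aux_weight * sum(exit_losses)
```

The point of the commit is the combination step at the end: early-exit losses are scaled and added to the final-layer loss so one backward pass trains every exit, rather than being backpropagated as separate objectives.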