backwards compat for old YAMLs with models, option to set flash attention 2 for Llama (and derivatives), included syncdoth/RetNet's torchscale retnet for shits and grins, etc.
2024-04-16 10:02:31 -05:00 |
Properly pass retention_mask for retnet-HF, attempt to fix recurrent forward for retnet (still doesn't work)
2024-04-14 13:12:50 -05:00 |
added FP8 support through NVIDIA/TransformerEngine, added RetNet_HF through syncdoth/RetNet (as an alternative to branch away from torchscale)
2024-04-08 20:14:51 -05:00 |