|
ebac1db16c
|
maybe final tweaks, I really needed to unify my json read/write and orjson is proven to be fast enough for me to try and rely on it more
|
2024-09-17 22:57:04 -05:00 |
|
|
c93d5863fd
|
fixes
|
2024-06-04 00:07:00 -05:00 |
|
|
aa1e25fbf5
|
backwards compat for old YAMLs with models , option to set flash attention 2 for Llama (and derivatives), included syncdoth/RetNet s torchscale retnet for shits and grins, etc.
|
2024-04-16 10:02:31 -05:00 |
|
|
d69a00e389
|
Properly pass retention_mask for retnet-HF, attempt to fix recurrent forward for retnet (doesn't work still)
|
2024-04-14 13:12:50 -05:00 |
|
|
9d97eb5104
|
added FP8 support through NVIDIA/TransformerEngine , added RetNet_HF through syncdoth/RetNet (as an alternative to branch away from torchscale)
|
2024-04-08 20:14:51 -05:00 |
|
|
2f9cd0842f
|
merged dedicated interleaved AR code with the normal AR code
|
2023-09-03 22:46:08 -05:00 |
|
|
c56ce033d9
|
work on an interleaved AR (spoiler: it does not work)
|
2023-09-03 21:27:58 -05:00 |
|