| 99be487482 | backported old fork features (kv_cache (which looking back seems like a spook), ddim sampling, etc) | 2024-06-19 14:49:24 -05:00 |
| 268ba17485 | crammed in HF attention selection mechanisms for the AR | 2024-06-19 10:21:43 -05:00 |
| 7aae9d48ab | training + LoRA training works? (keeps OOMing after a step) | 2024-06-18 13:28:50 -05:00 |
| 37ec9f1b79 | initial "refractoring" | 2024-06-17 22:48:34 -05:00 |