Mike Ranzinger | 29c6eadb83 | Masks are now optional, and not created. Fixes required to support FlashAttention (e.g. no mask, fp/bf16) | 2023-05-09 19:21:25 +00:00
Wenhui Wang | 599df73687 | b3 incremental decoding | 2023-03-09 12:02:36 +08:00
shumingma | 0cb9695501 | Remove inplace | 2023-01-18 22:44:26 -08:00
shumingma | 9f105b591d | Support Pytorch LayerNorm | 2023-01-16 20:17:28 -08:00
shumingma | 1a5d2c26fe | Batch size first | 2023-01-05 01:19:51 -08:00
shumingma | aa36203042 | Fix multiway checkpointing | 2022-12-27 22:32:02 -08:00
shumingma | 7eca1a531c | Code reformatting | 2022-11-26 09:01:02 -08:00
shumingma | 65fe50f466 | update copyright | 2022-11-23 08:36:55 -08:00
shumingma | ede048831f | torchscale released | 2022-11-23 08:21:58 -08:00