Commit Graph

145 Commits

Author SHA1 Message Date
Tim Dettmers
1b8772a8f3 Added PagedLion and bf16 Lion. 2023-05-23 19:37:38 -07:00
Tim Dettmers
2bce175d15 Fixed Makefile. 2023-05-23 18:42:19 -07:00
Tim Dettmers
4bd1151829 Fixed gradient accumulation test. 2023-05-07 15:06:17 -07:00
Tim Dettmers
675baa79d2 Merge remote-tracking branch 'origin/main' into merge 2023-05-07 13:34:03 -07:00
Tim Dettmers
f64cfe65aa Fixed prefetch bug for non-paged tensors; added benchmark. 2023-05-06 21:49:16 -07:00
Tim Dettmers
44d68ff29c Added paged optimizers. 2023-05-06 14:59:29 -07:00
Tim Dettmers
ec38ba95b0 Added paging. 2023-05-06 11:14:06 -07:00
Tim Dettmers
264a948539 4-bit draft; 128 vector load 240. 2023-05-02 16:15:38 -07:00
Tim Dettmers
869b7e83b5 Warp multi-specialization 240. 2023-05-02 12:10:32 -07:00
Tim Dettmers
77f15fdce9 Shared memory efficient 240. 2023-05-02 11:38:11 -07:00
Tim Dettmers
394749db71 Correct implementation 240. 2023-05-02 08:58:59 -07:00
Tim Dettmers
9aa232cc39 Initial. 2023-05-02 07:53:29 -07:00
Tim Dettmers
9192c9de64 Tighter and scaled error analysis. 2023-05-02 07:50:32 -07:00
Tim Dettmers
f9bfea8f23 Baseline for debugging. 2023-05-02 07:24:12 -07:00
Tim Dettmers
7cc8ff4727 Warp specialization 362. 2023-05-01 08:21:12 -07:00
Tim Dettmers
c35ed09b66 Double frag 440. 2023-04-30 18:19:30 -07:00
Tim Dettmers
ad07d254fb Slow tensor core solution. 2023-04-30 17:43:02 -07:00
Tim Dettmers
21723f796a 4-bit draft. 2023-04-29 21:52:47 -07:00
Tim Dettmers
cad839941b Added bit template. 2023-04-28 22:10:42 -07:00
Tim Dettmers
f3e97ccbd2 New implementation for batch size 1. 2023-04-28 21:29:40 -07:00
Tim Dettmers
f6df4aef6a Added fp16 and thread/item template. 2023-04-28 18:26:52 -07:00
Tim Dettmers
c1bfb210c5 First baseline kernel. 2023-04-28 17:19:02 -07:00
Tim Dettmers
9cab14a3ff Added pipeline draft. 2023-04-27 15:12:49 -07:00
Tim Dettmers
d1c4c20568 Added non-cutlass template. 2023-04-27 15:11:26 -07:00
Tim Dettmers
0afc8e9e2f Best attempt at cutlass3. 2023-04-26 17:12:34 -07:00
Tim Dettmers
0f9d30207f Added nested quantization for blockwise quantization. 2023-04-19 11:48:47 -07:00
Tim Dettmers
7dc198feb7 Added 32-bit optimizer for bfloat16 gradients. 2023-04-17 18:01:49 -07:00
Tim Dettmers
9e7cdc9ea9 Added last SwitchBack refactors. All tests green. 2023-04-12 13:41:30 -07:00
Tim Dettmers
7140c01405 Merge branch 'main' into fp8_merge 2023-04-12 11:44:39 -07:00
Tim Dettmers
dd562c24f1 Refactored simulated fp8 modules into research.nn. 2023-04-12 11:24:44 -07:00
Tim Dettmers
ec1ea63711 Refactored triton into its own folder. Refactored fp8 matmuls. 2023-04-12 09:39:39 -07:00
Tim Dettmers
4cd63deff3 Fixed CUDA Conda PyTorch 2.0 issues. 2023-04-11 12:10:20 -07:00
Tim Dettmers
2eb3108356 Fixed bug where beta2 was not passed into Lion 32-bit. 2023-04-11 09:16:01 -07:00
Tim Dettmers
792af5c883 Fixed noisy tests for 8-bit Lion. 2023-04-11 08:42:41 -07:00
Tim Dettmers
ed6f3eb146 Merge pull request #159 from TimDettmers/serialize_8bit: Implement proper serialization of Linear8bitLt 2023-04-11 07:24:51 -07:00
Tim Dettmers
e9fa03b717 Some fixes for loading PEFT modules with Params4bit. 2023-04-07 09:59:21 -07:00
Tim Dettmers
1ccb7bdec6 Fixed ParamsIn4 init; fixed PyTorch 2.0 test failure. 2023-04-03 18:47:00 -07:00
Tim Dettmers
4ea489d3bf Refactor FP4 into 4Bit and integrate NF4 data type. 2023-04-03 11:00:12 -07:00
Tim Dettmers
64cc05920d First draft of NF4. 2023-04-02 16:10:35 -07:00
Tim Dettmers
4ad999d144 Added quantization tree generation. 2023-04-02 14:42:45 -07:00
Tim Dettmers
0d332a641f Added normal with extra value. 2023-04-02 14:09:08 -07:00
Tim Dettmers
2dd5d69056 Generalized FP4 data type. 2023-04-02 12:42:01 -07:00
Tim Dettmers
51a21df728 Added 8-bit compression to quantization statistics. 2023-04-01 16:10:18 -07:00
Mitchell Wortsman
7f87ba83ee cleaning and refactor 2023-04-01 18:46:04 +00:00
Tim Dettmers
c4cfe4fbdd Added bf16 Adam. 2023-04-01 10:33:03 -07:00
Tim Dettmers
30d21d585c Added triton test. 2023-03-31 11:33:26 -07:00
Tim Dettmers
a13a522c4c Added first triton test. 2023-03-31 11:20:54 -07:00
Tim Dettmers
8645d1f71c Added normal quant. 2023-03-29 18:41:37 -07:00
Mitchell Wortsman
b373034e31 test 2023-03-29 19:04:53 +00:00
Mitchell Wortsman
5f3d9ada8d triton-v1 2023-03-29 06:47:08 +00:00