75 Commits

Author SHA1 Message Date
Tim Dettmers
de53588934 Added Int8 matmul support for all GPUs. Full backward support. 2023-02-01 20:09:31 -08:00
Tim Dettmers
336e24696c CUDASetup only executed once + fixed circular import. 2023-01-02 03:31:43 -08:00
Tim Dettmers
c91f592ad7 Merge branch 'main' into cleanup 2023-01-02 11:19:16 +01:00
Tim Dettmers
eb028e6ebc Fixed k-bit quantization maps. 2022-11-19 07:24:03 -08:00
Tom Aarsen
b104ce3b62 Merge branch 'main' into cleanup 2022-11-17 15:22:29 +01:00
Tim Dettmers
08fa2e7b01 Fixed bug in cpu quant; faster GPU dequant. 2022-11-07 18:06:18 -08:00
Tim Dettmers
e0e697b150 Fixed blockwise test and logic. 2022-11-06 16:36:31 -08:00
Tim Dettmers
6bc2b992be Added blocksizes 2048, 1024, and 512 to blockwise quant. 2022-11-06 16:27:48 -08:00
Tim Dettmers
2f2063bac2 Added k<256 quantile estimate. 2022-11-06 13:05:25 -08:00
Tim Dettmers
98cbc4bc4f Added k-bit fp8 map. 2022-11-06 11:59:37 -08:00
Tim Dettmers
caf1832526 Added k-bit linear quantization. 2022-11-06 11:47:54 -08:00
Tim Dettmers
1efb87d89d Added FP8 quantization map. 2022-11-03 19:49:50 -07:00
Tom Aarsen
7a3c9af05d Sort imports (via isort) 2022-10-27 13:15:21 +02:00
Tom Aarsen
0b078403ee Simplify statements into equivalent, modern variants via pyupgrade --py37-plus. The changes e.g. are subclassing from object, calling super() with super(ThisClass, self), or old-style syntax formatting. 2022-10-27 13:14:13 +02:00
Tom Aarsen
1eec77d34c Remove trailing whitespace & ensure newline at EOF 2022-10-27 13:11:29 +02:00
Tim Dettmers
a371be302d Added CUDA SETUP instruction generator. 2022-10-25 08:01:19 -07:00
Tim Dettmers
df86625a93 Isolated CUDASetup logging; all tests green. 2022-10-24 11:54:25 -07:00
justheuristic
76ce9aa6da try fp32 2022-09-20 06:51:25 +03:00
Tim Dettmers
292a478716 set threshold 2022-09-20 06:42:05 +03:00
justheuristic
a07825ac31 review 2022-09-20 06:40:36 +03:00
justheuristic
cff3a71599 cast device 2022-09-18 01:26:25 +03:00
justheuristic
32a9a88f98 cast device 2022-09-18 01:26:12 +03:00
justheuristic
01b4c6a048 cast device 2022-09-18 01:25:56 +03:00
justheuristic
e4086a2758 cast device 2022-09-18 01:24:57 +03:00
justheuristic
725cc72993 cast device 2022-09-18 01:24:44 +03:00
justheuristic
28a9313ddc cast before allclose 2022-09-18 01:24:27 +03:00
justheuristic
95dafc6475 cast before allclose 2022-09-18 01:22:31 +03:00
justheuristic
37f805bb44 debug 2022-09-18 01:21:12 +03:00
justheuristic
6a826c41a6 pre-cast 2022-09-18 01:20:34 +03:00
justheuristic
d9b8789818 debug 2022-09-18 01:13:58 +03:00
justheuristic
2cd047e35d run backward 2022-09-18 00:55:53 +03:00
justheuristic
591f60395a add memory efficient backward 2022-09-18 00:52:53 +03:00
justheuristic
f6670329fb bump threshold to 0.21 2022-09-18 00:42:23 +03:00
justheuristic
fa8e07c7c5 more lenient threshold 2022-09-18 00:38:02 +03:00
justheuristic
e35e2c665a cast properly 2022-09-18 00:35:03 +03:00
justheuristic
d9ca0ed905 un-fuse bias 2022-09-17 23:44:28 +03:00
justheuristic
7facedda38 copypaste tolerances 2022-09-17 23:41:40 +03:00
justheuristic
e29c5f5c41 clearer assertions 2022-09-17 23:22:04 +03:00
justheuristic
9379df85d2 check dtypes first 2022-09-17 23:13:23 +03:00
justheuristic
140cdbe876 check dtypes first 2022-09-17 23:12:58 +03:00
justheuristic
a9c7953e0a cast to half before double_quant 2022-09-17 23:10:21 +03:00
justheuristic
469d5a631d test_bf16 2022-09-17 23:06:57 +03:00
Tim Dettmers
c05dd42ddd Fixed cpu blockwise quantization for small input tensors. 2022-09-13 10:37:53 -07:00
Tim Dettmers
19a7adca7a Fixed 2^31 max size issue for cpu blockwise quant. 2022-09-11 11:55:09 -07:00
Tim Dettmers
7e0fb655e1 Some initial code. Needs to be tested. 2022-08-23 13:59:34 -07:00
Tim Dettmers
9d60b3c527 Fixed bug in Linear8bitLt, when the bias is None. 2022-08-17 03:45:57 -07:00
Tim Dettmers
de354f7ded Added fused bias to matmullt. 2022-08-16 12:00:54 -07:00
Tim Dettmers
dede343033 Added fused bias in dequant_mm. 2022-08-16 11:12:09 -07:00
Tim Dettmers
1ed2fa2f21 Removed storage() from get_ptr; added boilerplate for bias dequant_mm. 2022-08-16 10:56:17 -07:00
Tim Dettmers
c472bd56f0 Added the case that all env variables are empty (CUDA docker). 2022-08-05 08:57:52 -07:00