Commit Graph

20 Commits

Author SHA1 Message Date
Dmitry Baranchuk
843ad0631c Merge pull request #1 from TimDettmers/main (Update main branch) 2022-09-10 19:33:21 -07:00
dbaranchuk
8d34d36f15 req_gradA for casted & more efficient and accurate fp16 backward 2022-08-29 00:56:08 +03:00
dbaranchuk
b3fee1ed6a add dtype <-> fp16 cast 2022-08-26 04:11:40 +03:00
dbaranchuk
4d6174bc63 memory efficient fp16 backward 2022-08-25 19:09:23 +03:00
Max Ryabinin
9fc0ab415c Remove unused code 2022-08-24 18:43:18 +03:00
dbaranchuk
876387dc0c minor fixes 2022-08-24 01:12:48 +03:00
dbaranchuk
656de8ed11 minor fixes 2022-08-23 23:53:43 +03:00
dbaranchuk
1753aa0418 refactoring 2022-08-23 23:51:00 +03:00
dbaranchuk
8ae9bb23ad add memory efficient backward 2022-08-23 23:39:54 +03:00
Tim Dettmers
de354f7ded Added fused bias to matmullt. 2022-08-16 12:00:54 -07:00
Tim Dettmers
f9cbe2fe99 Fixed prod Python < 3.7 compatibility in function.py. 2022-08-08 09:13:22 -07:00
Tim Dettmers
62441815bc Removed prod for Python <= 3.7 compatibility. 2022-08-08 05:20:36 -07:00
Tim Dettmers
758c7175a2 Merge branch 'debug' into cuda-bin-switch-and-cli 2022-08-04 08:03:00 -07:00
Tim Dettmers
cc5b323876 Merge branch 'extract_outliers' into debug 2022-08-04 07:40:48 -07:00
Tim Dettmers
451fd9506e Added fixes for the case that matmullt dim A is zero, e.g. [0, 768]. 2022-08-03 11:54:01 -07:00
Titus von Koeller
ea7c14f8ef reran black with linelength 80 for greater readability 2022-08-01 09:32:47 -07:00
Titus von Koeller
bfa0e33294 ran black and isort for coherent code formatting 2022-08-01 03:31:48 -07:00
Tim Dettmers
389f66ca5a Fixed direct extraction masking. 2022-07-27 01:46:35 -07:00
Tim Dettmers
47a73d94c3 Matmullt with direct outlier extraction for 8-bit inference. 2022-07-26 19:15:35 -07:00
Tim Dettmers
c771b3a75a Most tests passing. 2022-07-22 14:41:05 -07:00