Commit Graph

86 Commits

Author SHA1 Message Date
justheuristic
85bf5294a6 debug assert 2022-09-18 00:01:25 +03:00
justheuristic
210b9ed9ce debug assert 2022-09-18 00:00:45 +03:00
justheuristic
647c976a74 change order 2022-09-17 23:59:36 +03:00
justheuristic
0de1a4494b change order 2022-09-17 23:53:49 +03:00
justheuristic
e9b87112ee un-fuse bias 2022-09-17 23:51:28 +03:00
justheuristic
56a074f6dc un-fuse bias 2022-09-17 23:46:37 +03:00
justheuristic
d9ca0ed905 un-fuse bias 2022-09-17 23:44:28 +03:00
justheuristic
eac9aca460 cast bias too 2022-09-17 23:38:09 +03:00
justheuristic
a9fe0ff98c recast to fp16 2022-09-17 23:34:22 +03:00
justheuristic
fc4a135ed1 clearer assertions 2022-09-17 23:24:26 +03:00
justheuristic
cc4858c2fd some kind of warning or something when this is first executed to make people aware that a cast happens and the operation quantization is performed in fp16. 2022-09-17 20:46:04 +03:00
dbaranchuk
e2a75769f2 bug fix 2022-09-11 21:41:46 +03:00
dbaranchuk
4dd475ced4 refactoring 2022-09-11 06:28:17 +03:00
dbaranchuk
d358999e9e refactoring 2022-09-11 06:26:15 +03:00
dbaranchuk
ee325f0215 clarified an exception message 2022-09-11 06:18:44 +03:00
dbaranchuk
42b5fc9acc add memory effcient backward option 2022-09-11 05:51:29 +03:00
Dmitry Baranchuk
843ad0631c Merge pull request #1 from TimDettmers/main: Update main branch 2022-09-10 19:33:21 -07:00
dbaranchuk
8d34d36f15 req_gradA for casted & more efficient and accurate fp16 backward 2022-08-29 00:56:08 +03:00
dbaranchuk
b3fee1ed6a add dtype <-> fp16 cast 2022-08-26 04:11:40 +03:00
dbaranchuk
4d6174bc63 memory efficient fp16 backward 2022-08-25 19:09:23 +03:00
Max Ryabinin
9fc0ab415c Remove unused code 2022-08-24 18:43:18 +03:00
dbaranchuk
876387dc0c minor fixes 2022-08-24 01:12:48 +03:00
dbaranchuk
656de8ed11 minor fixes 2022-08-23 23:53:43 +03:00
dbaranchuk
1753aa0418 refactoring 2022-08-23 23:51:00 +03:00
dbaranchuk
8ae9bb23ad add memory efficient backward 2022-08-23 23:39:54 +03:00
Tim Dettmers
de354f7ded Added fused bias to matmullt. 2022-08-16 12:00:54 -07:00
Tim Dettmers
f9cbe2fe99 Fixed prod Python < 3.7 compatibility in function.py. 2022-08-08 09:13:22 -07:00
Tim Dettmers
62441815bc Removed prod for Python <= 3.7 compatibility. 2022-08-08 05:20:36 -07:00
Tim Dettmers
758c7175a2 Merge branch 'debug' into cuda-bin-switch-and-cli 2022-08-04 08:03:00 -07:00
Tim Dettmers
cc5b323876 Merge branch 'extract_outliers' into debug 2022-08-04 07:40:48 -07:00
Tim Dettmers
451fd9506e Added fixes for the case that matmullt dim A is zero, e.g. [0, 768]. 2022-08-03 11:54:01 -07:00
Titus von Koeller
ea7c14f8ef reran black with linelength 80 for greater readability 2022-08-01 09:32:47 -07:00
Titus von Koeller
bfa0e33294 ran black and isort for coherent code formatting 2022-08-01 03:31:48 -07:00
Tim Dettmers
389f66ca5a Fixed direct extraction masking. 2022-07-27 01:46:35 -07:00
Tim Dettmers
47a73d94c3 Matmullt with direct outlier extraction for 8-bit inference. 2022-07-26 19:15:35 -07:00
Tim Dettmers
c771b3a75a Most tests passing. 2022-07-22 14:41:05 -07:00