266 Commits

Author SHA1 Message Date
Tim Dettmers
1341fb44ad Fixed issue where the CUDA SETUP was not printed. 2023-01-04 03:50:53 -08:00
Tim Dettmers
b3de19218e Added error message for unexpected CUDA exception. 2023-01-03 06:57:07 -08:00
Tim Dettmers
81990491ff Merge pull request #113 from Borzik/fix-warnings
Import missing warn function
2023-01-03 15:46:58 +01:00
Tim Dettmers
9180b4cc11 Added additional error message for cudart error #85 2023-01-03 06:44:11 -08:00
Tim Dettmers
211ad594df Added error+instructions for unsupported CUDA 10.0 version #82 2023-01-03 06:07:35 -08:00
Felix Borzik
f3800bab75 import warn function 2023-01-03 13:23:34 +00:00
Tim Dettmers
9d353ca786 Merge pull request #87 from lostmsu/main
Add `device` and `dtype` parameters to `StableEmbedding`
2023-01-02 13:22:45 +01:00
Tim Dettmers
7a6563b6c8 Default to CPU library on CUDA error+small refactor. 2023-01-02 03:47:09 -08:00
Tim Dettmers
d9112dc55b Merge pull request #110 from BlackHC/cublaslt_version
Improve cc version detection for cublaslt
2023-01-02 12:35:53 +01:00
Tim Dettmers
336e24696c CUDASetup only executed once + fixed circular import. 2023-01-02 03:31:43 -08:00
Tim Dettmers
be5cecb88f Merge branch 'main' into main 2023-01-02 11:23:17 +01:00
Tim Dettmers
c91f592ad7 Merge branch 'main' into cleanup 2023-01-02 11:19:16 +01:00
blackhc
ed17aa9a31 Don't mark it as failure though. 2022-12-29 23:50:48 +00:00
blackhc
7b39a5511d Fix issue #97 2022-12-29 23:47:21 +00:00
Tim Dettmers
c059bd2848 Added additional blocksizes: {64, 128, 256}. 2022-11-20 14:18:15 -08:00
Tim Dettmers
eb028e6ebc Fixed k-bit quantization maps. 2022-11-19 07:24:03 -08:00
Tom Aarsen
b104ce3b62 Merge branch 'main' into cleanup 2022-11-17 15:22:29 +01:00
Tim Dettmers
08fa2e7b01 Fixed bug in cpu quant; faster GPU dequant. 2022-11-07 18:06:18 -08:00
Tim Dettmers
62a333ac40 Added pre/post calls do quantize_blockwise. 2022-11-06 17:17:51 -08:00
Tim Dettmers
e0e697b150 Fixed blockwise test and logic. 2022-11-06 16:36:31 -08:00
Tim Dettmers
6bc2b992be Added blocksizes 2048, 1024, and 512 to blockwise quant. 2022-11-06 16:27:48 -08:00
Tim Dettmers
2f2063bac2 Added k<256 quantile estimate. 2022-11-06 13:05:25 -08:00
Tim Dettmers
98cbc4bc4f Added k-bit fp8 map. 2022-11-06 11:59:37 -08:00
Tim Dettmers
caf1832526 Added k-bit linear quantization. 2022-11-06 11:47:54 -08:00
Victor Nova
62d39a237c add device and dtype parameters to StableEmbedding 2022-11-04 14:12:46 -07:00
Tim Dettmers
1efb87d89d Added FP8 quantization map. 2022-11-03 19:49:50 -07:00
Tom Aarsen
62c0bd2278 Fix several typos in logging and comments
Via codespell
2022-11-01 09:53:47 +01:00
Tom Aarsen
d504050ff7 Call isort over cuda_setup/main.py 2022-11-01 09:46:03 +01:00
Tom Aarsen
30f28b94a0 Merge branch 'main' into cleanup 2022-11-01 09:43:49 +01:00
Tim Dettmers
8d87c0b852 Fixed CUDA setup bugs, including #81. 2022-10-31 18:04:49 -07:00
adpkadspokasdk
8724c990c7 allow hiding of the welcome message 2022-10-27 16:04:49 -06:00
Tim Dettmers
4844aef4ff Fixing bad error when GPU was not detected for #73. 2022-10-27 08:54:30 -07:00
Tom Aarsen
c6dad28a08 Remove extraneous get_ptr calls 2022-10-27 13:53:16 +02:00
Tom Aarsen
7727fa4c8c Remove f-prefix from strings that don't use formatting 2022-10-27 13:36:39 +02:00
Tom Aarsen
54bd6ed1d6 Remove unused imports 2022-10-27 13:32:01 +02:00
Tom Aarsen
ef70f2adcd Fix bad indentation 2022-10-27 13:27:18 +02:00
Tom Aarsen
697bd02c60 Resolve dangerous default value [] as argument 2022-10-27 13:25:51 +02:00
Tom Aarsen
b5cf706341 Removing unnecessary else's 2022-10-27 13:25:07 +02:00
Tom Aarsen
4a05df34c2 Fix critical bug in PytorchLARS().step: Undefined variable 2022-10-27 13:19:09 +02:00
Tom Aarsen
f6978ae2a2 Fix critical bug in histogram_scatter_add_2d: Undefined variable 2022-10-27 13:16:53 +02:00
Tom Aarsen
7a3c9af05d Sort imports
Via isort
2022-10-27 13:15:21 +02:00
Tom Aarsen
0b078403ee Simplify statements into equivalent, modern variants
via pyupgrade --py37-plus. The changes e.g. are subclassing from object, calling super() with super(ThisClass, self), or old-style syntax formatting.
2022-10-27 13:14:13 +02:00
Tom Aarsen
1eec77d34c Remove trailing whitespace & ensure newline at EOF 2022-10-27 13:11:29 +02:00
Tom Aarsen
31f6689504 Remove references to unused cli 2022-10-27 13:10:32 +02:00
Tom Aarsen
4faf6cb7e9 Replace seemingly incorrect use of CUDA_RUNTIME_LIB 2022-10-26 09:43:57 +02:00
Tom Aarsen
c584482f1f Resolve cases of CUDASetup.get_instance not being called when used 2022-10-26 09:37:16 +02:00
Tim Dettmers
a371be302d Added CUDA SETUP instruction generator. 2022-10-25 08:01:19 -07:00
Tim Dettmers
df86625a93 Isolated CUDASetup logging; all tests green. 2022-10-24 11:54:25 -07:00
Tim Dettmers
ed2e3b9db4 Merge pull request #36 from tomaarsen/hotfix/os_error_name_too_long
Fixes `OSError: File name too long` when environment variable is too long
2022-10-09 16:47:11 -07:00
Tim Dettmers
76699b4a8d Merge pull request #37 from tomaarsen/hotfix/colab_just_cpu
Perform check using implicit list length
2022-10-09 16:43:58 -07:00
Tim Dettmers
9b7d307b8c review 2022-09-20 06:36:32 +03:00
justheuristic
5d65817101 debug 2022-09-18 01:09:24 +03:00
justheuristic
4da2227fcb debug 2022-09-18 01:03:21 +03:00
justheuristic
4b4a9effd1 debugprint 2022-09-18 01:02:13 +03:00
justheuristic
7906dc4c9a debugpritn 2022-09-18 00:57:26 +03:00
justheuristic
591f60395a add memory efficient backward 2022-09-18 00:52:53 +03:00
justheuristic
579b8c782f reduce diff 2022-09-18 00:47:58 +03:00
justheuristic
76ece2c126 rollback 2022-09-18 00:43:56 +03:00
justheuristic
18f142e268 addmm_ 2022-09-18 00:43:02 +03:00
justheuristic
ab9dee062d cast edge case 2022-09-18 00:36:46 +03:00
justheuristic
cbfdf0b5ef cast edge case 2022-09-18 00:35:42 +03:00
justheuristic
e35e2c665a cast properly 2022-09-18 00:35:03 +03:00
justheuristic
577275bd8c cast properly 2022-09-18 00:30:57 +03:00
justheuristic
45dc1983e9 cast properly 2022-09-18 00:28:03 +03:00
justheuristic
702cc72018 debug asset 2022-09-18 00:26:46 +03:00
justheuristic
a214824f93 matmul -1- addmm 2022-09-18 00:24:59 +03:00
justheuristic
14048a3c16 safer cast 2022-09-18 00:24:20 +03:00
justheuristic
5b169f18e4 change typecast behavior 2022-09-18 00:21:15 +03:00
justheuristic
1da4880262 change typecast behavior 2022-09-18 00:19:22 +03:00
justheuristic
1145589f84 change typecast behavior 2022-09-18 00:15:57 +03:00
justheuristic
d6e25b5f5e change typecast behavior 2022-09-18 00:15:18 +03:00
justheuristic
e2b523d071 change typecast behavior 2022-09-18 00:07:05 +03:00
justheuristic
85bf5294a6 debug assert 2022-09-18 00:01:25 +03:00
justheuristic
210b9ed9ce debug assert 2022-09-18 00:00:45 +03:00
justheuristic
647c976a74 change order 2022-09-17 23:59:36 +03:00
justheuristic
0de1a4494b change order 2022-09-17 23:53:49 +03:00
justheuristic
e9b87112ee un-fuse bias 2022-09-17 23:51:28 +03:00
justheuristic
56a074f6dc un-fuse bias 2022-09-17 23:46:37 +03:00
justheuristic
d9ca0ed905 un-fuse bias 2022-09-17 23:44:28 +03:00
justheuristic
eac9aca460 cast bias too 2022-09-17 23:38:09 +03:00
justheuristic
a9fe0ff98c recast to fp16 2022-09-17 23:34:22 +03:00
justheuristic
fc4a135ed1 clearer assertions 2022-09-17 23:24:26 +03:00
justheuristic
cc4858c2fd some kind of warning or something when this is first executed to make people aware that a cast happens and the operation quantization is performed in fp16. 2022-09-17 20:46:04 +03:00
justheuristic
3634fc738b Merge branch 'TimDettmers:main' into memory-efficient-backward 2022-09-17 18:42:22 +03:00
Tom Aarsen
58aa7c53f6 Perform check using implicit list length
Instead of 'ccs is not None', as ccs is always a list, so this if is currently always True
2022-09-15 12:44:09 +02:00
Tom Aarsen
7d771d1d6d Catch OSError with specific error number 2022-09-15 11:13:12 +02:00
Tom Aarsen
a58cc5a13c Prevent OSError when env variables contain long value 2022-09-15 10:57:49 +02:00
Tim Dettmers
8ccc0b0ee1 Merge branch 'main' of github.com:TimDettmers/bitsandbytes into main 2022-09-11 11:58:40 -07:00
Tim Dettmers
19a7adca7a Fixed 2^31 max size issue for cpu blockwise quant. 2022-09-11 11:55:09 -07:00
dbaranchuk
e2a75769f2 bug fix 2022-09-11 21:41:46 +03:00
dbaranchuk
4dd475ced4 refactoring 2022-09-11 06:28:17 +03:00
dbaranchuk
d358999e9e refactoring 2022-09-11 06:26:15 +03:00
dbaranchuk
ee325f0215 clarified an exception message 2022-09-11 06:18:44 +03:00
dbaranchuk
42b5fc9acc add memory effcient backward option 2022-09-11 05:51:29 +03:00
Dmitry Baranchuk
843ad0631c Merge pull request #1 from TimDettmers/main
Update main branch
2022-09-10 19:33:21 -07:00
Tim Dettmers
2e630b55f5 Version bump + bnb.utils import fix. 2022-09-08 13:16:16 -07:00
Tim Dettmers
aca55881b9 Merge branch 'main' into remove_unused_code 2022-09-05 16:29:25 -07:00
dbaranchuk
8d34d36f15 req_gradA for casted & more efficient and accurate fp16 backward 2022-08-29 00:56:08 +03:00
dbaranchuk
b3fee1ed6a add dtype <-> fp16 cast 2022-08-26 04:11:40 +03:00
dbaranchuk
4d6174bc63 memory efficient fp16 backward 2022-08-25 19:09:23 +03:00
Max Ryabinin
92a3363096 Replace print_stderr with warnings.warn 2022-08-24 18:45:17 +03:00
Max Ryabinin
9fc0ab415c Remove unused code 2022-08-24 18:43:18 +03:00
Tim Dettmers
ee5b947e63 Fixed issue where Pascal was not displaying proper error. 2022-08-23 16:00:26 -07:00
dbaranchuk
ef2936a90d delete CxB from state 2022-08-24 01:33:04 +03:00
dbaranchuk
876387dc0c minor fixes 2022-08-24 01:12:48 +03:00
Tim Dettmers
7e0fb655e1 Some initial code. Needs to be tested. 2022-08-23 13:59:34 -07:00
dbaranchuk
656de8ed11 minor fixes 2022-08-23 23:53:43 +03:00
dbaranchuk
1753aa0418 refactoring 2022-08-23 23:51:00 +03:00
dbaranchuk
8ae9bb23ad add memory efficient backward 2022-08-23 23:39:54 +03:00
Tim Dettmers
9d60b3c527 Fixed bug in Linear8bitLt, when the bias is None. 2022-08-17 03:45:57 -07:00
Tim Dettmers
a6664de072 Enhanced error handling in CUDA SETUP failures. 2022-08-16 19:03:19 -07:00
Tim Dettmers
de354f7ded Added fused bias to matmullt. 2022-08-16 12:00:54 -07:00
Tim Dettmers
111b876449 Merge branch 'cuda-bin-switch-and-cli' of github.com:TimDettmers/bitsandbytes into cuda-bin-switch-and-cli 2022-08-16 10:57:10 -07:00
Tim Dettmers
1ed2fa2f21 Removed storage() from get_ptr; added boilerplate for bias dequant_mm. 2022-08-16 10:56:17 -07:00
Tim Dettmers
1ced47c504 Added CUDA version warning and fixed cuda_install for 9.2/10.2. 2022-08-09 20:02:47 -07:00
Tim Dettmers
f9cbe2fe99 Fixed prod Python < 3.7 compatibility in function.py. 2022-08-08 09:13:22 -07:00
Tim Dettmers
62441815bc Removed prod for Python <= 3.7 compatibility. 2022-08-08 05:20:36 -07:00
Tim Dettmers
26efb154c8 Fixed bug where python -m bitsandbytes was failing. 2022-08-07 09:49:36 -07:00
Tim Dettmers
c472bd56f0 Added the case that all env variables are empty (CUDA docker). 2022-08-05 08:57:52 -07:00
Tim Dettmers
e35337f05e Now determining cuda version via libcudart.so call. 2022-08-05 07:13:24 -07:00
Tim Dettmers
8f84674d67 Fixed bugs in cuda setup. 2022-08-04 09:16:00 -07:00
Tim Dettmers
758c7175a2 Merge branch 'debug' into cuda-bin-switch-and-cli 2022-08-04 08:03:00 -07:00
Tim Dettmers
ab72a1294f Added pre/post device call for extract outliers. 2022-08-04 07:47:22 -07:00
Tim Dettmers
cc5b323876 Merge branch 'extract_outliers' into debug 2022-08-04 07:40:48 -07:00
Tim Dettmers
6101a8fb9f Added pre and post device call to transform. 2022-08-04 07:28:12 -07:00
Tim Dettmers
320eacb4c2 Removed print statement. 2022-08-03 14:17:54 -07:00
Tim Dettmers
451fd9506e Added fixes for the case that matmullt dim A is zero, e.g. [0, 768]. 2022-08-03 11:54:01 -07:00
Tim Dettmers
2f01865a2f Added CUDA block assert and is_on_gpu check. 2022-08-03 09:05:37 -07:00
Titus von Koeller
96bc209baf tentative refactoring of the compute capabilities code 2022-08-02 21:27:36 -07:00
Titus von Koeller
59a615b386 factored cuda_setup.main out into smaller modules and functions 2022-08-02 21:26:50 -07:00
Titus von Koeller
3809236428 move cuda_setup code into subpackage 2022-08-02 07:42:27 -07:00
Tim Dettmers
e120c4a550 Fixed syntax error; bumped revision for beta release. 2022-08-01 20:05:03 -07:00
Tim Dettmers
3479d02a76 Added some more docs and comments. 2022-08-01 19:43:09 -07:00
Tim Dettmers
8bf3e9faab Added full env variable search; CONDA_PREFIX priority. 2022-08-01 19:22:41 -07:00
Titus von Koeller
c4fe6c69a3 deleted function that was moved but accidentally not removed in commit 2022-08-01 09:40:41 -07:00
Titus von Koeller
ea7c14f8ef reran black with linelength 80 for greater readability 2022-08-01 09:32:47 -07:00
Titus von Koeller
3fd06fb620 refactored subshell execution code for greater readability and moved it to utils 2022-08-01 09:30:29 -07:00
Titus von Koeller
bfa0e33294 ran black and isort for coherent code formatting 2022-08-01 03:31:48 -07:00
Titus von Koeller
597a8521b2 fix typo 2022-08-01 03:22:44 -07:00
Titus von Koeller
57fa64628f minor refactor to more concise syntax 2022-08-01 03:22:12 -07:00
Tim Dettmers
28d1e7dc01 Initial build script changes (untested on PyPi). 2022-07-31 19:41:56 -07:00
Tim Dettmers
dd50382b32 Full evaluate_cuda setup with integration test. 2022-07-31 17:47:44 -07:00
Titus von Koeller
5d90b38c4d adding CLI tool for CUDA install debugging - intermediate commit 2022-07-27 21:16:04 -07:00
Tim Dettmers
389f66ca5a Fixed direct extraction masking. 2022-07-27 01:46:35 -07:00
Tim Dettmers
5737f2b027 Merge branch 'patch_merge' into extract_outliers 2022-07-26 19:38:01 -07:00
Tim Dettmers
47a73d94c3 Matmullt with direct outlier extraction for 8-bit inference. 2022-07-26 19:15:35 -07:00
Tim Dettmers
cbb901ac51 Boilerplate and test for extract_outliers. 2022-07-26 12:12:38 -07:00
Tim Dettmers
1e88edd8c0 Removed rowscale (segfaults on ampere). 2022-07-25 17:27:57 -07:00
Tim Dettmers
c771b3a75a Most tests passing. 2022-07-22 14:41:05 -07:00
Max Ryabinin
e4cf33f2a3 Fix imports 2022-07-01 17:25:44 +03:00