| Author | Commit | Message | Date |
|---|---|---|---|
| Tim Dettmers | de354f7ded | Added fused bias to matmullt. | 2022-08-16 12:00:54 -07:00 |
| Tim Dettmers | dede343033 | Added fused bias in dequant_mm. | 2022-08-16 11:12:09 -07:00 |
| Tim Dettmers | 111b876449 | Merge branch 'cuda-bin-switch-and-cli' of github.com:TimDettmers/bitsandbytes into cuda-bin-switch-and-cli | 2022-08-16 10:57:10 -07:00 |
| Tim Dettmers | 1ed2fa2f21 | Removed storage() from get_ptr; added boilerplate for bias dequant_mm. | 2022-08-16 10:56:17 -07:00 |
| Tim Dettmers | 1ced47c504 | Added CUDA version warning and fixed cuda_install for 9.2/10.2. | 2022-08-09 20:02:47 -07:00 |
| Tim Dettmers | f9cbe2fe99 | Fixed prod Python < 3.7 compatibility in function.py. | 2022-08-08 09:13:22 -07:00 |
| Tim Dettmers | 62441815bc | Removed prod for Python <= 3.7 compatibility. | 2022-08-08 05:20:36 -07:00 |
| Tim Dettmers | 26efb154c8 | Fixed bug where python -m bitsandbytes was failing. | 2022-08-07 09:49:36 -07:00 |
| Tim Dettmers | a4532c59f7 | Removed faulty asserts. | 2022-08-06 09:31:05 -07:00 |
| Tim Dettmers | c472bd56f0 | Added the case that all env variables are empty (CUDA docker). | 2022-08-05 08:57:52 -07:00 |
| Tim Dettmers | 6ad8796cfc | Bumping version for TestPyPi release. | 2022-08-05 07:17:36 -07:00 |
| Tim Dettmers | e35337f05e | Now determining cuda version via libcudart.so call. | 2022-08-05 07:13:24 -07:00 |
| Tim Dettmers | 8f84674d67 | Fixed bugs in cuda setup. | 2022-08-04 09:16:00 -07:00 |
| Tim Dettmers | 758c7175a2 | Merge branch 'debug' into cuda-bin-switch-and-cli | 2022-08-04 08:03:00 -07:00 |
| Tim Dettmers | ab72a1294f | Added pre/post device call for extract outliers. | 2022-08-04 07:47:22 -07:00 |
| Tim Dettmers | cc5b323876 | Merge branch 'extract_outliers' into debug | 2022-08-04 07:40:48 -07:00 |
| Tim Dettmers | 6101a8fb9f | Added pre and post device call to transform. | 2022-08-04 07:28:12 -07:00 |
| Tim Dettmers | 320eacb4c2 | Removed print statement. | 2022-08-03 14:17:54 -07:00 |
| Tim Dettmers | 451fd9506e | Added fixes for the case that matmullt dim A is zero, e.g. [0, 768]. | 2022-08-03 11:54:01 -07:00 |
| Tim Dettmers | 2f01865a2f | Added CUDA block assert and is_on_gpu check. | 2022-08-03 09:05:37 -07:00 |
| Titus von Koeller | 96bc209baf | tentative refactoring of the compute capabilities code | 2022-08-02 21:27:36 -07:00 |
| Titus von Koeller | 59a615b386 | factored cuda_setup.main out into smaller modules and functions | 2022-08-02 21:26:50 -07:00 |
| Titus von Koeller | 3809236428 | move cuda_setup code into subpackage | 2022-08-02 07:42:27 -07:00 |
| Tim Dettmers | e120c4a550 | Fixed syntax error; bumped revision for beta release. | 2022-08-01 20:05:03 -07:00 |
| Tim Dettmers | 3479d02a76 | Added some more docs and comments. | 2022-08-01 19:43:09 -07:00 |
| Tim Dettmers | 8bf3e9faab | Added full env variable search; CONDA_PREFIX priority. | 2022-08-01 19:22:41 -07:00 |
| Titus von Koeller | c4fe6c69a3 | deleted function that was moved but accidentally not removed in commit | 2022-08-01 09:40:41 -07:00 |
| Titus von Koeller | ea7c14f8ef | reran black with linelength 80 for greater readability | 2022-08-01 09:32:47 -07:00 |
| Titus von Koeller | 3fd06fb620 | refactored subshell execution code for greater readability and moved it to utils | 2022-08-01 09:30:29 -07:00 |
| Titus von Koeller | 54efd874a8 | flake8 found some stuff that needs fixing before the release | 2022-08-01 03:32:34 -07:00 |
| Titus von Koeller | bfa0e33294 | ran black and isort for coherent code formatting | 2022-08-01 03:31:48 -07:00 |
| Titus von Koeller | 597a8521b2 | fix typo | 2022-08-01 03:22:44 -07:00 |
| Titus von Koeller | 57fa64628f | minor refactor to more concise syntax | 2022-08-01 03:22:12 -07:00 |
| Tim Dettmers | 4a6ea7e24b | Added adjusted build file. | 2022-07-31 20:59:34 -07:00 |
| Tim Dettmers | 28d1e7dc01 | Initial build script changes (untested on PyPi). | 2022-07-31 19:41:56 -07:00 |
| Tim Dettmers | dd50382b32 | Full evaluate_cuda setup with integration test. | 2022-07-31 17:47:44 -07:00 |
| Titus von Koeller | 5d90b38c4d | adding CLI tool for CUDA install debugging - intermediate commit | 2022-07-27 21:16:04 -07:00 |
| Tim Dettmers | bd515328d7 | Fixed deployment script to check for LD_LIBRARY_PATH. | 2022-07-27 05:57:50 -07:00 |
| Tim Dettmers | 389f66ca5a | Fixed direct extraction masking. | 2022-07-27 01:46:35 -07:00 |
| Tim Dettmers | a409213656 | Fixed make default to compile with cublaslt. | 2022-07-26 19:38:17 -07:00 |
| Tim Dettmers | 5737f2b027 | Merge branch 'patch_merge' into extract_outliers | 2022-07-26 19:38:01 -07:00 |
| Tim Dettmers | 47a73d94c3 | Matmullt with direct outlier extraction for 8-bit inference. | 2022-07-26 19:15:35 -07:00 |
| Tim Dettmers | 32fa459ed7 | Added col_ampere outlier extraction kernel. | 2022-07-26 18:15:51 -07:00 |
| Tim Dettmers | bcab99ec87 | Working outlier extraction for Turing. | 2022-07-26 17:39:30 -07:00 |
| Tim Dettmers | cbb901ac51 | Boilerplate and test for extract_outliers. | 2022-07-26 12:12:38 -07:00 |
| Tim Dettmers | dc8c9efdb3 | Changed setup.py; deployed on test pypi. | 2022-07-26 10:32:22 -07:00 |
| Tim Dettmers | 953b7285dd | Fixed cpuonly build. | 2022-07-26 09:12:16 -07:00 |
| Tim Dettmers | f2dd703251 | Added matmul build and flags. | 2022-07-25 22:34:14 -07:00 |
| Tim Dettmers | 9268dc9d88 | Some progress on build script; added multi-cuda install script. | 2022-07-25 19:30:37 -07:00 |
| Tim Dettmers | 1e88edd8c0 | Removed rowscale (segfaults on ampere). | 2022-07-25 17:27:57 -07:00 |