Commit Graph

85 Commits

Author SHA1 Message Date
Tim Dettmers
a4532c59f7 Removed faulty asserts. 2022-08-06 09:31:05 -07:00
Tim Dettmers
c472bd56f0 Added the case that all env variables are empty (CUDA docker). 2022-08-05 08:57:52 -07:00
Tim Dettmers
6ad8796cfc Bumping version for TestPyPi release. 2022-08-05 07:17:36 -07:00
Tim Dettmers
e35337f05e Now determining cuda version via libcudart.so call. 2022-08-05 07:13:24 -07:00
Tim Dettmers
8f84674d67 Fixed bugs in cuda setup. 2022-08-04 09:16:00 -07:00
Tim Dettmers
758c7175a2 Merge branch 'debug' into cuda-bin-switch-and-cli 2022-08-04 08:03:00 -07:00
Tim Dettmers
ab72a1294f Added pre/post device call for extract outliers. 2022-08-04 07:47:22 -07:00
Tim Dettmers
cc5b323876 Merge branch 'extract_outliers' into debug 2022-08-04 07:40:48 -07:00
Tim Dettmers
6101a8fb9f Added pre and post device call to transform. 2022-08-04 07:28:12 -07:00
Tim Dettmers
320eacb4c2 Removed print statement. 2022-08-03 14:17:54 -07:00
Tim Dettmers
451fd9506e Added fixes for the case that matmullt dim A is zero, e.g. [0, 768]. 2022-08-03 11:54:01 -07:00
Tim Dettmers
2f01865a2f Added CUDA block assert and is_on_gpu check. 2022-08-03 09:05:37 -07:00
Titus von Koeller
96bc209baf tentative refactoring of the compute capabilities code 2022-08-02 21:27:36 -07:00
Titus von Koeller
59a615b386 factored cuda_setup.main out into smaller modules and functions 2022-08-02 21:26:50 -07:00
Titus von Koeller
3809236428 move cuda_setup code into subpackage 2022-08-02 07:42:27 -07:00
Tim Dettmers
e120c4a550 Fixed syntax error; bumped revision for beta release. 2022-08-01 20:05:03 -07:00
Tim Dettmers
3479d02a76 Added some more docs and comments. 2022-08-01 19:43:09 -07:00
Tim Dettmers
8bf3e9faab Added full env variable search; CONDA_PREFIX priority. 2022-08-01 19:22:41 -07:00
Titus von Koeller
c4fe6c69a3 deleted function that was moved but accidentally not removed in commit 2022-08-01 09:40:41 -07:00
Titus von Koeller
ea7c14f8ef reran black with linelength 80 for greater readability 2022-08-01 09:32:47 -07:00
Titus von Koeller
3fd06fb620 refactored subshell execution code for greater readability and moved it to utils 2022-08-01 09:30:29 -07:00
Titus von Koeller
54efd874a8 flake8 found some stuff that needs fixing before the release 2022-08-01 03:32:34 -07:00
Titus von Koeller
bfa0e33294 ran black and isort for coherent code formatting 2022-08-01 03:31:48 -07:00
Titus von Koeller
597a8521b2 fix typo 2022-08-01 03:22:44 -07:00
Titus von Koeller
57fa64628f minor refactor to more concise syntax 2022-08-01 03:22:12 -07:00
Tim Dettmers
4a6ea7e24b Added adjusted build file. 2022-07-31 20:59:34 -07:00
Tim Dettmers
28d1e7dc01 Initial build script changes (untested on PyPi). 2022-07-31 19:41:56 -07:00
Tim Dettmers
dd50382b32 Full evaluate_cuda setup with integration test. 2022-07-31 17:47:44 -07:00
Titus von Koeller
5d90b38c4d adding CLI tool for CUDA install debugging - intermediate commit 2022-07-27 21:16:04 -07:00
Tim Dettmers
bd515328d7 Fixed deployment script to check for LD_LIBRARY_PATH. 2022-07-27 05:57:50 -07:00
Tim Dettmers
389f66ca5a Fixed direct extraction masking. 2022-07-27 01:46:35 -07:00
Tim Dettmers
a409213656 Fixed make default to compile with cublaslt. 2022-07-26 19:38:17 -07:00
Tim Dettmers
5737f2b027 Merge branch 'patch_merge' into extract_outliers 2022-07-26 19:38:01 -07:00
Tim Dettmers
47a73d94c3 Matmullt with direct outlier extraction for 8-bit inference. 2022-07-26 19:15:35 -07:00
Tim Dettmers
32fa459ed7 Added col_ampere outlier extraction kernel. 2022-07-26 18:15:51 -07:00
Tim Dettmers
bcab99ec87 Working outlier extraction for Turing. 2022-07-26 17:39:30 -07:00
Tim Dettmers
cbb901ac51 Boilerplate and test for extract_outliers. 2022-07-26 12:12:38 -07:00
Tim Dettmers
dc8c9efdb3 Changed setup.py; deployed on test pypi. 2022-07-26 10:32:22 -07:00
Tim Dettmers
953b7285dd Fixed cpuonly build. 2022-07-26 09:12:16 -07:00
Tim Dettmers
f2dd703251 Added matmul build and flags. 2022-07-25 22:34:14 -07:00
Tim Dettmers
9268dc9d88 Some progress on build script; added multi-cuda install script. 2022-07-25 19:30:37 -07:00
Tim Dettmers
1e88edd8c0 Removed rowscale (segfaults on ampere). 2022-07-25 17:27:57 -07:00
Tim Dettmers
8b1fd32e3e Fixed makefile; fixed Ampere igemmlt_8 bug. 2022-07-25 14:02:14 -07:00
Tim Dettmers
7d2ecd30c0 Fixed rowcol synchronization bug. 2022-07-22 15:21:37 -07:00
Tim Dettmers
c771b3a75a Most tests passing. 2022-07-22 14:41:05 -07:00
Tim Dettmers
4cd7ea62b2 Merge pull request #3 from TimDettmers/cpuonly: Add a CPU-only build option 2022-07-18 09:51:37 -07:00
Max Ryabinin
fd750cd237 Update README.md 2022-07-01 17:46:29 +03:00
Max Ryabinin
025824d29b Reduce diff 2022-07-01 17:42:58 +03:00
Max Ryabinin
575aa698fa Reduce diff 2022-07-01 17:41:48 +03:00
Max Ryabinin
4d1d5b569f Reduce diff 2022-07-01 17:40:02 +03:00