Commit Graph

264 Commits

Author SHA1 Message Date
Mitchell Wortsman
b373034e31 test 2023-03-29 19:04:53 +00:00
Mitchell Wortsman
5f3d9ada8d triton-v1 2023-03-29 06:47:08 +00:00
Mitchell Wortsman
51f8bb7133 pre-triton update 2023-03-24 05:44:42 +00:00
Mitchell Wortsman
75377d125e new experiments 2023-02-24 00:10:15 +00:00
Tim Dettmers
5d2e23e8d6 Merge branch 'fp8sim' of github.com:TimDettmers/bitsandbytes into fp8sim 2023-02-23 10:56:49 -08:00
Tim Dettmers
c5c38ca19c Added matmul_mixed. 2023-02-23 10:45:18 -08:00
Mitchell Wortsman
3fbf60ad83 sim now worse than real 2023-02-23 08:27:15 +00:00
Mitchell Wortsman
7b764d3569 adding half() cast 2023-02-21 03:53:44 +00:00
Tim Dettmers
2489d819c5 Added more blocksizes for stochastic rounding; fixed dequant blocksize. 2023-02-14 13:55:17 -08:00
Tim Dettmers
2dfa3ce16d Fixed LinearFP8 and added tests. 2023-02-13 17:48:52 -08:00
Tim Dettmers
fa255cbc56 Added missing import. 2023-02-13 17:29:39 -08:00
Tim Dettmers
ca3236587a Added forward/backward tests; removed bias. 2023-02-13 17:20:52 -08:00
Tim Dettmers
6bdb6c351e Added fp8 simulation layer. 2023-02-13 16:53:07 -08:00
Tim Dettmers
c9f505064e Added outlier detector and fake quantization layer. 2023-01-28 17:05:22 -08:00
Tim Dettmers
1341fb44ad Fixed issue where the CUDA SETUP was not printed. 2023-01-04 03:50:53 -08:00
Tim Dettmers
3901ebf7ae Added CUDA 12.0 support; removed CC 3.0 support. 2023-01-04 02:28:33 -08:00
Tim Dettmers
b3de19218e Added error message for unexpected CUDA exception. 2023-01-03 06:57:07 -08:00
Tim Dettmers
81990491ff Merge pull request #113 from Borzik/fix-warnings: Import missing warn function 2023-01-03 15:46:58 +01:00
Tim Dettmers
9180b4cc11 Added additional error message for cudart error #85 2023-01-03 06:44:11 -08:00
Tim Dettmers
dfb049f8e4 Added Python >= 3.8 requirement. 2023-01-03 06:20:06 -08:00
Tim Dettmers
211ad594df Added error+instructions for unsupported CUDA 10.0 version #82 2023-01-03 06:07:35 -08:00
Felix Borzik
f3800bab75 import warn function 2023-01-03 13:23:34 +00:00
Tim Dettmers
9d353ca786 Merge pull request #87 from lostmsu/main: Add `device` and `dtype` parameters to `StableEmbedding` 2023-01-02 13:22:45 +01:00
Tim Dettmers
7a6563b6c8 Default to CPU library on CUDA error+small refactor. 2023-01-02 03:47:09 -08:00
Tim Dettmers
d9112dc55b Merge pull request #110 from BlackHC/cublaslt_version: Improve cc version detection for cublaslt 2023-01-02 12:35:53 +01:00
Tim Dettmers
336e24696c CUDASetup only executed once + fixed circular import. 2023-01-02 03:31:43 -08:00
Tim Dettmers
df9a9b0c4c Merge pull request #77 from Cyberes/main: Allow hiding of the welcome message 2023-01-02 11:28:17 +01:00
Tim Dettmers
be5cecb88f Merge branch 'main' into main 2023-01-02 11:23:17 +01:00
Tim Dettmers
f0ec93d016 Merge pull request #76 from tomaarsen/cleanup: Cleanup involving a handful of failures, some optimization and a lot of code quality improvements 2023-01-02 11:19:28 +01:00
Tim Dettmers
c91f592ad7 Merge branch 'main' into cleanup 2023-01-02 11:19:16 +01:00
blackhc
ed17aa9a31 Don't mark it as failure though. 2022-12-29 23:50:48 +00:00
blackhc
7b39a5511d Fix issue #97 2022-12-29 23:47:21 +00:00
Tim Dettmers
c059bd2848 Added additional blocksizes: {64, 128, 256}. 2022-11-20 14:18:15 -08:00
Tim Dettmers
eb028e6ebc Fixed k-bit quantization maps. 2022-11-19 07:24:03 -08:00
Tom Aarsen
b104ce3b62 Merge branch 'main' into cleanup 2022-11-17 15:22:29 +01:00
Tim Dettmers
08fa2e7b01 Fixed bug in cpu quant; faster GPU dequant. 2022-11-07 18:06:18 -08:00
Tim Dettmers
62a333ac40 Added pre/post calls do quantize_blockwise. 2022-11-06 17:17:51 -08:00
Tim Dettmers
e0e697b150 Fixed blockwise test and logic. 2022-11-06 16:36:31 -08:00
Tim Dettmers
6bc2b992be Added blocksizes 2048, 1024, and 512 to blockwise quant. 2022-11-06 16:27:48 -08:00
Tim Dettmers
2f2063bac2 Added k<256 quantile estimate. 2022-11-06 13:05:25 -08:00
Tim Dettmers
98cbc4bc4f Added k-bit fp8 map. 2022-11-06 11:59:37 -08:00
Tim Dettmers
caf1832526 Added k-bit linear quantization. 2022-11-06 11:47:54 -08:00
Victor Nova
62d39a237c add device and dtype parameters to StableEmbedding 2022-11-04 14:12:46 -07:00
Tim Dettmers
1efb87d89d Added FP8 quantization map. 2022-11-03 19:49:50 -07:00
Tom Aarsen
62c0bd2278 Fix several typos in logging and comments (via codespell) 2022-11-01 09:53:47 +01:00
Tom Aarsen
d504050ff7 Call isort over cuda_setup/main.py 2022-11-01 09:46:03 +01:00
Tom Aarsen
30f28b94a0 Merge branch 'main' into cleanup 2022-11-01 09:43:49 +01:00
Tim Dettmers
8d87c0b852 Fixed CUDA setup bugs, including #81. 2022-10-31 18:04:49 -07:00
adpkadspokasdk
8724c990c7 allow hiding of the welcome message 2022-10-27 16:04:49 -06:00
Tom Aarsen
2a91e15113 Remove outdated linter log 2022-10-27 20:50:49 +02:00