220 Commits

Author SHA1 Message Date
Tim Dettmers
5b612bc6df Added is_available_triton guard to Triton SwitchBackLinear. 2023-04-12 12:16:55 -07:00
Tim Dettmers
c3d87e4435 Added is_available_triton guard. 2023-04-12 12:10:34 -07:00
Tim Dettmers
7140c01405 Merge branch 'main' into fp8_merge 2023-04-12 11:44:39 -07:00
Tim Dettmers
dd562c24f1 Refactored simulated fp8 modules into research.nn. 2023-04-12 11:24:44 -07:00
Tim Dettmers
e67bfccbcd Added missing triton and fp8 files. 2023-04-12 10:06:18 -07:00
Tim Dettmers
ec1ea63711 Refactored triton into its own folder. Refactored fp8 matmuls. 2023-04-12 09:39:39 -07:00
Tim Dettmers
7c651012fc Added better error message for debugging on CUDA not detected failures. 2023-04-12 07:56:52 -07:00
Tim Dettmers
659a7dfc71 Fixing #300. 2023-04-11 16:14:29 -07:00
Tim Dettmers
89e3b82731 Added more detailed cuda setup debug and debugging instructions. 2023-04-11 13:47:10 -07:00
Tim Dettmers
4cd63deff3 Fixed CUDA Conda PyTorch 2.0 issues. 2023-04-11 12:10:20 -07:00
Tim Dettmers
2bb5c00ba9 Added pre/post call to all lib calls. Fixes #120 2023-04-11 09:36:56 -07:00
Tim Dettmers
2eb3108356 Fixed bug where beta2 was not passed into Lion 32-bit. 2023-04-11 09:16:01 -07:00
Tim Dettmers
ed6f3eb146 Merge pull request #159 from TimDettmers/serialize_8bit (Implement proper serialization of Linear8bitLt) 2023-04-11 07:24:51 -07:00
Tim Dettmers
b0ec20c3b3 Merge pull request #188 from lucidrains/main (Lion 8 bit) 2023-04-11 07:22:45 -07:00
Tim Dettmers
d3e0e39def Merge pull request #190 from svgsponer/Fix#157 (Fix #157; Add XDG_GREETER_DATA_DIR to ignorelist) 2023-04-11 07:20:16 -07:00
Tim Dettmers
c7875533ce Merge pull request #213 from tonylins/dev/fix_no_absmax (Fix a bug in (de)quantize_no_absmax with multiple GPUs) 2023-04-11 07:18:24 -07:00
Tim Dettmers
6b4c5afe21 Merge pull request #260 from rapsealk/fix_libsbitsandbytes_cpu_so (Fixed typo libsbitsandbytes_cpu.so) 2023-04-11 07:15:42 -07:00
justheuristic
5e456be50e Support 1650, 1660 2023-04-10 21:26:52 +03:00
Mitchell Wortsman
d677a71607 typo 2023-04-08 19:36:17 +00:00
Mitchell Wortsman
da524d97c9 mem efficient" 2023-04-08 19:34:18 +00:00
Jeongseok Kang
8cceff72db Fixed typo libsbitsandbytes_cpu.so 2023-04-05 09:28:41 +09:00
Mitchell Wortsman
7f87ba83ee cleaning and refactor 2023-04-01 18:46:04 +00:00
Tim Dettmers
a13a522c4c Added first triton test. 2023-03-31 11:20:54 -07:00
Mitchell Wortsman
5f3d9ada8d triton-v1 2023-03-29 06:47:08 +00:00
Mitchell Wortsman
51f8bb7133 pre-triton update 2023-03-24 05:44:42 +00:00
Ji Lin
b6383ba116 fix a bug in quantize_no_absmax and dequantize_no_absmax with multiple gpus 2023-03-22 22:14:57 -04:00
Severin Gsponer
c4866ab06e Fix #157; Add XDG_GREETER_DATA_DIR to ignorelist 2023-03-11 15:35:23 +01:00
Phil Wang
19b9ef34b9 whoops 2023-03-10 08:59:49 -08:00
Phil Wang
c99b44f774 do the epsilon beta2 switcharoo within the cuda code, and not within the python class (so that the state dict still makes sense) 2023-03-10 08:57:59 -08:00
Phil Wang
c83888aa1a use epsilon as beta2 for lion, complete most of the logic in kernel.cu for all functions 2023-03-09 11:54:54 -08:00
Phil Wang
cb4c3c8c66 do a bunch of typical bookkeeping before getting to main lion logic 2023-03-09 10:10:19 -08:00
Phil Wang
d43ea9722c make sure interface is correct 2023-03-09 09:45:33 -08:00
Phil Wang
7247cb4554 initial commit, slowly work from interface into the kernel 2023-03-09 08:08:46 -08:00
Max Ryabinin
24609b66af Reduce diff 2023-02-25 06:24:58 +01:00
Max Ryabinin
d15822a54b Refactor _tile_indices into a cached property, fix device bug 2023-02-25 06:23:07 +01:00
Max Ryabinin
cc608c04c2 Revert the layout if weights were reordered 2023-02-25 06:02:06 +01:00
Max Ryabinin
cd4d904a4c Raise an error when loading a quantized checkpoint before quantization 2023-02-25 06:01:34 +01:00
Mitchell Wortsman
75377d125e new experiments 2023-02-24 00:10:15 +00:00
Tim Dettmers
5d2e23e8d6 Merge branch 'fp8sim' of github.com:TimDettmers/bitsandbytes into fp8sim 2023-02-23 10:56:49 -08:00
Tim Dettmers
c5c38ca19c Added matmul_mixed. 2023-02-23 10:45:18 -08:00
Mitchell Wortsman
3fbf60ad83 sim now worse than real 2023-02-23 08:27:15 +00:00
Max Ryabinin
58b09ee1b1 [WIP] Implement proper serialization of Linear8bitLt 2023-02-21 12:04:47 +01:00
Mitchell Wortsman
7b764d3569 adding half() cast 2023-02-21 03:53:44 +00:00
Tim Dettmers
2489d819c5 Added more blocksizes for stochastic rounding; fixed dequant blocksize. 2023-02-14 13:55:17 -08:00
Tim Dettmers
2dfa3ce16d Fixed LinearFP8 and added tests. 2023-02-13 17:48:52 -08:00
Tim Dettmers
fa255cbc56 Added missing import. 2023-02-13 17:29:39 -08:00
Tim Dettmers
ca3236587a Added forward/backward tests; removed bias. 2023-02-13 17:20:52 -08:00
Tim Dettmers
6bdb6c351e Added fp8 simulation layer. 2023-02-13 16:53:07 -08:00
Kashif Rasul
c52365ac1d Merge branch 'main' into patch-1 2023-02-03 09:01:48 +01:00
Tim Dettmers
0f5c394870 Added version 0.37.0. 2023-02-01 20:27:01 -08:00