Tim Dettmers
|
5b612bc6df
|
Added is_available_triton guard to Triton SwitchBackLinear.
|
2023-04-12 12:16:55 -07:00 |
|
Tim Dettmers
|
c3d87e4435
|
Added is_available_triton guard.
|
2023-04-12 12:10:34 -07:00 |
|
Tim Dettmers
|
7140c01405
|
Merge branch 'main' into fp8_merge
|
2023-04-12 11:44:39 -07:00 |
|
Tim Dettmers
|
32f8c89201
|
Added missing example folder.
|
2023-04-12 11:27:31 -07:00 |
|
Tim Dettmers
|
dd562c24f1
|
Refactored simulated fp8 modules into research.nn.
|
2023-04-12 11:24:44 -07:00 |
|
Tim Dettmers
|
e67bfccbcd
|
Added missing triton and fp8 files.
|
2023-04-12 10:06:18 -07:00 |
|
Tim Dettmers
|
ec1ea63711
|
Refactored triton into its own folder. Refactored fp8 matmuls.
|
2023-04-12 09:39:39 -07:00 |
|
Tim Dettmers
|
7c651012fc
|
Added better error message for debugging on CUDA not detected failures.
|
2023-04-12 07:56:52 -07:00 |
|
Tim Dettmers
|
659a7dfc71
|
Fixing #300.
|
2023-04-11 16:14:29 -07:00 |
|
Tim Dettmers
|
eb1c331c84
|
Updates README and CHANGELOG.
|
2023-04-11 15:49:01 -07:00 |
|
Tim Dettmers
|
89e3b82731
|
Added more detailed cuda setup debug and debugging instructions.
|
2023-04-11 13:47:10 -07:00 |
|
Tim Dettmers
|
4cd63deff3
|
Fixed CUDA Conda PyTorch 2.0 issues.
|
2023-04-11 12:10:20 -07:00 |
|
Tim Dettmers
|
2bb5c00ba9
|
Added pre/post call to all lib calls. Fixes #120
|
2023-04-11 09:36:56 -07:00 |
|
Tim Dettmers
|
29ab3a6b14
|
Updated change log.
|
2023-04-11 09:26:52 -07:00 |
|
Tim Dettmers
|
2eb3108356
|
Fixed bug where beta2 was not passed into Lion 32-bit.
|
2023-04-11 09:16:01 -07:00 |
|
Tim Dettmers
|
792af5c883
|
Fixed noisy tests for 8-bit Lion.
|
2023-04-11 08:42:41 -07:00 |
|
Tim Dettmers
|
0b2ebcdab9
|
Added launch bounds to fix launch resource error for Lion.
|
2023-04-11 08:37:02 -07:00 |
|
Tim Dettmers
|
ed6f3eb146
|
Merge pull request #159 from TimDettmers/serialize_8bit
Implement proper serialization of Linear8bitLt
|
2023-04-11 07:24:51 -07:00 |
|
Tim Dettmers
|
b0ec20c3b3
|
Merge pull request #188 from lucidrains/main
Lion 8 bit
|
2023-04-11 07:22:45 -07:00 |
|
Tim Dettmers
|
d3e0e39def
|
Merge pull request #190 from svgsponer/Fix#157
Fix #157; Add XDG_GREETER_DATA_DIR to ignorelist
|
2023-04-11 07:20:16 -07:00 |
|
Tim Dettmers
|
c7875533ce
|
Merge pull request #213 from tonylins/dev/fix_no_absmax
Fix a bug in (de)quantize_no_absmax with multiple GPUs
|
2023-04-11 07:18:24 -07:00 |
|
Tim Dettmers
|
6b4c5afe21
|
Merge pull request #260 from rapsealk/fix_libsbitsandbytes_cpu_so
Fixed typo libsbitsandbytes_cpu.so
|
2023-04-11 07:15:42 -07:00 |
|
Tim Dettmers
|
72efa32962
|
Merge pull request #292 from justheuristic/patch-2
Support nvidia16 GPUs
|
2023-04-11 07:14:12 -07:00 |
|
justheuristic
|
5e456be50e
|
Support 1650, 1660
|
2023-04-10 21:26:52 +03:00 |
|
Mitchell Wortsman
|
d677a71607
|
typo
|
2023-04-08 19:36:17 +00:00 |
|
Mitchell Wortsman
|
da524d97c9
|
mem efficient
|
2023-04-08 19:34:18 +00:00 |
|
Tim Dettmers
|
e9fa03b717
|
Some fixes for loading PEFT modules with Params4bit.
|
2023-04-07 09:59:21 -07:00 |
|
Jeongseok Kang
|
8cceff72db
|
Fixed typo libsbitsandbytes_cpu.so
|
2023-04-05 09:28:41 +09:00 |
|
Tim Dettmers
|
1ccb7bdec6
|
Fixed ParamsIn4 init; fixed PyTorch 2.0 test failure.
|
2023-04-03 18:47:00 -07:00 |
|
Tim Dettmers
|
4ea489d3bf
|
Refactor FP4 into 4Bit and integrate NF4 data type.
|
2023-04-03 11:00:12 -07:00 |
|
Tim Dettmers
|
64cc05920d
|
First draft of NF4.
|
2023-04-02 16:10:35 -07:00 |
|
Tim Dettmers
|
4ad999d144
|
Added quantization tree generation.
|
2023-04-02 14:42:45 -07:00 |
|
Tim Dettmers
|
0d332a641f
|
Added normal with extra value.
|
2023-04-02 14:09:08 -07:00 |
|
Tim Dettmers
|
2dd5d69056
|
Generalized FP4 data type.
|
2023-04-02 12:42:01 -07:00 |
|
Mitchell Wortsman
|
eb6c53cf55
|
clarify in readme
|
2023-04-01 23:50:12 +00:00 |
|
Tim Dettmers
|
51a21df728
|
Added 8-bit compression to quantization statistics.
|
2023-04-01 16:10:18 -07:00 |
|
Mitchell Wortsman
|
2331212b35
|
add readme for speed bench
|
2023-04-01 19:13:15 +00:00 |
|
Mitchell Wortsman
|
7f87ba83ee
|
cleaning and refactor
|
2023-04-01 18:46:04 +00:00 |
|
Tim Dettmers
|
c4cfe4fbdd
|
Added bf16 Adam.
|
2023-04-01 10:33:03 -07:00 |
|
Tim Dettmers
|
30d21d585c
|
Added triton test.
|
2023-03-31 11:33:26 -07:00 |
|
Tim Dettmers
|
a13a522c4c
|
Added first triton test.
|
2023-03-31 11:20:54 -07:00 |
|
Tim Dettmers
|
8645d1f71c
|
Added normal quant.
|
2023-03-29 18:41:37 -07:00 |
|
Mitchell Wortsman
|
b373034e31
|
test
|
2023-03-29 19:04:53 +00:00 |
|
Mitchell Wortsman
|
5f3d9ada8d
|
triton-v1
|
2023-03-29 06:47:08 +00:00 |
|
Tim Dettmers
|
69810521d3
|
Some small changes.
|
2023-03-27 09:12:57 -07:00 |
|
Mitchell Wortsman
|
51f8bb7133
|
pre-triton update
|
2023-03-24 05:44:42 +00:00 |
|
Ji Lin
|
b6383ba116
|
fix a bug in quantize_no_absmax and dequantize_no_absmax with multiple gpus
|
2023-03-22 22:14:57 -04:00 |
|
Phil Wang
|
2a6828e6fb
|
fix comment
|
2023-03-22 09:56:50 -07:00 |
|
Phil Wang
|
978ba2db57
|
another tab/spaces fix
|
2023-03-22 09:33:47 -07:00 |
|
Phil Wang
|
916000c8bf
|
fix consistent tabs / spaces
|
2023-03-22 09:27:13 -07:00 |
|