Tim Dettmers
|
c00402f17e
|
Fixed a bug in absmax float conversion.
|
2023-07-13 21:47:38 -07:00 |
|
Tim Dettmers
|
67475257a9
|
Added documentation for NF4; failing 8-bit matmul; fixed absmax bug. #529 #543
|
2023-07-13 21:41:43 -07:00 |
|
Tim Dettmers
|
097b1cc5da
|
Fixed bug caused by undefined default type of absmax. #553
|
2023-07-13 21:23:33 -07:00 |
|
Tim Dettmers
|
90b0ac57b0
|
Fixed missing bias in bnb.matmul_4bit for inference; more tests.
|
2023-07-11 17:13:33 -07:00 |
|
Tim Dettmers
|
ba51d95d43
|
Added more extensive gemv tests; blocksize guard for gemv.
|
2023-07-11 05:55:49 -07:00 |
|
Tim Dettmers
|
5fab673442
|
Added fp32 compute type for gemv_4bit.
|
2023-07-09 21:06:01 -07:00 |
|
Tim Dettmers
|
cef519c89e
|
Added test for Params4bit.to() and fixed double quant behavior.
|
2023-07-09 17:16:50 -07:00 |
|
Tim Dettmers
|
6a905be5ce
|
Fixed a bug where gemv_4bit would return a wrongly sized tensor.
|
2023-07-09 15:34:02 -07:00 |
|
Tim Dettmers
|
0f0390acb2
|
Added double quantization support and tests.
|
2023-07-09 15:32:03 -07:00 |
|
Tim Dettmers
|
94168d79d7
|
Added FP4 fast inference support.
|
2023-07-09 14:46:19 -07:00 |
|
Tim Dettmers
|
4b88d69de7
|
Added arbitrary data types; fixed a bug for small matrices.
|
2023-07-09 12:04:09 -07:00 |
|
Tim Dettmers
|
02fd80cb81
|
Added bfloat16 quantizations and tests.
|
2023-07-04 19:58:31 -07:00 |
|
Tim Dettmers
|
f89ff93e26
|
Initial 4-bit naive batch size 1, 81 vs 185.
|
2023-07-03 18:45:38 -07:00 |
|
Tim Dettmers
|
1b8772a8f3
|
Added PagedLion and bf16 Lion.
|
2023-05-23 19:37:38 -07:00 |
|
Tim Dettmers
|
2bce175d15
|
Fixed Makefile.
|
2023-05-23 18:42:19 -07:00 |
|
Tim Dettmers
|
675baa79d2
|
Merge remote-tracking branch 'origin/main' into merge
|
2023-05-07 13:34:03 -07:00 |
|
Tim Dettmers
|
41a9c70814
|
Changed prefetching.
|
2023-05-06 18:59:59 -07:00 |
|
Tim Dettmers
|
44d68ff29c
|
Added paged optimizers.
|
2023-05-06 14:59:29 -07:00 |
|
Tim Dettmers
|
ec38ba95b0
|
Added paging.
|
2023-05-06 11:14:06 -07:00 |
|
Tim Dettmers
|
264a948539
|
4-bit draft; 128 vector load 240.
|
2023-05-02 16:15:38 -07:00 |
|
Tim Dettmers
|
f9bfea8f23
|
Baseline for debugging.
|
2023-05-02 07:24:12 -07:00 |
|
Tim Dettmers
|
21723f796a
|
4-bit draft.
|
2023-04-29 21:52:47 -07:00 |
|
Tim Dettmers
|
f6df4aef6a
|
Added fp16 and thread/item template.
|
2023-04-28 18:26:52 -07:00 |
|
Tim Dettmers
|
3aef78342a
|
Added template refactor.
|
2023-04-28 17:34:08 -07:00 |
|
Tim Dettmers
|
c1bfb210c5
|
First baseline kernel.
|
2023-04-28 17:19:02 -07:00 |
|
Tim Dettmers
|
9cab14a3ff
|
Added pipeline draft.
|
2023-04-27 15:12:49 -07:00 |
|
Tim Dettmers
|
d1c4c20568
|
Added non-cutlass template.
|
2023-04-27 15:11:26 -07:00 |
|
Tim Dettmers
|
0afc8e9e2f
|
Best attempt at cutlass3.
|
2023-04-26 17:12:34 -07:00 |
|
Tim Dettmers
|
84964db937
|
CUTLASS compiles.
|
2023-04-25 17:15:51 -07:00 |
|
Tim Dettmers
|
0f9d30207f
|
Added nested quantization for blockwise quantization.
|
2023-04-19 11:48:47 -07:00 |
|
Tim Dettmers
|
7dc198feb7
|
Added 32-bit optimizer for bfloat16 gradients.
|
2023-04-17 18:01:49 -07:00 |
|
Tim Dettmers
|
7140c01405
|
Merge branch 'main' into fp8_merge
|
2023-04-12 11:44:39 -07:00 |
|
Tim Dettmers
|
2bb5c00ba9
|
Added pre/post call to all lib calls. Fixes #120
|
2023-04-11 09:36:56 -07:00 |
|
Tim Dettmers
|
b0ec20c3b3
|
Merge pull request #188 from lucidrains/main
Lion 8 bit
|
2023-04-11 07:22:45 -07:00 |
|
Tim Dettmers
|
e9fa03b717
|
Some fixes for loading PEFT modules with Params4bit.
|
2023-04-07 09:59:21 -07:00 |
|
Tim Dettmers
|
4ea489d3bf
|
Refactor FP4 into 4Bit and integrate NF4 data type.
|
2023-04-03 11:00:12 -07:00 |
|
Tim Dettmers
|
64cc05920d
|
First draft of NF4.
|
2023-04-02 16:10:35 -07:00 |
|
Tim Dettmers
|
4ad999d144
|
Added quantization tree generation.
|
2023-04-02 14:42:45 -07:00 |
|
Tim Dettmers
|
0d332a641f
|
Added normal with extra value.
|
2023-04-02 14:09:08 -07:00 |
|
Tim Dettmers
|
51a21df728
|
Added 8-bit compression to quantization statistics.
|
2023-04-01 16:10:18 -07:00 |
|
Tim Dettmers
|
c4cfe4fbdd
|
Added bf16 Adam.
|
2023-04-01 10:33:03 -07:00 |
|
Tim Dettmers
|
8645d1f71c
|
Added normal quant.
|
2023-03-29 18:41:37 -07:00 |
|
Ji Lin
|
b6383ba116
|
fix a bug in quantize_no_absmax and dequantize_no_absmax with multiple gpus
|
2023-03-22 22:14:57 -04:00 |
|
Phil Wang
|
cb4c3c8c66
|
do a bunch of typical bookkeeping before getting to main lion logic
|
2023-03-09 10:10:19 -08:00 |
|
Tim Dettmers
|
2489d819c5
|
Added more blocksizes for stochastic rounding; fixed dequant blocksize.
|
2023-02-14 13:55:17 -08:00 |
|
Tim Dettmers
|
cfe4705e32
|
Added matmul_fp4 to the benchmark.
|
2023-02-04 22:00:04 -08:00 |
|
Tim Dettmers
|
160a83580d
|
Forward matmul_fp4 tests pass.
|
2023-02-04 21:11:21 -08:00 |
|
Tim Dettmers
|
3ac5840c03
|
Added fp4 quant/dequant and dequant optimizations.
|
2023-02-04 14:52:04 -08:00 |
|
Tim Dettmers
|
c9f505064e
|
Added outlier detector and fake quantization layer.
|
2023-01-28 17:05:22 -08:00 |
|
Tim Dettmers
|
c91f592ad7
|
Merge branch 'main' into cleanup
|
2023-01-02 11:19:16 +01:00 |
|