Tim Dettmers
|
5fab673442
|
Added fp32 compute type for gemv_4bit.
|
2023-07-09 21:06:01 -07:00 |
|
Tim Dettmers
|
4b88d69de7
|
Added abitrary data types; fixed a bug for small matrices.
|
2023-07-09 12:04:09 -07:00 |
|
Tim Dettmers
|
f89ff93e26
|
Initial 4-bit naive batch size 1, 81 vs 185.
|
2023-07-03 18:45:38 -07:00 |
|
Tim Dettmers
|
675baa79d2
|
Merge remote-tracking branch 'origin/main' into merge
|
2023-05-07 13:34:03 -07:00 |
|
Tim Dettmers
|
ec38ba95b0
|
Added paging.
|
2023-05-06 11:14:06 -07:00 |
|
Tim Dettmers
|
21723f796a
|
4-bit draft.
|
2023-04-29 21:52:47 -07:00 |
|
Tim Dettmers
|
cad839941b
|
Added bit template.
|
2023-04-28 22:10:42 -07:00 |
|
Tim Dettmers
|
f3e97ccbd2
|
New implementation for batch size 1.
|
2023-04-28 21:29:40 -07:00 |
|
Tim Dettmers
|
3aef78342a
|
Added template refactor.
|
2023-04-28 17:34:08 -07:00 |
|
Tim Dettmers
|
c1bfb210c5
|
First baseline kernel.
|
2023-04-28 17:19:02 -07:00 |
|
Tim Dettmers
|
9cab14a3ff
|
Adedd pipeline draft.
|
2023-04-27 15:12:49 -07:00 |
|
Tim Dettmers
|
0afc8e9e2f
|
Best attempt at cutlass3.
|
2023-04-26 17:12:34 -07:00 |
|
Tim Dettmers
|
6bfd7a405f
|
Initial template.
|
2023-04-25 16:13:43 -07:00 |
|
Tim Dettmers
|
64cc05920d
|
First draft of NF4.
|
2023-04-02 16:10:35 -07:00 |
|
Phil Wang
|
cb4c3c8c66
|
do a bunch of typical bookkeeping before getting to main lion logic
|
2023-03-09 10:10:19 -08:00 |
|
Tim Dettmers
|
3ac5840c03
|
Added fp4 quant/dequant and dequant optimizations.
|
2023-02-04 14:52:04 -08:00 |
|
Tom Aarsen
|
b104ce3b62
|
Merge branch 'main' into cleanup
|
2022-11-17 15:22:29 +01:00 |
|
Tim Dettmers
|
6bc2b992be
|
Added blocksizes 2048, 1024, and 512 to blockwise quant.
|
2022-11-06 16:27:48 -08:00 |
|
Tom Aarsen
|
1eec77d34c
|
Remove trailing whitespace & ensure newline at EOF
|
2022-10-27 13:11:29 +02:00 |
|
Tim Dettmers
|
1ed2fa2f21
|
Removed storage() from get_ptr; added boilerplate for bias dequant_mm.
|
2022-08-16 10:56:17 -07:00 |
|
Tim Dettmers
|
cbb901ac51
|
Boilerplate and test for extract_outliers.
|
2022-07-26 12:12:38 -07:00 |
|
Tim Dettmers
|
c771b3a75a
|
Most tests passing.
|
2022-07-22 14:41:05 -07:00 |
|
Max Ryabinin
|
8258b4364a
|
Add a CPU-only build option
|
2022-07-01 17:16:10 +03:00 |
|
Tim Dettmers
|
8b3c0f355c
|
Added adagrad with tests (no clipping).
|
2021-11-10 15:10:02 -08:00 |
|
Tim Dettmers
|
bb34fd50a1
|
Initial plumbing for skip_zeros.
|
2021-10-20 18:37:44 -07:00 |
|
Tim Dettmers
|
7439924891
|
Initial commit
|
2021-10-05 19:16:20 -07:00 |
|