| Author | Commit | Message | Date |
|---|---|---|---|
| Phil Wang | c5582724d5 | missed adagrad | 2023-03-09 14:05:45 -08:00 |
| Phil Wang | af03430992 | fix weight decay for lion to be decoupled, using a switch | 2023-03-09 14:03:07 -08:00 |
| Phil Wang | ead570a43e | remove something rmsprop specific | 2023-03-09 11:58:31 -08:00 |
| Phil Wang | c83888aa1a | use epsilon as beta2 for lion, complete most of the logic in kernel.cu for all functions | 2023-03-09 11:54:54 -08:00 |
| Phil Wang | 64bb1ae8d1 | add a sign function, for lion | 2023-03-09 11:10:28 -08:00 |
| Phil Wang | cb4c3c8c66 | do a bunch of typical bookkeeping before getting to main lion logic | 2023-03-09 10:10:19 -08:00 |
| Tim Dettmers | c91f592ad7 | Merge branch 'main' into cleanup | 2023-01-02 11:19:16 +01:00 |
| Tim Dettmers | c059bd2848 | Added additional blocksizes: {64, 128, 256}. | 2022-11-20 14:18:15 -08:00 |
| Tom Aarsen | b104ce3b62 | Merge branch 'main' into cleanup | 2022-11-17 15:22:29 +01:00 |
| Tim Dettmers | 08fa2e7b01 | Fixed bug in cpu quant; faster GPU dequant. | 2022-11-07 18:06:18 -08:00 |
| Tim Dettmers | 6bc2b992be | Added blocksizes 2048, 1024, and 512 to blockwise quant. | 2022-11-06 16:27:48 -08:00 |
| Tom Aarsen | 1eec77d34c | Remove trailing whitespace & ensure newline at EOF | 2022-10-27 13:11:29 +02:00 |
| Tim Dettmers | dede343033 | Added fused bias in dequant_mm. | 2022-08-16 11:12:09 -07:00 |
| Tim Dettmers | 1ed2fa2f21 | Removed storage() from get_ptr; added boilerplate for bias dequant_mm. | 2022-08-16 10:56:17 -07:00 |
| Tim Dettmers | 5737f2b027 | Merge branch 'patch_merge' into extract_outliers | 2022-07-26 19:38:01 -07:00 |
| Tim Dettmers | 32fa459ed7 | Added col_ampere outlier extraction kernel. | 2022-07-26 18:15:51 -07:00 |
| Tim Dettmers | bcab99ec87 | Working outlier extraction for Turing. | 2022-07-26 17:39:30 -07:00 |
| Tim Dettmers | cbb901ac51 | Boilerplate and test for extract_outliers. | 2022-07-26 12:12:38 -07:00 |
| Tim Dettmers | 9268dc9d88 | Some progress on build script; added multi-cuda install script. | 2022-07-25 19:30:37 -07:00 |
| Tim Dettmers | 7d2ecd30c0 | Fixed rowcol synchronization bug. | 2022-07-22 15:21:37 -07:00 |
| Tim Dettmers | c771b3a75a | Most tests passing. | 2022-07-22 14:41:05 -07:00 |
| Tim Dettmers | 2f8083bd8b | Added AdamW. #10 #13 | 2021-11-28 21:18:11 -08:00 |
| Tim Dettmers | 8b3c0f355c | Added adagrad with tests (no clipping). | 2021-11-10 15:10:02 -08:00 |
| Tim Dettmers | 0fb378b4ee | Added compilation from source instructions; easier compilation. | 2021-10-21 17:22:43 -07:00 |
| Tim Dettmers | a6eae2e7f2 | Added skip_zeros; tests are passing. | 2021-10-20 19:15:47 -07:00 |
| Tim Dettmers | bb34fd50a1 | Initial plumbing for skip_zeros. | 2021-10-20 18:37:44 -07:00 |
| Tim Dettmers | 7439924891 | Initial commit | 2021-10-05 19:16:20 -07:00 |