Tim Dettmers
|
b0ec20c3b3
|
Merge pull request #188 from lucidrains/main
Lion 8 bit
|
2023-04-11 07:22:45 -07:00 |
|
Tim Dettmers
|
d3e0e39def
|
Merge pull request #190 from svgsponer/Fix#157
Fix #157; Add XDG_GREETER_DATA_DIR to ignorelist
|
2023-04-11 07:20:16 -07:00 |
|
Tim Dettmers
|
c7875533ce
|
Merge pull request #213 from tonylins/dev/fix_no_absmax
Fix a bug in (de)quantize_no_absmax with multiple GPUs
|
2023-04-11 07:18:24 -07:00 |
|
Tim Dettmers
|
6b4c5afe21
|
Merge pull request #260 from rapsealk/fix_libsbitsandbytes_cpu_so
Fixed typo libsbitsandbytes_cpu.so
|
2023-04-11 07:15:42 -07:00 |
|
Tim Dettmers
|
72efa32962
|
Merge pull request #292 from justheuristic/patch-2
Support nvidia16 GPUs
|
2023-04-11 07:14:12 -07:00 |
|
justheuristic
|
5e456be50e
|
Support 1650, 1660
|
2023-04-10 21:26:52 +03:00 |
|
Jeongseok Kang
|
8cceff72db
|
Fixed typo libsbitsandbytes_cpu.so
|
2023-04-05 09:28:41 +09:00 |
|
Ji Lin
|
b6383ba116
|
fix a bug in quantize_no_absmax and dequantize_no_absmax with multiple gpus
|
2023-03-22 22:14:57 -04:00 |
|
Phil Wang
|
2a6828e6fb
|
fix comment
|
2023-03-22 09:56:50 -07:00 |
|
Phil Wang
|
978ba2db57
|
another tab/spaces fix
|
2023-03-22 09:33:47 -07:00 |
|
Phil Wang
|
916000c8bf
|
fix consistent tabs / spaces
|
2023-03-22 09:27:13 -07:00 |
|
Phil Wang
|
aa9b939edd
|
add some comments, and fix use of g_val
|
2023-03-22 09:22:19 -07:00 |
|
Phil Wang
|
a43cd2008d
|
add some code in test_optim.py, although it seems to be failing
|
2023-03-22 09:14:05 -07:00 |
|
Phil Wang
|
9b656f461a
|
follow advice of Tim to fix update of momentum vs parameters in blockwise 8 bit
|
2023-03-22 07:52:59 -07:00 |
|
Tim Dettmers
|
49a04253fb
|
Bumped version for CUDA 12.1 support release.
|
2023-03-21 15:10:19 -07:00 |
|
Tim Dettmers
|
d032618d7f
|
Merge pull request #180 from ubik2/patch-1
Update compile_from_source.md to mention cuda12x target
|
2023-03-21 14:08:32 -07:00 |
|
Tim Dettmers
|
1b0aabc7e4
|
Added CUDA 12.1, addressing #201
|
2023-03-21 14:06:08 -07:00 |
|
Tim Dettmers
|
2c8352e316
|
Bumped version.
|
2023-03-12 10:24:25 -07:00 |
|
Tim Dettmers
|
ec5fbf4cc4
|
Merge pull request #115 from kashif/patch-1
Fix for python 3.7
|
2023-03-12 10:22:15 -07:00 |
|
Severin Gsponer
|
c4866ab06e
|
Fix #157; Add XDG_GREETER_DATA_DIR to ignorelist
|
2023-03-11 15:35:23 +01:00 |
|
Phil Wang
|
369a51c432
|
switch all eps to beta2
|
2023-03-10 14:08:35 -08:00 |
|
Phil Wang
|
6c377b39b6
|
always pass beta2 into all the 1state functions
|
2023-03-10 13:00:59 -08:00 |
|
Phil Wang
|
abbe65adfc
|
beta2 is actually accessible in kOptimizerStatic8bit1StateBlockwise
|
2023-03-10 12:50:14 -08:00 |
|
Phil Wang
|
19b9ef34b9
|
whoops
|
2023-03-10 08:59:49 -08:00 |
|
Phil Wang
|
c99b44f774
|
do the epsilon beta2 switcharoo within the cuda code, and not within the python class (so that the state dict still makes sense)
|
2023-03-10 08:57:59 -08:00 |
|
Phil Wang
|
8618bed001
|
swap the order in which momentum and parameters are updated in ops.cu
|
2023-03-10 08:39:06 -08:00 |
|
Phil Wang
|
c5582724d5
|
missed adagrad
|
2023-03-09 14:05:45 -08:00 |
|
Phil Wang
|
af03430992
|
fix weight decay for lion to be decoupled, using a switch
|
2023-03-09 14:03:07 -08:00 |
|
Phil Wang
|
ead570a43e
|
remove something rmsprop specific
|
2023-03-09 11:58:31 -08:00 |
|
Phil Wang
|
c83888aa1a
|
use epsilon as beta2 for lion, complete most of the logic in kernel.cu for all functions
|
2023-03-09 11:54:54 -08:00 |
|
Phil Wang
|
64bb1ae8d1
|
add a sign function, for lion
|
2023-03-09 11:10:28 -08:00 |
|
Phil Wang
|
8de29fc364
|
forget about tests for now, will test live on local enwik8 training
|
2023-03-09 10:11:32 -08:00 |
|
Phil Wang
|
cb4c3c8c66
|
do a bunch of typical bookkeeping before getting to main lion logic
|
2023-03-09 10:10:19 -08:00 |
|
Phil Wang
|
d43ea9722c
|
make sure interface is correct
|
2023-03-09 09:45:33 -08:00 |
|
Phil Wang
|
7247cb4554
|
initial commit, slowly work from interface into the kernel
|
2023-03-09 08:08:46 -08:00 |
|
ubik2
|
dba11b0b2e
|
Update compile_from_source.md
Add cuda12x to the list of targets
|
2023-03-06 16:57:57 -08:00 |
|
Kashif Rasul
|
c52365ac1d
|
Merge branch 'main' into patch-1
|
2023-02-03 09:01:48 +01:00 |
|
Tim Dettmers
|
0f5c394870
|
Added version 0.37.0.
|
2023-02-01 20:27:01 -08:00 |
|
Tim Dettmers
|
de53588934
|
Added Int8 matmul support for all GPUs. Full backward support.
|
2023-02-01 20:09:31 -08:00 |
|
Tim Dettmers
|
92ab6a8d5f
|
Merge pull request #119 from stas00/patch-1
improve install instructions
|
2023-02-01 19:21:36 -08:00 |
|
Stas Bekman
|
c5372a8567
|
improve install instructions
|
2023-01-05 13:34:51 -08:00 |
|
Kashif Rasul
|
59bf8fcff2
|
fix CUDASetup call
|
2023-01-04 17:47:18 +01:00 |
|
Kashif Rasul
|
792f6213a7
|
Fix for python 3.7
|
2023-01-04 17:38:33 +01:00 |
|
Tim Dettmers
|
1341fb44ad
|
Fixed issue where the CUDA SETUP was not printed.
|
2023-01-04 03:50:53 -08:00 |
|
Tim Dettmers
|
3901ebf7ae
|
Added CUDA 12.0 support; removed CC 3.0 support.
|
2023-01-04 02:28:33 -08:00 |
|
Tim Dettmers
|
b3de19218e
|
Added error message for unexpected CUDA exception.
|
2023-01-03 06:57:07 -08:00 |
|
Tim Dettmers
|
81990491ff
|
Merge pull request #113 from Borzik/fix-warnings
Import missing warn function
|
2023-01-03 15:46:58 +01:00 |
|
Tim Dettmers
|
9180b4cc11
|
Added additional error message for cudart error #85
|
2023-01-03 06:44:11 -08:00 |
|
Tim Dettmers
|
dfb049f8e4
|
Added Python >= 3.8 requirement.
|
2023-01-03 06:20:06 -08:00 |
|
Tim Dettmers
|
211ad594df
|
Added error+instructions for unsupported CUDA 10.0 version #82
|
2023-01-03 06:07:35 -08:00 |
|