410 Commits

Author SHA1 Message Date
Phil Wang
aa9b939edd add some comments, and fix use of g_val 2023-03-22 09:22:19 -07:00
Phil Wang
a43cd2008d add some code in test_optim.py, although it seems to be failing 2023-03-22 09:14:05 -07:00
Phil Wang
9b656f461a follow advice of Tim to fix update of momentum vs parameters in blockwise 8 bit 2023-03-22 07:52:59 -07:00
Max Ryabinin
dcecbb26ca Add force_no_igemmlt to test params 2023-03-22 00:28:49 +01:00
Tim Dettmers
49a04253fb Bumped version for CUDA 12.1 support release. 2023-03-21 15:10:19 -07:00
Tim Dettmers
d032618d7f Merge pull request #180 from ubik2/patch-1; Update compile_from_source.md to mention cuda12x target 2023-03-21 14:08:32 -07:00
Tim Dettmers
1b0aabc7e4 Added CUDA 12.1. addressing #201 2023-03-21 14:06:08 -07:00
Tim Dettmers
2c8352e316 Bumped version. 2023-03-12 10:24:25 -07:00
Tim Dettmers
ec5fbf4cc4 Merge pull request #115 from kashif/patch-1; Fix for python 3.7 2023-03-12 10:22:15 -07:00
Severin Gsponer
c4866ab06e Fix #157; Add XDG_GREETER_DATA_DIR to ignorelist 2023-03-11 15:35:23 +01:00
Phil Wang
369a51c432 switch all eps to beta2 2023-03-10 14:08:35 -08:00
Phil Wang
6c377b39b6 always pass beta2 into all the 1state functions 2023-03-10 13:00:59 -08:00
Phil Wang
abbe65adfc beta2 is actually accessible in kOptimizerStatic8bit1StateBlockwise 2023-03-10 12:50:14 -08:00
Phil Wang
19b9ef34b9 whoops 2023-03-10 08:59:49 -08:00
Phil Wang
c99b44f774 do the epsilon beta2 switcharoo within the cuda code, and not within the python class (so that the state dict still makes sense) 2023-03-10 08:57:59 -08:00
Phil Wang
8618bed001 swap the order in which momentum and parameters are updated in ops.cu 2023-03-10 08:39:06 -08:00
Phil Wang
c5582724d5 missed adagrad 2023-03-09 14:05:45 -08:00
Phil Wang
af03430992 fix weight decay for lion to be decoupled, using a switch 2023-03-09 14:03:07 -08:00
Phil Wang
ead570a43e remove something rmsprop specific 2023-03-09 11:58:31 -08:00
Phil Wang
c83888aa1a use epsilon as beta2 for lion, complete most of the logic in kernel.cu for all functions 2023-03-09 11:54:54 -08:00
Phil Wang
64bb1ae8d1 add a sign function, for lion 2023-03-09 11:10:28 -08:00
Phil Wang
8de29fc364 forget about tests for now, will test live on local enwik8 training 2023-03-09 10:11:32 -08:00
Phil Wang
cb4c3c8c66 do a bunch of typical bookkeeping before getting to main lion logic 2023-03-09 10:10:19 -08:00
Phil Wang
d43ea9722c make sure interface is correct 2023-03-09 09:45:33 -08:00
Phil Wang
7247cb4554 initial commit, slowly work from interface into the kernel 2023-03-09 08:08:46 -08:00
ubik2
dba11b0b2e Update compile_from_source.md; Add cuda12x to the list of targets 2023-03-06 16:57:57 -08:00
Artidoro Pagnoni
6c31a5fe99 t5 model fix 2023-02-27 14:23:21 -08:00
Max Ryabinin
24609b66af Reduce diff 2023-02-25 06:24:58 +01:00
Max Ryabinin
d15822a54b Refactor _tile_indices into a cached property, fix device bug 2023-02-25 06:23:07 +01:00
Max Ryabinin
cc608c04c2 Revert the layout if weights were reordered 2023-02-25 06:02:06 +01:00
Max Ryabinin
cd4d904a4c Raise an error when loading a quantized checkpoint before quantization 2023-02-25 06:01:34 +01:00
Max Ryabinin
ac3ab281e3 Handle more cases in test_linear_serialization 2023-02-25 06:01:04 +01:00
Tim Dettmers
9851a10b46 Added cast to fp4 layer for speed. 2023-02-24 10:17:57 -08:00
Mitchell Wortsman
75377d125e new experiments 2023-02-24 00:10:15 +00:00
Tim Dettmers
5d2e23e8d6 Merge branch 'fp8sim' of github.com:TimDettmers/bitsandbytes into fp8sim 2023-02-23 10:56:49 -08:00
Tim Dettmers
c5c38ca19c Added matmul_mixed. 2023-02-23 10:45:18 -08:00
Mitchell Wortsman
3fbf60ad83 sim now worse than real 2023-02-23 08:27:15 +00:00
Max Ryabinin
58b09ee1b1 [WIP] Implement proper serialization of Linear8bitLt 2023-02-21 12:04:47 +01:00
Mitchell Wortsman
7b764d3569 adding half() cast 2023-02-21 03:53:44 +00:00
Tim Dettmers
2489d819c5 Added more blocksizes for stochastic rounding; fixed dequant blocksize. 2023-02-14 13:55:17 -08:00
Tim Dettmers
c93a90d075 Fixed FP4 import and data type conversion in backward. 2023-02-14 13:31:39 -08:00
Tim Dettmers
2dfa3ce16d Fixed LinearFP8 and added tests. 2023-02-13 17:48:52 -08:00
Tim Dettmers
fa255cbc56 Added missing import. 2023-02-13 17:29:39 -08:00
Tim Dettmers
ca3236587a Added forward/backward tests; removed bias. 2023-02-13 17:20:52 -08:00
Tim Dettmers
6bdb6c351e Added fp8 simulation layer. 2023-02-13 16:53:07 -08:00
Tim Dettmers
7f0773aede Added backprop test for Linear8bitLt and LinearFP4. 2023-02-05 06:49:54 -08:00
Tim Dettmers
c0c352b379 Added bias test for LinearFP4 and basic test. 2023-02-05 06:29:52 -08:00
Tim Dettmers
c361f84239 Fixed matmul_fp4 transpose. 2023-02-05 06:16:56 -08:00
Tim Dettmers
cfe4705e32 Added matmul_fp4 to the benchmark. 2023-02-04 22:00:04 -08:00
Tim Dettmers
13c0a4dc5d Backward matmul_fp4 passes. 2023-02-04 21:35:43 -08:00