Commit Graph

71 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Tim Dettmers | 412fd0e717 | Added better default compute_dtype handling for Linear4bit layers. | 2023-07-22 12:56:29 -07:00 |
| Tim Dettmers | 67475257a9 | Added documentation for NF4; failing 8-bit matmul; fixed absmax bug. #529 #543 | 2023-07-13 21:41:43 -07:00 |
| Tim Dettmers | 5f492d437e | Merge remote-tracking branch 'origin/inference' | 2023-07-10 06:24:24 -07:00 |
| Tim Dettmers | 196d6f5dc1 | Merge pull request #469 from shadeMe/linear-layer-device: Add `device` parameter to `Linear` subclasses and `Embedding` | 2023-07-10 06:17:13 -07:00 |
| Tim Dettmers | cef519c89e | Added test for Param4bit.to() and fixed double quant behavior. | 2023-07-09 17:16:50 -07:00 |
| Max Ryabinin | b599fdb197 | Only rearrange weight if it exists | 2023-06-14 19:27:13 +02:00 |
| Max Ryabinin | c1f3f56d2c | Rearrange the weights directly in state dict before loading | 2023-06-09 21:58:39 +02:00 |
| Max Ryabinin | f734076e94 | Improve memory efficiency of 8-bit serialization | 2023-06-09 21:39:57 +02:00 |
| shadeMe | db49ad43ab | Add device parameter to Embedding | 2023-06-01 17:43:49 +02:00 |
| shadeMe | 9cac5dd1b6 | Add device parameter to Linear subclasses | 2023-06-01 17:43:30 +02:00 |
| Tim Dettmers | 675baa79d2 | Merge remote-tracking branch 'origin/main' into merge | 2023-05-07 13:34:03 -07:00 |
| Tim Dettmers | 84964db937 | CUTLASS compiles. | 2023-04-25 17:15:51 -07:00 |
| Tim Dettmers | 7dc198feb7 | Added 32-bit optimizer for bfloat16 gradients. | 2023-04-17 18:01:49 -07:00 |
| Tim Dettmers | 9e7cdc9ea9 | Added last SwitchBack refactors. All tests green. | 2023-04-12 13:41:30 -07:00 |
| Tim Dettmers | b8ea2b416d | Fixed bias conversion in Linear4bit | 2023-04-12 12:28:35 -07:00 |
| Tim Dettmers | 5b612bc6df | Added is_available_triton guard to Triton SwitchBackLinear. | 2023-04-12 12:16:55 -07:00 |
| Tim Dettmers | 7140c01405 | Merge branch 'main' into fp8_merge | 2023-04-12 11:44:39 -07:00 |
| Tim Dettmers | dd562c24f1 | Refactored simulated fp8 modules into research.nn. | 2023-04-12 11:24:44 -07:00 |
| Tim Dettmers | ec1ea63711 | Refactored triton into its own folder. Refactored fp8 matmuls. | 2023-04-12 09:39:39 -07:00 |
| Mitchell Wortsman | d677a71607 | typo | 2023-04-08 19:36:17 +00:00 |
| Mitchell Wortsman | da524d97c9 | mem efficient" | 2023-04-08 19:34:18 +00:00 |
| Tim Dettmers | e9fa03b717 | Some fixed for loading PEFT modules with Params4bit. | 2023-04-07 09:59:21 -07:00 |
| Tim Dettmers | 1ccb7bdec6 | Fixed ParamsIn4 init; fixed PyTorch 2.0 test failure. | 2023-04-03 18:47:00 -07:00 |
| Tim Dettmers | 4ea489d3bf | Refactor FP4 into 4Bit and integrate NF4 data type. | 2023-04-03 11:00:12 -07:00 |
| Tim Dettmers | 51a21df728 | Added 8-bit compression to quantization statistics. | 2023-04-01 16:10:18 -07:00 |
| Mitchell Wortsman | 7f87ba83ee | cleaning and refactor | 2023-04-01 18:46:04 +00:00 |
| Tim Dettmers | a13a522c4c | Added first triton test. | 2023-03-31 11:20:54 -07:00 |
| Mitchell Wortsman | 5f3d9ada8d | triton-v1 | 2023-03-29 06:47:08 +00:00 |
| Tim Dettmers | 69810521d3 | Some small changes. | 2023-03-27 09:12:57 -07:00 |
| Mitchell Wortsman | 51f8bb7133 | pre-triton update | 2023-03-24 05:44:42 +00:00 |
| Artidoro Pagnoni | 6c31a5fe99 | t5 model fix | 2023-02-27 14:23:21 -08:00 |
| Max Ryabinin | 24609b66af | Reduce diff | 2023-02-25 06:24:58 +01:00 |
| Max Ryabinin | d15822a54b | Refactor _tile_indices into a cached property, fix device bug | 2023-02-25 06:23:07 +01:00 |
| Max Ryabinin | cc608c04c2 | Revert the layout if weights were reordered | 2023-02-25 06:02:06 +01:00 |
| Max Ryabinin | cd4d904a4c | Raise an error when loading a quantized checkpoint before quantization | 2023-02-25 06:01:34 +01:00 |
| Tim Dettmers | 9851a10b46 | Added cast to fp4 layer for speed. | 2023-02-24 10:17:57 -08:00 |
| Mitchell Wortsman | 75377d125e | new experiments | 2023-02-24 00:10:15 +00:00 |
| Mitchell Wortsman | 3fbf60ad83 | sim now worse than real | 2023-02-23 08:27:15 +00:00 |
| Max Ryabinin | 58b09ee1b1 | [WIP] Implement proper serialization of Linear8bitLt | 2023-02-21 12:04:47 +01:00 |
| Mitchell Wortsman | 7b764d3569 | adding half() cast | 2023-02-21 03:53:44 +00:00 |
| Tim Dettmers | c93a90d075 | Fixed FP4 import and data type conversion in backward. | 2023-02-14 13:31:39 -08:00 |
| Tim Dettmers | 2dfa3ce16d | Fixed LinearFP8 and added tests. | 2023-02-13 17:48:52 -08:00 |
| Tim Dettmers | fa255cbc56 | Added missing import. | 2023-02-13 17:29:39 -08:00 |
| Tim Dettmers | ca3236587a | Added forward/backward tests; removed bias. | 2023-02-13 17:20:52 -08:00 |
| Tim Dettmers | 6bdb6c351e | Added fp8 simulation layer. | 2023-02-13 16:53:07 -08:00 |
| Tim Dettmers | c0c352b379 | Added bias test for LinearFP4 and basic test. | 2023-02-05 06:29:52 -08:00 |
| Tim Dettmers | 160a83580d | Forward matmul_fp4 tests pass. | 2023-02-04 21:11:21 -08:00 |
| Tim Dettmers | de53588934 | Added Int8 matmul support for all GPUs. Full backward support. | 2023-02-01 20:09:31 -08:00 |
| Tim Dettmers | c9f505064e | Added outlier detector and fake quantization layer. | 2023-01-28 17:05:22 -08:00 |
| Tim Dettmers | 9d353ca786 | Merge pull request #87 from lostmsu/main: Add `device` and `dtype` parameters to `StableEmbedding` | 2023-01-02 13:22:45 +01:00 |