Commit Graph

95 Commits

Author SHA1 Message Date
arlo-phoenix
40361ecfbb Adapt python to work with HIP 2023-08-05 02:12:48 +02:00
Tim Dettmers
6689afaec4 Merge pull request #567 from apbard/patch-1
[BugFix] replace view+continuous with reshape
2023-07-13 21:45:00 -07:00
Tim Dettmers
90b0ac57b0 Fixed missing bias in bnb.matmul_4bit for inference; more tests. 2023-07-11 17:13:33 -07:00
Tim Dettmers
ba51d95d43 Added more extensive gemv tests; blocksize guard for gemv. 2023-07-11 05:55:49 -07:00
Tim Dettmers
5f492d437e Merge remote-tracking branch 'origin/inference' 2023-07-10 06:24:24 -07:00
Tim Dettmers
94168d79d7 Added FP4 fast inference support. 2023-07-09 14:46:19 -07:00
Tim Dettmers
4b88d69de7 Added abitrary data types; fixed a bug for small matrices. 2023-07-09 12:04:09 -07:00
Alessandro Pietro Bardelli
463630dc73 [BugFix] replace view+continuous with reshape 2023-07-06 12:26:03 +02:00
Tim Dettmers
02fd80cb81 Added bfloat16 quantizations and tests. 2023-07-04 19:58:31 -07:00
Max Ryabinin
4fb37d45c1 Extract get_tile_inds to a separate function 2023-06-09 21:39:37 +02:00
Tim Dettmers
4bd1151829 Fixed gradient accumulation test. 2023-05-07 15:06:17 -07:00
Tim Dettmers
675baa79d2 Merge remote-tracking branch 'origin/main' into merge 2023-05-07 13:34:03 -07:00
Tim Dettmers
7140c01405 Merge branch 'main' into fp8_merge 2023-04-12 11:44:39 -07:00
Tim Dettmers
ec1ea63711 Refactored triton into its own folder. Refactored fp8 matmuls. 2023-04-12 09:39:39 -07:00
Tim Dettmers
ed6f3eb146 Merge pull request #159 from TimDettmers/serialize_8bit
Implement proper serialization of Linear8bitLt
2023-04-11 07:24:51 -07:00
justheuristic
5e456be50e Support 1650, 1660 2023-04-10 21:26:52 +03:00
Tim Dettmers
4ea489d3bf Refactor FP4 into 4Bit and integrate NF4 data type. 2023-04-03 11:00:12 -07:00
Mitchell Wortsman
51f8bb7133 pre-triton update 2023-03-24 05:44:42 +00:00
Max Ryabinin
d15822a54b Refactor _tile_indices into a cached property, fix device bug 2023-02-25 06:23:07 +01:00
Tim Dettmers
9851a10b46 Added cast to fp4 layer for speed. 2023-02-24 10:17:57 -08:00
Tim Dettmers
5d2e23e8d6 Merge branch 'fp8sim' of github.com:TimDettmers/bitsandbytes into fp8sim 2023-02-23 10:56:49 -08:00
Tim Dettmers
c5c38ca19c Added matmul_mixed. 2023-02-23 10:45:18 -08:00
Mitchell Wortsman
3fbf60ad83 sim now worse than real 2023-02-23 08:27:15 +00:00
Mitchell Wortsman
7b764d3569 adding half() cast 2023-02-21 03:53:44 +00:00
Tim Dettmers
2489d819c5 Added more blocksizes for stochastic rounding; fixed dequant blocksize. 2023-02-14 13:55:17 -08:00
Tim Dettmers
c93a90d075 Fixed FP4 import and data type conversion in backward. 2023-02-14 13:31:39 -08:00
Tim Dettmers
ca3236587a Added forward/backward tests; removed bias. 2023-02-13 17:20:52 -08:00
Tim Dettmers
6bdb6c351e Added fp8 simulation layer. 2023-02-13 16:53:07 -08:00
Tim Dettmers
c361f84239 Fixed matmul_fp4 transpose. 2023-02-05 06:16:56 -08:00
Tim Dettmers
cfe4705e32 Added matmul_fp4 to the benchmark. 2023-02-04 22:00:04 -08:00
Tim Dettmers
13c0a4dc5d Backward matmul_fp4 passes. 2023-02-04 21:35:43 -08:00
Tim Dettmers
160a83580d Forward matmul_fp4 tests pass. 2023-02-04 21:11:21 -08:00
Tim Dettmers
de53588934 Added Int8 matmul support for all GPUs. Full backward support. 2023-02-01 20:09:31 -08:00
Tom Aarsen
697bd02c60 Resolve dangerous default value [] as argument 2022-10-27 13:25:51 +02:00
Tom Aarsen
7a3c9af05d Sort imports
Via isort
2022-10-27 13:15:21 +02:00
Tom Aarsen
0b078403ee Simplify statements into equivalent, modern variants
via pyupgrade --py37-plus. The changes e.g. are subclassing from object, calling super() with super(ThisClass, self), or old-style syntax formatting.
2022-10-27 13:14:13 +02:00
Tom Aarsen
1eec77d34c Remove trailing whitespace & ensure newline at EOF 2022-10-27 13:11:29 +02:00
Tim Dettmers
9b7d307b8c review 2022-09-20 06:36:32 +03:00
justheuristic
5d65817101 debug 2022-09-18 01:09:24 +03:00
justheuristic
4da2227fcb debug 2022-09-18 01:03:21 +03:00
justheuristic
4b4a9effd1 debugprint 2022-09-18 01:02:13 +03:00
justheuristic
7906dc4c9a debugpritn 2022-09-18 00:57:26 +03:00
justheuristic
591f60395a add memory efficient backward 2022-09-18 00:52:53 +03:00
justheuristic
579b8c782f reduce diff 2022-09-18 00:47:58 +03:00
justheuristic
76ece2c126 rollback 2022-09-18 00:43:56 +03:00
justheuristic
18f142e268 addmm_ 2022-09-18 00:43:02 +03:00
justheuristic
ab9dee062d cast edge case 2022-09-18 00:36:46 +03:00
justheuristic
cbfdf0b5ef cast edge case 2022-09-18 00:35:42 +03:00
justheuristic
e35e2c665a cast properly 2022-09-18 00:35:03 +03:00
justheuristic
577275bd8c cast properly 2022-09-18 00:30:57 +03:00