Author | Commit | Message | Date
Tim Dettmers | de53588934 | Added Int8 matmul support for all GPUs. Full backward support. | 2023-02-01 20:09:31 -08:00
Tim Dettmers | 336e24696c | CUDASetup only executed once + fixed circular import. | 2023-01-02 03:31:43 -08:00
Tim Dettmers | c91f592ad7 | Merge branch 'main' into cleanup | 2023-01-02 11:19:16 +01:00
Tim Dettmers | eb028e6ebc | Fixed k-bit quantization maps. | 2022-11-19 07:24:03 -08:00
Tom Aarsen | b104ce3b62 | Merge branch 'main' into cleanup | 2022-11-17 15:22:29 +01:00
Tim Dettmers | 08fa2e7b01 | Fixed bug in cpu quant; faster GPU dequant. | 2022-11-07 18:06:18 -08:00
Tim Dettmers | e0e697b150 | Fixed blockwise test and logic. | 2022-11-06 16:36:31 -08:00
Tim Dettmers | 6bc2b992be | Added blocksizes 2048, 1024, and 512 to blockwise quant. | 2022-11-06 16:27:48 -08:00
Tim Dettmers | 2f2063bac2 | Added k<256 quantile estimate. | 2022-11-06 13:05:25 -08:00
Tim Dettmers | 98cbc4bc4f | Added k-bit fp8 map. | 2022-11-06 11:59:37 -08:00
Tim Dettmers | caf1832526 | Added k-bit linear quantization. | 2022-11-06 11:47:54 -08:00
Tim Dettmers | 1efb87d89d | Added FP8 quantization map. | 2022-11-03 19:49:50 -07:00
Tom Aarsen | 7a3c9af05d | Sort imports via isort | 2022-10-27 13:15:21 +02:00
Tom Aarsen | 0b078403ee | Simplify statements into equivalent, modern variants via pyupgrade --py37-plus (e.g., subclassing from object, super(ThisClass, self) calls, old-style syntax formatting) | 2022-10-27 13:14:13 +02:00
Tom Aarsen | 1eec77d34c | Remove trailing whitespace & ensure newline at EOF | 2022-10-27 13:11:29 +02:00
Tim Dettmers | a371be302d | Added CUDA SETUP instruction generator. | 2022-10-25 08:01:19 -07:00
Tim Dettmers | df86625a93 | Isolated CUDASetup logging; all tests green. | 2022-10-24 11:54:25 -07:00
justheuristic | 76ce9aa6da | try fp32 | 2022-09-20 06:51:25 +03:00
Tim Dettmers | 292a478716 | set threshold | 2022-09-20 06:42:05 +03:00
justheuristic | a07825ac31 | review | 2022-09-20 06:40:36 +03:00
justheuristic | cff3a71599 | cast device | 2022-09-18 01:26:25 +03:00
justheuristic | 32a9a88f98 | cast device | 2022-09-18 01:26:12 +03:00
justheuristic | 01b4c6a048 | cast device | 2022-09-18 01:25:56 +03:00
justheuristic | e4086a2758 | cast device | 2022-09-18 01:24:57 +03:00
justheuristic | 725cc72993 | cast device | 2022-09-18 01:24:44 +03:00
justheuristic | 28a9313ddc | cast before allclose | 2022-09-18 01:24:27 +03:00
justheuristic | 95dafc6475 | cast before allclose | 2022-09-18 01:22:31 +03:00
justheuristic | 37f805bb44 | debug | 2022-09-18 01:21:12 +03:00
justheuristic | 6a826c41a6 | pre-cast | 2022-09-18 01:20:34 +03:00
justheuristic | d9b8789818 | debug | 2022-09-18 01:13:58 +03:00
justheuristic | 2cd047e35d | run backward | 2022-09-18 00:55:53 +03:00
justheuristic | 591f60395a | add memory efficient backward | 2022-09-18 00:52:53 +03:00
justheuristic | f6670329fb | bump threshold to 0.21 | 2022-09-18 00:42:23 +03:00
justheuristic | fa8e07c7c5 | more lenient threshold | 2022-09-18 00:38:02 +03:00
justheuristic | e35e2c665a | cast properly | 2022-09-18 00:35:03 +03:00
justheuristic | d9ca0ed905 | un-fuse bias | 2022-09-17 23:44:28 +03:00
justheuristic | 7facedda38 | copypaste tolerances | 2022-09-17 23:41:40 +03:00
justheuristic | e29c5f5c41 | clearer assertions | 2022-09-17 23:22:04 +03:00
justheuristic | 9379df85d2 | check dtypes first | 2022-09-17 23:13:23 +03:00
justheuristic | 140cdbe876 | check dtypes first | 2022-09-17 23:12:58 +03:00
justheuristic | a9c7953e0a | cast to half before double_quant | 2022-09-17 23:10:21 +03:00
justheuristic | 469d5a631d | test_bf16 | 2022-09-17 23:06:57 +03:00
Tim Dettmers | c05dd42ddd | Fixed cpu blockwise quantization for small input tensors. | 2022-09-13 10:37:53 -07:00
Tim Dettmers | 19a7adca7a | Fixed 2^31 max size issue for cpu blockwise quant. | 2022-09-11 11:55:09 -07:00
Tim Dettmers | 7e0fb655e1 | Some initial code. Needs to be tested. | 2022-08-23 13:59:34 -07:00
Tim Dettmers | 9d60b3c527 | Fixed bug in Linear8bitLt when the bias is None. | 2022-08-17 03:45:57 -07:00
Tim Dettmers | de354f7ded | Added fused bias to matmullt. | 2022-08-16 12:00:54 -07:00
Tim Dettmers | dede343033 | Added fused bias in dequant_mm. | 2022-08-16 11:12:09 -07:00
Tim Dettmers | 1ed2fa2f21 | Removed storage() from get_ptr; added boilerplate for bias dequant_mm. | 2022-08-16 10:56:17 -07:00
Tim Dettmers | c472bd56f0 | Added the case that all env variables are empty (CUDA docker). | 2022-08-05 08:57:52 -07:00