bitsandbytes-rocm

Author	SHA1	Message	Date
Tim Dettmers	7c651012fc	Added better error message for debugging on CUDA not detected failures.	2023-04-12 07:56:52 -07:00
Tim Dettmers	659a7dfc71	Fixing #300 .	2023-04-11 16:14:29 -07:00
Tim Dettmers	eb1c331c84	Updates README and CHANGELOG.	2023-04-11 15:49:01 -07:00
Tim Dettmers	89e3b82731	Added more detailed cuda setup debug and debugging instructions.	2023-04-11 13:47:10 -07:00
Tim Dettmers	4cd63deff3	Fixed CUDA Conda PyTorch 2.0 issues.	2023-04-11 12:10:20 -07:00
Tim Dettmers	2bb5c00ba9	Added pre/post call to all lib calls. Fixes #120	2023-04-11 09:36:56 -07:00
Tim Dettmers	29ab3a6b14	Updated change log.	2023-04-11 09:26:52 -07:00
Tim Dettmers	2eb3108356	Fixed bug where beta2 was not passed into Lion 32-bit.	2023-04-11 09:16:01 -07:00
Tim Dettmers	792af5c883	Fixed noisy tests for 8-bit Lion.	2023-04-11 08:42:41 -07:00
Tim Dettmers	0b2ebcdab9	Added launch bounds to fix launch resource error for Lion.	2023-04-11 08:37:02 -07:00
Tim Dettmers	ed6f3eb146	Merge pull request #159 from TimDettmers/serialize_8bit Implement proper serialization of Linear8bitLt	2023-04-11 07:24:51 -07:00
Tim Dettmers	b0ec20c3b3	Merge pull request #188 from lucidrains/main Lion 8 bit	2023-04-11 07:22:45 -07:00
Tim Dettmers	d3e0e39def	Merge pull request #190 from svgsponer/Fix#157 Fix #157; Add XDG_GREETER_DATA_DIR to ignorelist	2023-04-11 07:20:16 -07:00
Tim Dettmers	c7875533ce	Merge pull request #213 from tonylins/dev/fix_no_absmax Gix a bug in (de)quantize_no_absmax with multiple GPUs	2023-04-11 07:18:24 -07:00
Tim Dettmers	6b4c5afe21	Merge pull request #260 from rapsealk/fix_libsbitsandbytes_cpu_so Fixed typo libsbitsandbytes_cpu.so	2023-04-11 07:15:42 -07:00
Tim Dettmers	72efa32962	Merge pull request #292 from justheuristic/patch-2 Support nvidia16 GPUs	2023-04-11 07:14:12 -07:00
justheuristic	5e456be50e	Support 1650, 1660	2023-04-10 21:26:52 +03:00
Mitchell Wortsman	d677a71607	typo	2023-04-08 19:36:17 +00:00
Mitchell Wortsman	da524d97c9	mem efficient"	2023-04-08 19:34:18 +00:00
Tim Dettmers	e9fa03b717	Some fixed for loading PEFT modules with Params4bit.	2023-04-07 09:59:21 -07:00
Jeongseok Kang	8cceff72db	Fixed typo libsbitsandbytes_cpu.so	2023-04-05 09:28:41 +09:00
Tim Dettmers	1ccb7bdec6	Fixed ParamsIn4 init; fixed PyTorch 2.0 test failure.	2023-04-03 18:47:00 -07:00
Tim Dettmers	4ea489d3bf	Refactor FP4 into 4Bit and integrate NF4 data type.	2023-04-03 11:00:12 -07:00
Tim Dettmers	64cc05920d	First draft of NF4.	2023-04-02 16:10:35 -07:00
Tim Dettmers	4ad999d144	Added quantization tree generation.	2023-04-02 14:42:45 -07:00
Tim Dettmers	0d332a641f	Added normal with extra value.	2023-04-02 14:09:08 -07:00
Tim Dettmers	2dd5d69056	Generalized FP4 data type.	2023-04-02 12:42:01 -07:00
Mitchell Wortsman	eb6c53cf55	clarify in readme	2023-04-01 23:50:12 +00:00
Tim Dettmers	51a21df728	Added 8-bit compression to quantization statistics.	2023-04-01 16:10:18 -07:00
Mitchell Wortsman	2331212b35	add readme for speed bench	2023-04-01 19:13:15 +00:00
Mitchell Wortsman	7f87ba83ee	cleaning and refactor	2023-04-01 18:46:04 +00:00
Tim Dettmers	c4cfe4fbdd	Added bf16 Adam.	2023-04-01 10:33:03 -07:00
Tim Dettmers	30d21d585c	Added triton test.	2023-03-31 11:33:26 -07:00
Tim Dettmers	a13a522c4c	Added first triton test.	2023-03-31 11:20:54 -07:00
Tim Dettmers	8645d1f71c	Added normal quant.	2023-03-29 18:41:37 -07:00
Mitchell Wortsman	b373034e31	test	2023-03-29 19:04:53 +00:00
Mitchell Wortsman	5f3d9ada8d	triton-v1	2023-03-29 06:47:08 +00:00
Tim Dettmers	69810521d3	Some small changes.	2023-03-27 09:12:57 -07:00
Mitchell Wortsman	51f8bb7133	pre-triton update	2023-03-24 05:44:42 +00:00
Ji Lin	b6383ba116	fix a bug in quantize_no_absmax and dequantize_no_absmax with multiple gpus	2023-03-22 22:14:57 -04:00
Phil Wang	2a6828e6fb	fix comment	2023-03-22 09:56:50 -07:00
Phil Wang	978ba2db57	another tab/spaces fix	2023-03-22 09:33:47 -07:00
Phil Wang	916000c8bf	fix consistent tabs / spaces	2023-03-22 09:27:13 -07:00
Phil Wang	aa9b939edd	add some comments, and fix use of g_val	2023-03-22 09:22:19 -07:00
Phil Wang	a43cd2008d	add some code in test_optim.py, although it seems to be failing	2023-03-22 09:14:05 -07:00
Phil Wang	9b656f461a	follow advice of Tim to fix update of momentum vs parameters in blockwise 8 bit	2023-03-22 07:52:59 -07:00
Max Ryabinin	dcecbb26ca	Add force_no_igemmlt to test params	2023-03-22 00:28:49 +01:00
Tim Dettmers	49a04253fb	Bumped version for CUDA 12.1 support release.	2023-03-21 15:10:19 -07:00
Tim Dettmers	d032618d7f	Merge pull request #180 from ubik2/patch-1 Update compile_from_source.md to mention cuda12x target	2023-03-21 14:08:32 -07:00
Tim Dettmers	1b0aabc7e4	Added CUDA 12.1. addressing #201	2023-03-21 14:06:08 -07:00

403 Commits