bitsandbytes-rocm


arlo-phoenix 0b481bfcc2 Use workaround for ROCm wave32 recognition: forcibly sets __AMDGCN_WAVEFRONT_SIZE to 32. This is not correct in general (some GPUs don't support wave32), but it works on the supported GPUs. Can be disabled with DISABLE_WARP_32. With this, blockwise quantization works, and with that nf4 is supported.	2023-08-08 18:50:26 +00:00
common.cpp	Fixed 2^31 max size issue for cpu blockwise quant.	2022-09-11 11:55:09 -07:00
common.h	Fixed 2^31 max size issue for cpu blockwise quant.	2022-09-11 11:55:09 -07:00
cpu_ops.cpp	Remove trailing whitespace & ensure newline at EOF	2022-10-27 13:11:29 +02:00
cpu_ops.h	Fixed 2^31 max size issue for cpu blockwise quant.	2022-09-11 11:55:09 -07:00
kernels.cu	Use workaround for ROCm wave32 recognition	2023-08-08 18:50:26 +00:00
kernels.cuh	Added fp32 compute type for gemv_4bit.	2023-07-09 21:06:01 -07:00
ops.cu	Use workaround for ROCm wave32 recognition	2023-08-08 18:50:26 +00:00
ops.cuh	Use workaround for ROCm wave32 recognition	2023-08-08 18:50:26 +00:00
pythonInterface.c	Add HIP to cuda defines	2023-08-05 02:11:46 +02:00