arlo-phoenix
e38b9e91b7
Revert get_cuda_version ROCm version change
...
The function is not called anymore.
2023-08-08 21:31:20 +02:00
arlo-phoenix
c97c78bd66
Update README ROCm quickstart
2023-08-08 21:28:37 +02:00
arlo-phoenix
0b481bfcc2
Use workaround for ROCm wave32 recognition
...
Just sets __AMDGCN_WAVEFRONT_SIZE forcefully to 32.
Not correct (some GPUs don't support wave32), but it works
on the supported GPUs. Can be disabled with DISABLE_WARP_32.
With this, blockwise quantize works, and with that NF4 is supported.
2023-08-08 18:50:26 +00:00
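A minimal sketch of what the workaround described above might look like, assuming it is applied as a preprocessor override; __AMDGCN_WAVEFRONT_SIZE and DISABLE_WARP_32 come from the commit message, everything else is an assumption:

```cpp
// Hypothetical sketch of the wave32 workaround: force the wavefront-size
// macro to 32 unless the user opts out via DISABLE_WARP_32.
#if defined(__HIP_PLATFORM_AMD__) && !defined(DISABLE_WARP_32)
// Not strictly correct on GPUs without wave32 support, but it lets the
// wave32 code paths (blockwise quantize, and with it NF4) build and run
// on the GPUs that do support it.
#undef __AMDGCN_WAVEFRONT_SIZE
#define __AMDGCN_WAVEFRONT_SIZE 32
#endif
```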
arlo-phoenix
615d47583f
README: Add quickstart and info section
2023-08-05 02:42:13 +02:00
arlo-phoenix
705bc024d2
Makefile: Add make hip
2023-08-05 02:41:58 +02:00
arlo-phoenix
40361ecfbb
Adapt Python to work with HIP
2023-08-05 02:12:48 +02:00
arlo-phoenix
3682106eb0
Algo-Direct2.h: fix hipcc issue
...
from https://github.com/agrocylo/bitsandbytes-rocm, thanks
2023-08-05 02:12:14 +02:00
arlo-phoenix
d10197bc93
Add HIP to CUDA defines
...
Collected by hipifying all files and then comparing them with the
original CUDA files.
2023-08-05 02:11:46 +02:00
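A hypothetical excerpt of the kind of define table a hipify-and-diff pass like this yields; the names below are standard CUDA-to-HIP equivalents, but the exact contents of the commit's header are an assumption:

```cpp
#include <hip/hip_runtime.h>

// Map CUDA runtime names onto their HIP counterparts so the existing
// CUDA sources compile unchanged under hipcc.
#define cudaStream_t          hipStream_t
#define cudaError_t           hipError_t
#define cudaSuccess           hipSuccess
#define cudaMalloc            hipMalloc
#define cudaFree              hipFree
#define cudaMemcpy            hipMemcpy
#define cudaMemcpyAsync       hipMemcpyAsync
#define cudaPeekAtLastError   hipPeekAtLastError
#define cudaDeviceSynchronize hipDeviceSynchronize
```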
Tim Dettmers
18e827d666
Version 0.41.1.
2023-08-03 20:01:10 -07:00
Tim Dettmers
3c9aca9124
Fixed two bugs in dynamic data type creation.
2023-08-03 19:47:15 -07:00
Tim Dettmers
a06a0f6a08
Bumped version for new release.
2023-07-22 13:07:08 -07:00
Tim Dettmers
412fd0e717
Added better default compute_dtype handling for Linear4bit layers.
2023-07-22 12:56:29 -07:00
Tim Dettmers
c82f51c0f7
Increased occupancy.
2023-07-19 16:08:37 -07:00
Tim Dettmers
e229fbce66
Added latest changes.
2023-07-16 21:23:57 -07:00
Tim Dettmers
7be5f2c7b3
Guard for prefetchAsync GPU capability. #470 #451 #477
2023-07-16 21:12:03 -07:00
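A minimal sketch of the kind of capability guard this commit implies: skip cudaMemPrefetchAsync on devices without concurrent managed access. The attribute check is an assumption; the commit may gate on something else:

```cpp
#include <cuda_runtime.h>

// Prefetch managed memory only where the device supports it; on older
// GPUs the call is skipped, since prefetching is an optimization rather
// than a correctness requirement.
void guarded_prefetch(const void *ptr, size_t bytes, int device, cudaStream_t stream) {
  int supported = 0;
  cudaDeviceGetAttribute(&supported, cudaDevAttrConcurrentManagedAccess, device);
  if (supported)
    cudaMemPrefetchAsync(ptr, bytes, device, stream);
}
```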
Tim Dettmers
f3232d1391
Fixed bug where read permission was assumed for a file. #497
2023-07-16 21:08:13 -07:00
Tim Dettmers
37c25c1e0d
Merge branch 'main' of github.com:TimDettmers/bitsandbytes into main
2023-07-15 10:22:45 -07:00
Tim Dettmers
f4996978db
Added missing check if LD_LIBRARY_PATH exists. #588
2023-07-15 10:22:08 -07:00
Tim Dettmers
6102029ab9
Merge pull request #587 from BramVanroy/patch-1
...
Replace private repo URL with public HTTPS repo URL
2023-07-15 10:04:34 -07:00
Tim Dettmers
67a3cdf652
Merge pull request #595 from ihsanturk/FIX-__main__.py-REFERENCE-TO-NONEXISTENT-get_cuda_lib_handle
...
Fix import crash caused by __main__.py reference to nonexistent cuda_setup.main.get_cuda_lib_handle
2023-07-15 10:04:15 -07:00
ihsanturk
ce126d462d
deleted references to get_cuda_lib_handle
2023-07-15 02:49:57 -07:00
ihsanturk
2f0f0e5dba
get_cuda_lib_handle brought back so import works
2023-07-15 02:24:46 -07:00
Tim Dettmers
6ec4f0c374
Changed CUDA_INSTALL variable to BNB_CUDA_INSTALL.
2023-07-14 18:16:45 -07:00
Tim Dettmers
8cdec888b1
Merge pull request #593 from bilelomrani1/main
...
Fix bitsandbytes import error when CUDA is unavailable
2023-07-14 17:47:48 -07:00
Bilel Omrani
35dbb1ff52
Fix bitsandbytes import error when CUDA is unavailable
2023-07-15 02:04:26 +02:00
Tim Dettmers
486488bccb
Bumped version.
2023-07-14 12:55:57 -07:00
Tim Dettmers
6c6e5fcb53
Added changelog entry.
2023-07-14 12:55:04 -07:00
Tim Dettmers
55f4c398a0
Polished CUDA SETUP replacement and added docs.
2023-07-14 12:50:59 -07:00
Tim Dettmers
1ab6758b36
Changed CUDA setup to use PyTorch default; added a weak test.
2023-07-13 23:58:41 -07:00
Tim Dettmers
ac155f7415
Merge branch 'main' into bugfixes
2023-07-13 21:55:35 -07:00
Tim Dettmers
e8df8d64a2
Merge pull request #375 from rapsealk/fix/libcuda-to-torch
...
Replace libcudart.so with PyTorch's CUDA APIs
2023-07-13 21:54:47 -07:00
Tim Dettmers
c00402f17e
Fixed a bug in absmax float conversion.
2023-07-13 21:47:38 -07:00
Tim Dettmers
6689afaec4
Merge pull request #567 from apbard/patch-1
...
[BugFix] Replace view+contiguous with reshape
2023-07-13 21:45:00 -07:00
Tim Dettmers
67475257a9
Added documentation for NF4; failing 8-bit matmul; fixed absmax bug. #529 #543
2023-07-13 21:41:43 -07:00
Tim Dettmers
8a20cd864b
Added missing scipy requirement. Addressing #544
2023-07-13 21:25:07 -07:00
Tim Dettmers
097b1cc5da
Fixed bug caused by undefined default type of absmax. #553
2023-07-13 21:23:33 -07:00
Tim Dettmers
7b6cfe1738
Added H100 support for CUDA 11.8 precompiled binaries.
2023-07-13 21:16:23 -07:00
Bram Vanroy
91c4fd844b
add public git repo URL
2023-07-14 00:51:05 +02:00
Tim Dettmers
817bdf6325
Bumped version after hotfix.
2023-07-11 17:16:05 -07:00
Tim Dettmers
90b0ac57b0
Fixed missing bias in bnb.matmul_4bit for inference; more tests.
2023-07-11 17:13:33 -07:00
Tim Dettmers
dc96e9e7c8
Test for BLOOM that fails with inference kernels.
2023-07-11 15:40:20 -07:00
Tim Dettmers
ae7cd6ad14
Bump version.
2023-07-11 05:58:25 -07:00
Tim Dettmers
ba51d95d43
Added more extensive gemv tests; blocksize guard for gemv.
2023-07-11 05:55:49 -07:00
Tim Dettmers
b8da4a165a
Bump version.
2023-07-10 16:40:22 -07:00
Tim Dettmers
a26a321e07
Removed debugging statement.
2023-07-10 14:34:19 -07:00
Tim Dettmers
306f6b2362
Fixed accidental deletion of limits in kernel.
2023-07-10 14:24:33 -07:00
Tim Dettmers
2221f4cee0
Fixed potential memory leak.
2023-07-10 13:57:44 -07:00
Tim Dettmers
490153b29f
Added generation tests.
2023-07-10 12:19:16 -07:00
Tim Dettmers
1c774ecebb
Added ARCH guard for bfloat16 computations.
2023-07-10 09:53:23 -07:00
Tim Dettmers
0a1cced375
Fixed typo in cuda_install.sh.
2023-07-10 06:40:19 -07:00