arlo-phoenix | d10197bc93 | Add HIP to CUDA defines (collected by hipifying all files and then comparing with the original CUDA file) | 2023-08-05 02:11:46 +02:00
Tim Dettmers | 18e827d666 | Version 0.41.1. | 2023-08-03 20:01:10 -07:00
Tim Dettmers | 3c9aca9124 | Fixed two bugs in dynamic data type creation. | 2023-08-03 19:47:15 -07:00
Tim Dettmers | a06a0f6a08 | Bumped version for new release. | 2023-07-22 13:07:08 -07:00
Tim Dettmers | 412fd0e717 | Added better default compute_dtype handling for Linear4bit layers. | 2023-07-22 12:56:29 -07:00
Tim Dettmers | c82f51c0f7 | Increased occupancy. | 2023-07-19 16:08:37 -07:00
Tim Dettmers | e229fbce66 | Added latest changes. | 2023-07-16 21:23:57 -07:00
Tim Dettmers | 7be5f2c7b3 | Guard for prefetchAsync GPU capability. #470 #451 #477 | 2023-07-16 21:12:03 -07:00
Tim Dettmers | f3232d1391 | Fixed bug where read permission was assumed for a file. #497 | 2023-07-16 21:08:13 -07:00
Tim Dettmers | 37c25c1e0d | Merge branch 'main' of github.com:TimDettmers/bitsandbytes into main | 2023-07-15 10:22:45 -07:00
Tim Dettmers | f4996978db | Added missing check if LD_LIBRARY_PATH exists. #588 | 2023-07-15 10:22:08 -07:00
Tim Dettmers | 6102029ab9 | Merge pull request #587 from BramVanroy/patch-1: replace private with public https repo URL | 2023-07-15 10:04:34 -07:00
Tim Dettmers | 67a3cdf652 | Merge pull request #595 from ihsanturk/FIX-__main__.py-REFERENCE-TO-NONEXISTENT-get_cuda_lib_handle: Fix import crash caused by __main__.py reference to nonexistent cuda_setup.main.get_cuda_lib_handle | 2023-07-15 10:04:15 -07:00
ihsanturk | ce126d462d | Deleted references to get_cuda_lib_handle | 2023-07-15 02:49:57 -07:00
ihsanturk | 2f0f0e5dba | Brought back get_cuda_lib_handle so that import works | 2023-07-15 02:24:46 -07:00
Tim Dettmers | 6ec4f0c374 | Changed CUDA_INSTALL variable to BNB_CUDA_INSTALL. | 2023-07-14 18:16:45 -07:00
Tim Dettmers | 8cdec888b1 | Merge pull request #593 from bilelomrani1/main: Fix bitsandbytes import error when CUDA is unavailable | 2023-07-14 17:47:48 -07:00
Bilel Omrani | 35dbb1ff52 | Fix bitsandbytes import error when CUDA is unavailable | 2023-07-15 02:04:26 +02:00
Tim Dettmers | 486488bccb | Bumped version. | 2023-07-14 12:55:57 -07:00
Tim Dettmers | 6c6e5fcb53 | Added changelog entry. | 2023-07-14 12:55:04 -07:00
Tim Dettmers | 55f4c398a0 | Polished CUDA SETUP replacement and added docs. | 2023-07-14 12:50:59 -07:00
Tim Dettmers | 1ab6758b36 | Changed CUDA setup to use PyTorch default; added a weak test. | 2023-07-13 23:58:41 -07:00
Tim Dettmers | ac155f7415 | Merge branch 'main' into bugfixes | 2023-07-13 21:55:35 -07:00
Tim Dettmers | e8df8d64a2 | Merge pull request #375 from rapsealk/fix/libcuda-to-torch: Replace libcudart.so with PyTorch's CUDA APIs | 2023-07-13 21:54:47 -07:00
Tim Dettmers | c00402f17e | Fixed a bug in absmax float conversion. | 2023-07-13 21:47:38 -07:00
Tim Dettmers | 6689afaec4 | Merge pull request #567 from apbard/patch-1: [BugFix] replace view+contiguous with reshape | 2023-07-13 21:45:00 -07:00
Tim Dettmers | 67475257a9 | Added documentation for NF4; failing 8-bit matmul; fixed absmax bug. #529 #543 | 2023-07-13 21:41:43 -07:00
Tim Dettmers | 8a20cd864b | Added missing scipy requirement. Addressing #544 | 2023-07-13 21:25:07 -07:00
Tim Dettmers | 097b1cc5da | Fixed bug caused by undefined default type of absmax. #553 | 2023-07-13 21:23:33 -07:00
Tim Dettmers | 7b6cfe1738 | Added H100 support for CUDA 11.8 precompiled binaries. | 2023-07-13 21:16:23 -07:00
Bram Vanroy | 91c4fd844b | Add public git repo URL | 2023-07-14 00:51:05 +02:00
Tim Dettmers | 817bdf6325 | Bumped version after hotfix. | 2023-07-11 17:16:05 -07:00
Tim Dettmers | 90b0ac57b0 | Fixed missing bias in bnb.matmul_4bit for inference; more tests. | 2023-07-11 17:13:33 -07:00
Tim Dettmers | dc96e9e7c8 | Test for bloom that fails with inference kernels. | 2023-07-11 15:40:20 -07:00
Tim Dettmers | ae7cd6ad14 | Bump version. | 2023-07-11 05:58:25 -07:00
Tim Dettmers | ba51d95d43 | Added more extensive gemv tests; blocksize guard for gemv. | 2023-07-11 05:55:49 -07:00
Tim Dettmers | b8da4a165a | Bumped version. | 2023-07-10 16:40:22 -07:00
Tim Dettmers | a26a321e07 | Removed debugging statement. | 2023-07-10 14:34:19 -07:00
Tim Dettmers | 306f6b2362 | Fixed accidental deletion of limits in kernel. | 2023-07-10 14:24:33 -07:00
Tim Dettmers | 2221f4cee0 | Fixed potential memory leak. | 2023-07-10 13:57:44 -07:00
Tim Dettmers | 490153b29f | Added generation tests. | 2023-07-10 12:19:16 -07:00
Tim Dettmers | 1c774ecebb | Added ARCH guard for bfloat16 computations. | 2023-07-10 09:53:23 -07:00
Tim Dettmers | 0a1cced375 | Fixed typo in cuda_install.sh. | 2023-07-10 06:40:19 -07:00
Tim Dettmers | 0d344b70ba | Changelog and version bump. | 2023-07-10 06:38:57 -07:00
Tim Dettmers | 73aa4e0a33 | Fixed Makefile and added CUDA 12.2 install. | 2023-07-10 06:34:04 -07:00
Tim Dettmers | 5f492d437e | Merge remote-tracking branch 'origin/inference' | 2023-07-10 06:24:24 -07:00
Tim Dettmers | 196d6f5dc1 | Merge pull request #469 from shadeMe/linear-layer-device: Add `device` parameter to `Linear` subclasses and `Embedding` | 2023-07-10 06:17:13 -07:00
Tim Dettmers | 5fab673442 | Added fp32 compute type for gemv_4bit. | 2023-07-09 21:06:01 -07:00
Tim Dettmers | cef519c89e | Added test for Param4bit.to() and fixed double quant behavior. | 2023-07-09 17:16:50 -07:00
Tim Dettmers | 6a905be5ce | Fixed a bug where gemv_4bit would return a wrongly sized tensor. | 2023-07-09 15:34:02 -07:00