bitsandbytes just never works. #914
Comments
I am there with you. Same is happening with me. |
Can I bump this 1000 times? |
On Colab, `python -m bitsandbytes` prints: /usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: /usr/lib64-nvidia did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths... warn(msg) |
Same problem here.
|
In colab, I use:
Colab has CUDA version 12.2. I reinstalled version 11.8, so I had both versions. I used the installation suggested in the repository, and it didn't work for me. |
Yeah, I tried everything: installed different CUDA versions from scratch, and checked compatibility with PyTorch and with bitsandbytes. I also tried different environments for different repos. Nothing, it just never works. |
I have changed the path location of the libraries, and I have copied the libraries to the location that bitsandbytes requests. The only file I cannot find in Google Colab is libcudart.so (I do have libcudart.so.11 and libcudart.so.12). |
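To see which libcudart files are actually present, a small script can list them per directory. This is only a sketch: `find_cudart` is a hypothetical helper, not part of bitsandbytes, and the directories passed to it are assumptions (`/usr/lib64-nvidia` is the path named in the warning above, `/usr/local/cuda/lib64` is a common CUDA location that may not exist on your machine).

```python
from pathlib import Path


def find_cudart(directories: list[str]) -> dict[str, list[str]]:
    """Map each existing directory to the libcudart files found directly in it."""
    found = {}
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            # Skip paths that do not exist instead of failing.
            continue
        matches = sorted(p.name for p in root.glob("libcudart.so*"))
        if matches:
            found[str(root)] = matches
    return found


if __name__ == "__main__":
    # Candidate directories are assumptions; adjust them to your system.
    candidates = ["/usr/lib64-nvidia", "/usr/local/cuda/lib64"]
    for directory, names in find_cudart(candidates).items():
        print(directory, names)
```

If a directory only contains versioned names such as `libcudart.so.11.0` and no plain `libcudart.so`, that would match the situation described in the comment above.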
Dear all,
Sorry that you all are having these hard-to-solve issues. Unfortunately, a
few people keep having these issues, partly due to external reasons, but I agree
it's super frustrating. The majority of downloads (millions) seem to be
working, luckily enough, but let's try to figure this out together and try
to improve the situation going forward.
I'll look more into this next week, but I would be very interested if this
proposed fix changes your situation at all:
#915 (comment)
Another approach to debug dynamic linking issues would be to run `ldd` on
the respective bnb binary and see if it's linked with the correct
dependencies.
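A minimal sketch of that `ldd` check, assuming a Linux system with `ldd` on the PATH. The helper names (`missing_dependencies`, `check_binary`) and the `libbitsandbytes*.so` file pattern are assumptions made for illustration, not part of the bitsandbytes API.

```python
import importlib.util
import subprocess
from pathlib import Path


def missing_dependencies(ldd_output: str) -> list[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libcudart.so.11.0 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path: Path) -> list[str]:
    """Run ldd on one shared object and list its unresolved dependencies."""
    result = subprocess.run(
        ["ldd", str(path)], capture_output=True, text=True, check=False
    )
    return missing_dependencies(result.stdout)


if __name__ == "__main__":
    # Locate the package without importing it, since the import itself may fail.
    spec = importlib.util.find_spec("bitsandbytes")
    if spec is None or not spec.submodule_search_locations:
        print("bitsandbytes is not installed in this environment")
    else:
        for location in spec.submodule_search_locations:
            # The file name pattern is an assumption about the installed binaries.
            for binary in sorted(Path(location).glob("libbitsandbytes*.so")):
                print(binary.name, "missing:", check_binary(binary))
```

Any name listed as missing is a dependency the dynamic linker could not resolve for that binary.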
Unfortunately these things are tricky because through pip we can only
control the Python part of the equation and package binaries, but there are
a bunch of dependencies that come from the system libraries and CUDA
install libraries, that have to be detected and linked together correctly.
The situation is a bit complex unfortunately and only partially under bnb
control.
The trace that you're all sharing is our attempt to help detect
these issues and guide you toward resolving them. Obviously, there's still potential for
improvement, which is why I just opened an improvement proposal RFC on the
topic.
The situation becomes even trickier with Windows, which only has limited
(unofficial) support in bnb.
Sorry again for your struggles. We're working hard to get on top of
maintenance issues, as this library has only recently found sponsorship and
has a big backlog.
Thanks also for any info that you're providing to help along with
improvements.
Best,
Titus
…On Sat, Dec 16, 2023, 16:37 Jose Esteban Merino ***@***.***> wrote:
I have changed the patch location of the libraries, I have copied the
libraries to the location that requests bit, the only file that I cannot
find is libcudart.so in google colab, (libcudart.so.11 and libcudart.so.12
if I have them)
|
I understand the situation, last week I was training in colab with bitsandbytes (I did tests and trained 6 to 8 models every day), but one morning, I started to get the error. Thank you for your work and we hope to find a solution |
Hey! Thanks for getting back to me. Which type of instance are you using,
is it with a T4 GPU? Can you share the notebook with me, so I can take a
look + debug?
Is it now always not working or only some of the time?
…On Sat, Dec 16, 2023, 17:36 Jose Esteban Merino ***@***.***> wrote:
I understand the situation, last week I was training in colab with
bitsandbytes (I did tests and trained 6 to 8 models every day), but one
morning, I started to get the error.
Sorry if I'm only talking about Colab, but it's where I work.
In Colab, bitsandbytes must be installed every time it is used; I
installed the latest version.
What changed from a few days ago to today?
I don't know if there was any update to bitsandbytes, or in my case,
something changed in colab.
In the error message it indicates places where you search for the
libraries, perhaps one of those sites is down and producing the error. I
don't know, I speak without knowing about programming.
Thank you for your work and we hope to find a solution
|
here I uploaded my notebook + debugging messages https://colab.research.google.com/github/josemerinom/test/blob/master/test.ipynb |
In Colab I had the same issue, command So I followed the recommendations of this post to install CUDA 11-7 |
Same for me |
It still doesn't work, no matter which path I take. It's always the same problem: if it involves bitsandbytes, then you are screwed. `❌ ERROR | 2023-12-21 15:38:27 | autotrain.trainers.common:wrapper:86 - train has failed due to an exception: Traceback (most recent call last):
The above exception was the direct cause of the following exception: Traceback (most recent call last):
❌ ERROR | 2023-12-21 15:38:27 | autotrain.trainers.common:wrapper:87 - Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
` |
I am using google colab (cuda 12.2 default) CODE: !apt-get update |
Hey everyone, the main issue (which started this thread) is actually clearly an installation on Windows, which wasn't supported before now but could be wrongly installed because the package build was set up wrong to support
Since the issue creator was on Windows, I'm therefore closing this issue.
For the others, I'm sorry, but since we took over maintenance we've been working tirelessly to get on top of the maintenance backlog and improve bitsandbytes in all kinds of respects, also trying to better handle the long tail of people who are still experiencing issues with installation (it does work for most). Often this is related to the underlying installation, or people are on CUDA versions or GPUs that are not supported and would need to compile from source.
Anyway, we're currently working hard to improve the setup and diagnostics in bitsandbytes, and you should see major improvements there in the next two releases. For now, I can't look in detail into each of your questions, I'm really sorry. I need to focus my energies on improving the overall situation in BNB and there's endless work right now.
For those on Windows, please reinstall the latest bitsandbytes version and try again. If you're still having issues, post them in this new umbrella / catch-all issue.
For those on Linux, if you're still having issues, please open a separate issue, be extra careful to read through the debug output in detail, and make sure not to cross-post on issues that only look superficially similar, because the debug output often looks too similar.
For other platforms, these are currently not officially supported and on versions prior to
Dear all,
Since the current release (last week, 8th of March) we now have official support for Windows 🎉 (which we did not have before) via
We're closing all old Windows issues and are asking everyone to try installing with this new version as outlined above and validate the install with |
I've tried different repos to fine-tune LLMs. Whenever I see that bitsandbytes is needed, I already know it will be a pain in the ass. And this time was no different.
I used the Llama-Factory repo to fine-tune a model, and it needed bitsandbytes.
It says there is a bug, and following the instructions I run
python -m bitsandbytes
to see what is wrong. But, it just sends me the Bug Report from before.`===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
bin C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll
False
C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\cuda_setup\main.py:156: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('C:/Users/zaesa/anaconda3/envs/llama_factory/bin')}
warn(msg)
C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\cuda_setup\main.py:156: UserWarning: C:\Users\zaesa\anaconda3\envs\llama_factory did not contain ['cudart64_110.dll', 'cudart64_120.dll', 'cudart64_12.dll'] as expected! Searching further paths...
warn(msg)
CUDA SETUP: CUDA runtime path found: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\cudart64_110.dll
CUDA SETUP: Highest compute capability among GPUs detected: 8.9
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll...
Could not find module 'C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll' (or one of its dependencies). Try using the full path with constructor syntax.
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone git@github.com:TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=118 make cuda11x
python setup.py install
Traceback (most recent call last):
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\runpy.py", line 187, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\runpy.py", line 146, in _get_module_details
return _get_module_details(pkg_main_name, error)
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\runpy.py", line 110, in _get_module_details
__import__(pkg_name)
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\__init__.py", line 6, in <module>
from . import cuda_setup, utils, research
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\research\__init__.py", line 1, in <module>
from . import nn
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\research\nn\__init__.py", line 1, in <module>
from .modules import LinearFP8Mixed, LinearFP8Global
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\research\nn\modules.py", line 8, in <module>
from bitsandbytes.optim import GlobalOptimManager
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\optim\__init__.py", line 6, in <module>
from bitsandbytes.cextension import COMPILED_WITH_CUDA
File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\cextension.py", line 20, in <module>
raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:
I would really appreciate help solving this. I can't imagine how many people are going through this to fine-tune and use LLMs, because no matter the path I take, when bitsandbytes is involved it just breaks things.
Appreciate any suggestion.
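The "Could not find module ... (or one of its dependencies)" line in the log above is raised by Python's `ctypes` when a DLL fails to load, and it does not say which file is actually missing. One way to narrow it down is to try loading the CUDA runtime and the bitsandbytes binary separately. The sketch below assumes the paths printed in the log; `try_load` is a hypothetical helper, not a bitsandbytes function.

```python
import ctypes


def try_load(library_path: str) -> tuple[bool, str]:
    """Attempt to load a shared library; return (success, error message)."""
    try:
        ctypes.CDLL(library_path)
    except OSError as exc:
        # ctypes raises OSError (or a subclass) when the file or a dependency is missing.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    # Paths copied from the log above; replace them with the ones on your machine.
    candidates = [
        r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\cudart64_110.dll",
        r"C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages"
        r"\bitsandbytes\libbitsandbytes_cuda118.dll",
    ]
    for path in candidates:
        ok, message = try_load(path)
        print("OK  " if ok else "FAIL", path, message)
```

If the CUDA runtime loads on its own but the bitsandbytes DLL does not, the problem is more likely another dependency of that DLL than the runtime itself.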