bitsandbytes just never works. #914

Closed
venturaEffect opened this issue Dec 14, 2023 · 17 comments
@venturaEffect

venturaEffect commented Dec 14, 2023

I've tried several repos to fine-tune LLMs. Whenever I see that bitsandbytes is required, I already know it will be a pain, and this time was no different.

I used the LLaMA-Factory repo to fine-tune a model, and it needed bitsandbytes.

It reports a bug, and following the instructions I run `python -m bitsandbytes` to see what is wrong. But that just prints the same bug report again.

```
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

bin C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll
False
C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\cuda_setup\main.py:156: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('C:/Users/zaesa/anaconda3/envs/llama_factory/bin')}
  warn(msg)
C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\cuda_setup\main.py:156: UserWarning: C:\Users\zaesa\anaconda3\envs\llama_factory did not contain ['cudart64_110.dll', 'cudart64_120.dll', 'cudart64_12.dll'] as expected! Searching further paths...
  warn(msg)
CUDA SETUP: CUDA runtime path found: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\cudart64_110.dll
CUDA SETUP: Highest compute capability among GPUs detected: 8.9
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll...
Could not find module 'C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll' (or one of its dependencies). Try using the full path with constructor syntax.
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone git@github.com:TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=118 make cuda11x
python setup.py install
Traceback (most recent call last):
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\research\__init__.py", line 1, in <module>
    from . import nn
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\research\nn\__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\research\nn\modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\optim\__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "C:\Users\zaesa\anaconda3\envs\llama_factory\lib\site-packages\bitsandbytes\cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
```

I would really appreciate help solving this. I can't imagine how many people are going through this just to fine-tune and use LLMs, because no matter which path I take, if bitsandbytes is involved it just breaks things.

Appreciate any suggestion.
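For Windows setups like the one in the report above, a first diagnostic step is checking whether the CUDA runtime DLLs that bitsandbytes searches for are actually on `PATH`. A minimal sketch (the function name is hypothetical and not part of bitsandbytes; the DLL list is taken from the warning in the report):

```python
import os
from pathlib import Path

# DLL names copied from the bitsandbytes warning above (illustrative only)
CUDART_NAMES = ["cudart64_110.dll", "cudart64_120.dll", "cudart64_12.dll"]

def find_cudart_on_path():
    """Return every CUDA runtime DLL found in the directories listed on PATH."""
    hits = []
    for entry in os.environ.get("PATH", "").split(os.pathsep):
        d = Path(entry)
        if not d.is_dir():
            continue  # skip non-existent PATH entries, like the warning reports
        for name in CUDART_NAMES:
            candidate = d / name
            if candidate.is_file():
                hits.append(str(candidate))
    return hits

if __name__ == "__main__":
    print(find_cudart_on_path() or "no CUDA runtime DLLs found on PATH")
```

If this prints nothing, adding the CUDA `bin` directory (e.g. `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin`) to `PATH` is a reasonable thing to try before reinstalling.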

@iamanthos

I'm with you there. The same is happening to me.

@cbjrobertson

Can I bump this 1000 times?

@josemerinom

On Colab:

```
===================================BUG REPORT===================================
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

  warn(msg)

/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: /usr/lib64-nvidia did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
  warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
The following directories listed in your path were found to be non-existent: {PosixPath('http'), PosixPath('//172.28.0.1'), PosixPath('8013')}
The following directories listed in your path were found to be non-existent: {PosixPath('--logtostderr --listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https'), PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-t4-s-k1i9keaqrl3p --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true')}
The following directories listed in your path were found to be non-existent: {PosixPath('/datalab/web/pyright/typeshed-fallback/stdlib,/usr/local/lib/python3.10/dist-packages')}
The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//ipykernel.pylab.backend_inline')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=118, Highest Compute Capability: 7.5.
CUDA SETUP: To manually override the PyTorch CUDA version please see: https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so...
libcusparse.so.11: cannot open shared object file: No such file or directory
```

@Arindam75

Arindam75 commented Dec 16, 2023

Same problem here.
I removed all the NVIDIA drivers and CUDA-related software and reinstalled cuda_12.0.0_527.41_windows.exe.
Exactly the same problem.

  RuntimeError: Failed to import transformers.trainer because of the following error (look up to see its traceback):

  CUDA Setup failed despite GPU being available. Please run the following command to get more information:

  python -m bitsandbytes

  Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
  to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
  and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

@josemerinom

josemerinom commented Dec 16, 2023

In colab, I use:

  1. !pip install -U bitsandbytes

  2. !pip install bitsandbytes==... (tried several versions: 0.35, 0.40, 0.41)

  3. !git clone https://github.com/timdettmers/bitsandbytes.git
    %cd bitsandbytes
    !CUDA_VERSION=118 make cuda11x
    !python setup.py install

  4. !wget https://raw.githubusercontent.com/TimDettmers/bitsandbytes/main/install_cuda.sh
    !bash install_cuda.sh 118 ~/local 1

  5. https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux

  6. ...
    !apt-get install cuda...

  7. I tried version 12.2, which is installed by default

Colab ships with CUDA 12.2. I reinstalled 11.8 so I had both versions, used the installation suggested in the repository, and it didn't work for me.

@venturaEffect
Author

Yeah, I tried everything: installed different CUDA versions from scratch, checked compatibility with PyTorch and with bitsandbytes, and used separate environments for different repos. Nothing; it just never works.

@josemerinom

I have changed the path location of the libraries and copied the libraries to the location bitsandbytes requests. The only file I cannot find on Google Colab is libcudart.so (libcudart.so.11 and libcudart.so.12 I do have).

@Titus-von-Koeller
Collaborator

Titus-von-Koeller commented Dec 16, 2023 via email

@josemerinom

I understand the situation. Last week I was training on Colab with bitsandbytes (I ran tests and trained 6 to 8 models every day), but one morning I started getting the error.
Sorry that I only talk about Colab, but it's where I work.
On Colab, bitsandbytes must be installed every session; I installed the latest version.
What changed between a few days ago and today?
I don't know whether there was an update to bitsandbytes or, in my case, whether something changed in Colab.
The error message lists the places where it searches for the libraries; perhaps one of those locations is down and producing the error. I don't know, I'm speaking without programming knowledge.

Thank you for your work; we hope to find a solution.

@Titus-von-Koeller
Collaborator

Titus-von-Koeller commented Dec 16, 2023 via email

@josemerinom

Here I uploaded my notebook plus the debugging messages.
It is the notebook I used for training; the only modification is changing `!pip install bitsandbytes` to `!pip install bitsandbytes==0.41.1`.

https://colab.research.google.com/github/josemerinom/test/blob/master/test.ipynb

P.S.: [screenshot]

@bcallonnec

bcallonnec commented Dec 20, 2023

In Colab I had the same issue.

The command `!nvcc --version` outputs CUDA version 12.2,
but `torch.version.cuda` outputs 11.7.

So I followed the recommendations of this post to install CUDA 11.7.
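A quick way to check for this kind of mismatch from a notebook cell is sketched below. The helper names are mine, not part of any library; both helpers return None when the corresponding tool (`nvcc` or torch) is not available:

```python
import re
import shutil
import subprocess

def nvcc_cuda_version():
    """Return the CUDA version reported by nvcc, e.g. '12.2', or None."""
    if shutil.which("nvcc") is None:
        return None  # nvcc not installed or not on PATH
    out = subprocess.run(["nvcc", "--version"],
                         capture_output=True, text=True).stdout
    # nvcc prints a line like: "Cuda compilation tools, release 12.2, V12.2.91"
    m = re.search(r"release (\d+\.\d+)", out)
    return m.group(1) if m else None

def torch_cuda_version():
    """Return the CUDA version PyTorch was built against, or None."""
    try:
        import torch
        return torch.version.cuda  # None on CPU-only builds
    except ImportError:
        return None

if __name__ == "__main__":
    print("nvcc:", nvcc_cuda_version(), "| torch:", torch_cuda_version())
```

If the two versions differ, bitsandbytes can end up loading a binary built against the wrong runtime, which matches the symptoms described in the comments above.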

@apurtell

apurtell commented Dec 21, 2023

#934

> So I followed the recommendations of this post to install CUDA 11.7

If you have Colab Pro you can get a shell, so I followed those recommendations, but nvcc --version still outputs "release 12.2" for me afterward. 🤷

@bcallonnec

Same for me: `nvcc --version` still outputs "release 12.2", but `python -m bitsandbytes` now prints "success".

@venturaEffect
Author

venturaEffect commented Dec 21, 2023

It still doesn't work.

It doesn't matter which path I take; it's always the same problem. If bitsandbytes is involved, you are screwed.

```
❌ ERROR | 2023-12-21 15:38:27 | autotrain.trainers.common:wrapper:86 - train has failed due to an exception: Traceback (most recent call last):
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\transformers\utils\import_utils.py", line 1382, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "C:\Program Files\Python310\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\transformers\integrations\bitsandbytes.py", line 11, in <module>
    import bitsandbytes as bnb
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\bitsandbytes\__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\bitsandbytes\research\__init__.py", line 1, in <module>
    from . import nn
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\bitsandbytes\research\nn\__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\bitsandbytes\research\nn\modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\bitsandbytes\optim\__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\bitsandbytes\cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\autotrain\trainers\common.py", line 83, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\autotrain\trainers\clm\__main__.py", line 157, in train
    model = AutoModelForCausalLM.from_pretrained(
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\transformers\models\auto\auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\transformers\modeling_utils.py", line 3464, in from_pretrained
    from .integrations import get_keys_to_not_convert, replace_with_bnb_linear
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\transformers\utils\import_utils.py", line 1372, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "C:\Users\zaesa\AppData\Roaming\Python\Python310\site-packages\transformers\utils\import_utils.py", line 1384, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):

    CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

❌ ERROR | 2023-12-21 15:38:27 | autotrain.trainers.common:wrapper:87 - Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):

    CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
```

@josemerinom

josemerinom commented Dec 28, 2023

I am using Google Colab (CUDA 12.2 by default).
I installed CUDA 11.8 (because I use torch==2.0.1+cu118),
added the CUDA 11.8 library locations to the path,
and bitsandbytes worked.

CODE:

```python
!apt-get update
!apt-get install cuda-toolkit-11-8
import os
# Use .get() so this also works when LD_LIBRARY_PATH is not set yet
os.environ["LD_LIBRARY_PATH"] = (os.environ.get("LD_LIBRARY_PATH", "")
                                 + ":/usr/local/cuda-11/lib64"
                                 + ":/usr/local/cuda-11.8/lib64")
```

@Titus-von-Koeller
Collaborator

Hey everyone, the main issue (which started this thread) is clearly a Windows installation, which wasn't supported until now but could be installed anyway because the package build was set up to allow any OS. This has now been fixed, and official Windows support was added in the latest release from last week.

Since the issue creator was on Windows, I'm therefore closing this issue. For the others, I'm sorry, but since we took over maintenance we've been working tirelessly to get on top of the maintenance backlog and improve bitsandbytes in all kinds of respects, while also trying to better handle the long tail of people who still experience installation issues (it does work for most). Often this is related to the underlying installation, or people are on CUDA versions or GPUs that are not supported and would need to compile from source.

Anyways, we're currently working hard to improve the setup and diagnostics in bitsandbytes and you should see major improvements there in the next two releases. For now, I can't look in detail into each of your questions, I'm really sorry. I need to focus my energies to improve the overall situation in BNB and there's endless work right now.

For those on Windows, please reinstall the latest bitsandbytes version and try again. If you're still having issues, post them in this new umbrella / catch-all issue.

For those on Linux, if you're still having issues, please open a separate issue and be extra careful to read through the debug output in detail. Make sure not to cross-post on issues that only look superficially similar; the debug output often looks alike even when the root cause differs.

For other platforms, these are currently not officially supported and on versions prior to v0.43.0 BNB could wrongly be installed on them via pip. This was a bug / wrong implementation. Now this is not possible anymore. Sorry for the time some of you have lost on this!


Dear all,

Since the current release (last week, 8th of March) we now have official support for Windows 🎉 (which we did not have before) via

pip install "bitsandbytes>=0.43.0"

We're closing all old Windows issues and are asking everyone to try installing with this new version as outlined above and validate the install with python -m bitsandbytes which should spit out a bunch of stuff and then SUCCESS. Please let us know if everything worked correctly in this new umbrella / catch-all issue. Thanks 🤗

@matthewdouglas matthewdouglas added RFC request for comments on proposed library improvements and removed RFC request for comments on proposed library improvements labels Jan 22, 2025
9 participants