ValueError: numpy.ndarray size changed when calling import hdbscan #457
Comments
Having this same exact issue as of yesterday, on Python 3.8, with any Numpy version from the past year |
Also having this issue. Tried Numpy versions 1.20 and 1.16.1 |
The same with |
I fixed it by installing the package with pip install, adding the flags. I honestly have no idea why this is happening, in addition to other packages I use - perhaps someone re-compiled the Cython scripts and didn't make a changelog. I'm literally shooting completely blind here though. |
Recompiling also worked for me. Using a public cloud that messes with compilation. |
But does anyone know WHY this is actually happening? Especially since it shows up in other projects outside of this repo as well? |
@omarsumadi can you explain to me how to do that? I put the |
@paulthemagno Take a look at this stack overflow post: https://stackoverflow.com/questions/40845304/runtimewarning-numpy-dtype-size-changed-may-indicate-binary-incompatibility Realistically, the only thing you would change would be: If that doesn't work, I'm not sure. Try not setting a version of Numpy to install and letting Pip reconcile which Numpy should be installed if you are using multiple packages that rely on Numpy. Perhaps your issue is a bit deeper. The way to actually solve all this though is to figure out why this happened in the first place. |
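For concreteness, the force-rebuild approach from that Stack Overflow post usually looks something like the sketch below. The exact flags were trimmed from the comments above, so treat this as an assumption rather than anyone's exact command:

```
# Rebuild hdbscan from source against the numpy already in the environment,
# bypassing pip's cache and the prebuilt wheel:
pip install --no-cache-dir --no-binary hdbscan hdbscan
```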
I use another package, https://github.com/ing-bank/sparse_dot_topn, with Cython and numpy. And as of today/yesterday, I got exactly the same error. My environment is
|
@ymwdalex That's actually the same package I came to this thread for. I don't have hdbscan installed, but came to help because I was trying to solve the sparse_dot_topn package issue. Do you know why this is happening? I really don't want to have another go at fixing this bug with no idea where to start. We could start by asking them. Or maybe scipy (a dependency of both) decided to recompile its wheels against a different version of Numpy and everything broke? |
@omarsumadi thanks for the comments. I am the author of sparse_dot_topn. I didn't change the source code recently and have no idea why this is happening... |
@ymwdalex Ok - that is kind of funny lol! By the way, hi! I love your work and everything that you have done - the library is truly one of a kind and I have not found anything that comes close to its capabilities, which is sort of why I have a vested interest in seeing this through. I'll spill to you what I could figure out:
Again, this kind of thing is way outside of my comfort zone (I know nothing about Cython and Numpy cross-over), but perhaps we could find the version of Numpy that was used to compile the wheels and pin that as the version for your library? Sorry if some of this doesn't make much sense. |
I eventually installed Python 3.7.6 and everything worked. However, I have another machine with 3.7.9 where everything works fine. So it's not related to the Python version, I think. |
@doctor3030 I'm not sure if you should close this, not until there's a better solution to other people's problems. I don't want to tell you how to do things, and I most definitely respect your contributions, but I'd imagine this is definitely NOT solved, especially since it's pulling in cross-package discussion. I think there's a lot of cross-package interest in figuring out what exactly happened as well. Unfortunately, I'm not well versed enough in Cython and Numpy internals to offer the correct solution other than to rebuild the wheels. Thanks, |
Ok, let's keep it open. |
Here's what I can say: apparently the culprit is Numpy 1.20.0 (probably what Scipy is now compiled against, due to some change that is now impacting all of us), according to the above (Trusted-AI/adversarial-robustness-toolbox#87). What is most likely happening is that we are using packages that limit the Numpy installation version to something below 1.20.0 (such as Tensorflow). Perhaps someone could verify the pull I linked? |
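One way to see which side of the break an environment is on: the numbers in the error ("Expected 88 from C header, got 80 from PyObject") compare the ndarray struct size the extension was compiled against with the size the installed numpy actually reports. The runtime side can be checked directly; this is a diagnostic sketch, and 80 vs. 88 are the 64-bit values seen in this thread:

```
# Print the installed numpy version and the C-level size of the ndarray type
# (Cython compares this tp_basicsize against the size baked into the compiled wheel):
python -c "import numpy; print(numpy.__version__, numpy.ndarray.__basicsize__)"
```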
I have this issue when trying to use Top2Vec on Python 3.7.9, which pulls in Tensorflow and locks me to Numpy 1.19. Rebuilding HDBScan from source in turn fails on this Accelerate error, so I think I have to rebuild NumPy from source with OpenBLAS (although NumPy is otherwise working fine), which in turn is proving difficult. So this is still very much an issue for me, no doubt for some others too. |
Hello guys, we're facing the same issue here since last weekend, with no changes to the code or any library versions. Isolating it to check what could have been happening:

Dockerfile

```
FROM python:3.7-slim-buster

RUN apt-get update \
    && apt-get install -y --no-install-recommends python3.7-dev=3.7.3-2+deb10u2 build-essential=12.6 jq=1.5+dfsg-2+b1 curl=7.64.0-4+deb10u1 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* \
    && pip install --upgrade pip

COPY . .

RUN python -m pip install --user -r requirements.txt

CMD ["python", "-m", "test.py"]
```

requirements.txt

test.py

```
import hdbscan
print("hello")
```

outputs

```
$ docker run 9523faa77267 python test.py
Traceback (most recent call last):
  File "test.py", line 1, in <module>
    import hdbscan
  File "/home/someuser/.local/lib/python3.7/site-packages/hdbscan/__init__.py", line 1, in <module>
    from .hdbscan_ import HDBSCAN, hdbscan
  File "/home/someuser/.local/lib/python3.7/site-packages/hdbscan/hdbscan_.py", line 21, in <module>
    from ._hdbscan_linkage import (single_linkage,
  File "hdbscan/_hdbscan_linkage.pyx", line 1, in init hdbscan._hdbscan_linkage
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
```
It works with . The point is, as mentioned here before, we use . I'm new to the Python/PyPI world, but I assumed that built wheels couldn't be updated (recompiled with updated libraries/dependencies), and that if an update was needed, a new release would be drafted with a minor version change. Is there anything else we can help with? I couldn't figure out exactly which lib was recompiled (hdbscan or scipy?) but noticed a difference in the checksum/size of hdbscan across different builds; not sure whether it's related.
|
@omarsumadi Thanks a lot for your investigation. I also opened an issue in the sparse_dot_topn package referring to this issue. numpy 1.20.0 works for me. In my environment which has the problem, I installed numpy==1.19 first, then installed sparse_dot_topn, which uses the latest cython and scipy (https://github.com/ing-bank/sparse_dot_topn/blob/master/setup.py#L70). Probably the latest cython or scipy has some update that is incompatible with numpy versions before 1.20. |
Make sure that you use correct and compatible versions of the libs: annoy==1.17.0 |
@ymwdalex |
I admit that I am as much at a loss as everyone else here. In fact I have little understanding of the binary wheel infrastructure on PyPI. I have not provided any new packages or wheels for hdbscan recently (i.e. within the last many months), so if there is a change it was handled by some automated process. Compiling from source (and, in fact, re-cythonizing everything) is likely the best option, but that does not leave a great install option. Any assistance from anyone with more experience in packaging than me would be greatly appreciated. |
@sgbaird If |
@swang423 thank you! This did the trick to get my GitHub actions, |
I have the same issue. And :( |
I face the same issue. I tried installing using
but for some weird reason when I install these packages using
@sgbaird @swang423 @MaartenGr Thanks for sharing all your inputs! This seems to have done the trick for me as well
|
I also have this problem. |
The issue turned out to be a fair bit less complex than I had thought 😅 The PyPI release does not yet have the @lmcinnes Sorry to tag you like this, but it seems that the issue should be solved whenever a new version is released to PyPI. Fortunately, this also means that after that release we will not likely see this issue popping up anymore. |
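Until that new release lands on PyPI, installing hdbscan straight from the repository is a possible interim workaround. This is a sketch, assuming a working compiler and a pre-installed numpy, since it builds the Cython extensions locally:

```
# Build and install hdbscan from the current state of the repository:
pip install git+https://github.com/scikit-learn-contrib/hdbscan.git
```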
I faced the same issue while working in Anaconda. Then I came out of the conda environment and created a simple venv with Python 3.9.7, installed hdbscan using pip, and generated the requirements file. I created a fresh conda env and installed hdbscan with the requirements file. I am able to use it now. (pelog39) u1@ubuntu: for hdbscan to work with pytorch: |
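A rough sketch of the workflow described above; the env name and versions are illustrative, not the commenter's exact commands:

```
# Build hdbscan in a plain venv, freeze the working set, then replay it in conda:
python3.9 -m venv hdbscan-venv
source hdbscan-venv/bin/activate
pip install hdbscan
pip freeze > requirements.txt
deactivate

conda create -n pelog39 python=3.9.7
conda activate pelog39
pip install -r requirements.txt
```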
In my env, numpy==1.21.5 works |
Pip installs the deps as specified in the requirements file. When the cython modules are built, however, pip installs the latest version of numpy, ignoring the specified version, and then builds against it. This creates invalid shared objects that can't be used. This would not normally be an issue, but for a confluence of circumstances that exposes the bug in pip. These are: numpy recently changed the C API, and numba is incompatible with numpy > 1.21. Below are reference links:
- pypa/pip#9542
- scikit-learn-contrib/hdbscan#457 (comment)
- https://stackoverflow.com/questions/66060487/valueerror-numpy-ndarray-size-changed-may-indicate-binary-incompatibility-exp
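Given that diagnosis, one workaround is to take pip's isolated build environment out of the picture, so the extension gets compiled against the numpy that will actually be present at runtime. A sketch, assuming build tools and Cython are already installed; the 1.19.5 pin is only an example:

```
# Install the runtime numpy first, then build hdbscan against it
# instead of against whatever pip's isolated build env would fetch:
pip install numpy==1.19.5
pip install --no-build-isolation --no-binary hdbscan hdbscan
```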
Setuptools needs to use the proper version of numpy (which now must be >1.20) while building the project, or we'll get C errors on import. For more details, see: - scikit-learn-contrib/hdbscan#457 (comment) - https://stackoverflow.com/questions/66060487/valueerror-numpy-ndarray-size-changed-may-indicate-binary-incompatibility-exp
* First pass at versioneer -> setuptools_scm
* Fix thorny numpy/Cython build requirement issues

  Setuptools needs to use the proper version of numpy (which now must be >1.20) while building the project, or we'll get C errors on import. For more details, see:
  - scikit-learn-contrib/hdbscan#457 (comment)
  - https://stackoverflow.com/questions/66060487/valueerror-numpy-ndarray-size-changed-may-indicate-binary-incompatibility-exp

* Try adding pyfftw to build requirements
* update documentation
* Remove extra versioneer cruft
* Attempt to fix docs / python <3.8 compatibility
* Try bumping doc dependencies and explicitly requiring pyfftw
* Use pyfftw environment in CI
* Explicitly include importlib_metadata for compatibility with older versions of python 3.x
* Update installation instructions to reflect pyfftw changes
This worked for me. Below is my env.yml (not complete; I had an issue with numba as well, as someone mentioned above). Everything got fixed with the versions below
|
* Updated the numpy version.
* Synced the pandas version.

In Python 3.10, if you invoke `pip install pandas~=1.1.4 numpy~=1.21.4` and then `import pandas` you get the following error:

```
>>> import pandas
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/venv/lib/python3.10/site-packages/pandas/__init__.py", line 30, in <module>
    from pandas._libs import hashtable as _hashtable, lib as _lib, tslib as _tslib
  File "/tmp/venv/lib/python3.10/site-packages/pandas/_libs/__init__.py", line 13, in <module>
    from pandas._libs.interval import Interval
  File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
```

I believe that this is the cause of the issue scikit-learn-contrib/hdbscan#457 (comment)

PiperOrigin-RevId: 467785781
Downgrading to a suitable hdbscan version has helped me here. Use trial and error to find the appropriate version. |
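For example (the pin below is purely illustrative; whichever release was built against your numpy is the one you want):

```
# Try progressively older releases until one imports cleanly:
pip install --no-cache-dir hdbscan==0.8.27
```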
When I try to import hdbscan I get the following error:
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      1 from sklearn.decomposition import PCA
      2 import umap
----> 3 import hdbscan
      4 from hyperopt import fmin, tpe, atpe, rand, hp, STATUS_OK, Trials, SparkTrials
      5 import pickle

c:\program files\python37\lib\site-packages\hdbscan\__init__.py in <module>
----> 1 from .hdbscan_ import HDBSCAN, hdbscan
      2 from .robust_single_linkage_ import RobustSingleLinkage, robust_single_linkage
      3 from .validity import validity_index
      4 from .prediction import approximate_predict, membership_vector, all_points_membership_vectors
      5

c:\program files\python37\lib\site-packages\hdbscan\hdbscan_.py in <module>
     19 from scipy.sparse import csgraph
     20
---> 21 from ._hdbscan_linkage import (single_linkage,
     22                                mst_linkage_core,
     23                                mst_linkage_core_vector,

hdbscan\_hdbscan_linkage.pyx in init hdbscan._hdbscan_linkage()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
```
I use:
python 3.7.9
numpy 1.19.3 (I also tried 1.19.5)
I would appreciate your help.