
ValueError: numpy.ndarray size changed when calling import hdbscan #39

Closed
jo-mueller opened this issue Jan 26, 2022 · 2 comments · Fixed by #41
Labels
bug Something isn't working

Collaborator

jo-mueller commented Jan 26, 2022

I recently installed the clusters plotter and ran into an issue when running hdbscan. When I run the clustering on a set of measurements (e.g., based on blobs.gif), I receive this error:

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Here's a screenshot of the setup

[screenshot "Unbenannt": the clustering widget with HDBSCAN selected]

and the complete traceback:

File e:\biapol\projects\napari-clusters-plotter\napari_clusters_plotter\_clustering.py:274, in ClusteringWidget.run(self=<napari_clusters_plotter._clustering.ClusteringWidget object>, labels_layer=<Labels layer 'labels'>, selected_measurements_list=['min_intensity', 'max_intensity', 'sum_intensity', 'area', 'mean_intensity', 'centroid_x', 'centroid_y', 'centroid_z', 'mean_distance_to_centroid', 'standard_deviation_intensity', 'max_distance_to_centroid', 'mean_max_distance_to_centroid_ratio'], selected_method='HDBSCAN', num_clusters=2, num_iterations=300, standardize=False, min_cluster_size=5, min_nr_samples=5)
    271     add_column_to_layer_tabular_data(labels_layer, "KMEANS_CLUSTER_ID_SCALER_" + str(standardize), y_pred)
    273 elif selected_method == "HDBSCAN":
--> 274     y_pred = hdbscan_clustering(standardize, selected_properties, min_cluster_size, min_nr_samples)
        selected_properties =     min_intensity  max_intensity  ...  max_distance_to_centroid  mean_max_distance_to_centroid_ratio
0           152.0          232.0  ...                 19.075121                             2.262548
1           152.0          224.0  ...                  9.869590                             1.911683
2           152.0          248.0  ...                 16.878197                             1.795291
3           152.0          248.0  ...                 12.767069                             1.674100
4           152.0          248.0  ...                 15.421144                             1.845683
..            ...            ...  ...                       ...                                  ...
57          152.0          224.0  ...                  8.849086                             1.716192
58          152.0          216.0  ...                 10.764353                             2.296404
59          152.0          248.0  ...                  8.658107                             2.178612
60          152.0          248.0  ...                  6.827476                             2.224879
61          152.0          224.0  ...                  7.477417                             2.171838

[62 rows x 12 columns]
        standardize = False
        min_cluster_size = 5
        min_nr_samples = 5
    275     print("HDBSCAN predictions finished.")
    276     # write result back to features/properties of the labels layer

File e:\biapol\projects\napari-clusters-plotter\napari_clusters_plotter\_clustering.py:305, in hdbscan_clustering(standardize=False, measurements=    min_intensity  max_intensity  ...  max_dista...                 2.171838

[62 rows x 12 columns], min_cluster_size=5, min_samples=5)
    304 def hdbscan_clustering(standardize, measurements, min_cluster_size, min_samples):
--> 305     import hdbscan
    306     print("HDBSCAN predictions started (standardize: " + str(standardize) + ")...")
    308     clustering_hdbscan = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size, min_samples=min_samples)

File ~\anaconda3\envs\napari_clusters\lib\site-packages\hdbscan\__init__.py:1, in <module>
----> 1 from .hdbscan_ import HDBSCAN, hdbscan
      2 from .robust_single_linkage_ import RobustSingleLinkage, robust_single_linkage
      3 from .validity import validity_index

File ~\anaconda3\envs\napari_clusters\lib\site-packages\hdbscan\hdbscan_.py:21, in <module>
     17 from joblib.parallel import cpu_count
     19 from scipy.sparse import csgraph
---> 21 from ._hdbscan_linkage import (single_linkage,
     22                                mst_linkage_core,
     23                                mst_linkage_core_vector,
     24                                label)
     25 from ._hdbscan_tree import (condense_tree,
     26                             compute_stability,
     27                             get_clusters,
     28                             outlier_scores)
     29 from ._hdbscan_reachability import (mutual_reachability,
     30                                     sparse_mutual_reachability)

File hdbscan/_hdbscan_linkage.pyx:1, in init hdbscan._hdbscan_linkage()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

I looked up the error and it probably boils down to this issue. I tried a few of the suggestions from this thread (e.g., rolling numpy back to 1.20.5), but couldn't solve the issue. I currently have numpy 1.21.5 installed.
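As an aside, a plugin could catch this specific `ValueError` at import time and surface a friendlier hint. A minimal sketch (the helper name `explain_abi_error` is hypothetical, not part of napari-clusters-plotter):

```python
def explain_abi_error(err: BaseException) -> str:
    """Map the numpy ABI-mismatch ValueError to an actionable hint.

    Hypothetical helper, not part of napari-clusters-plotter.
    """
    if "numpy.ndarray size changed" in str(err):
        return (
            "hdbscan's compiled extensions were built against a different "
            "numpy ABI; reinstall hdbscan from source or via conda-forge."
        )
    return str(err)

# Simulate the error from the traceback above:
err = ValueError(
    "numpy.ndarray size changed, may indicate binary incompatibility. "
    "Expected 96 from C header, got 88 from PyObject"
)
print(explain_abi_error(err))
```

In real code this would wrap the `import hdbscan` call in a `try`/`except ValueError` block and re-raise with the hint attached.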

Edit: It seems this will be solved once the new release of hdbscan is out (apparently it works when installing directly from the hdbscan master branch), but I haven't verified this.
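For reference, a commonly suggested workaround for this class of ABI-mismatch error (untested here) is to force pip to rebuild hdbscan from source against the numpy already present in the environment, rather than reusing a cached wheel:

```shell
# Rebuild hdbscan from source so its Cython extensions are compiled
# against the currently installed numpy (ignores cached/prebuilt wheels).
pip install --no-cache-dir --no-binary :all: hdbscan
```

This requires a working C compiler in the environment, which is why installing the conda-forge build is often the easier route on Windows.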

@jo-mueller jo-mueller added the bug Something isn't working label Jan 26, 2022
Collaborator

lazigu commented Jan 27, 2022

Hi Johannes @jo-mueller, thanks for reporting! I just tried to reproduce the error on two laptops and couldn't; the plugin works fine on both machines. One of them is quite old, so the setup there is a bit different: I create the environment with a lower Python version (3.8) and a lower pyopencl version (2020.1). On my laptop I always need to install hdbscan via conda before installing the plugin, because building wheels for hdbscan otherwise fails. Installing hdbscan via conda was also mentioned as a solution in the issue you linked, which might be why I am not seeing this error. Have you tried installing hdbscan that way?
These are the exact steps I follow, in case they are helpful:

```
conda create --name ncp-env python=3.9
conda install -c conda-forge pyopencl
python -m pip install "napari[all]"
conda install -c conda-forge hdbscan
pip install napari-clusters-plotter
```

I also have numpy 1.21.5, hdbscan 0.8.27, and numba 0.55.0 installed in that environment.

Collaborator Author

Hi Laura @lazigu,

Thanks for the quick fix, this solves it :) I'll make a small PR to add this to the troubleshooting section.
