Illegal instruction when importing numpy #60

Closed
ChristopherHogan opened this issue Feb 13, 2019 · 15 comments

Comments

@ChristopherHogan

Issue:
Creating an environment with the latest numpy leads to a core dump on import:

$ conda create -n np -c conda-forge numpy
$ conda activate np
$ python -c 'import numpy'
Illegal instruction (core dumped)

Running it through gdb shows

Thread 1 "python" received signal SIGILL, Illegal instruction.
0x00007ffff5d16fd8 in sdot_k_SKYLAKEX ()
   from /home/chris/miniconda3/envs/np/lib/python3.7/site-packages/numpy/core/../../../../libopenblas.so.0

I'm running Ubuntu 16.04 in a VirtualBox VM on a Windows 10 host, with an Intel i7-7820X.
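
For anyone who wants to reproduce this without the crash killing their own session, here is a minimal sketch that runs the import in a child interpreter and reports how the child ended. The helper names `describe_exit` and `try_import` are made up for illustration; the sketch assumes a POSIX system, where `subprocess` reports a child killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def try_import(module="numpy"):
    """Import `module` in a fresh interpreter so a crash cannot take down this process."""
    proc = subprocess.run([sys.executable, "-c", "import " + module])
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    # On an affected machine this would be expected to report SIGILL.
    print(try_import("numpy"))
```

This only tells you that the import died from an illegal instruction; finding the faulting symbol still needs gdb as shown above.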


Environment (conda list):

$ conda list
# packages in environment at /home/chris/miniconda3/envs/np:
#
# Name                    Version                   Build  Channel
blas                      1.1                    openblas    conda-forge
bzip2                     1.0.6             h14c3975_1002    conda-forge
ca-certificates           2018.11.29           ha4d7672_0    conda-forge
certifi                   2018.11.29            py37_1000    conda-forge
libffi                    3.2.1             hf484d3e_1005    conda-forge
libgcc-ng                 7.3.0                hdf63c60_0    conda-forge
libgfortran-ng            7.2.0                hdf63c60_3    conda-forge
libstdcxx-ng              7.3.0                hdf63c60_0    conda-forge
ncurses                   6.1               hf484d3e_1002    conda-forge
numpy                     1.16.1          py37_blas_openblash1522bff_0  [blas_openblas]  conda-forge
openblas                  0.3.3             h9ac9557_1001    conda-forge
openssl                   1.0.2p            h14c3975_1002    conda-forge
pip                       19.0.2                   py37_0    conda-forge
python                    3.7.1             hd21baee_1000    conda-forge
readline                  7.0               hf8c457e_1001    conda-forge
setuptools                40.8.0                   py37_0    conda-forge
sqlite                    3.26.0            h67949de_1000    conda-forge
tk                        8.6.9             h84994c4_1000    conda-forge
wheel                     0.33.0                   py37_0    conda-forge
xz                        5.2.4             h14c3975_1001    conda-forge
zlib                      1.2.11            h14c3975_1004    conda-forge


Details about conda and system ( conda info ):
$ conda info
     active environment : np
    active env location : /home/chris/miniconda3/envs/np
            shell level : 1
       user config file : /home/chris/.condarc
 populated config files : 
          conda version : 4.6.1
    conda-build version : 3.17.8
         python version : 3.6.8.final.0
       base environment : /home/chris/miniconda3  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/linux-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/chris/miniconda3/pkgs
                          /home/chris/.conda/pkgs
       envs directories : /home/chris/miniconda3/envs
                          /home/chris/.conda/envs
               platform : linux-64
             user-agent : conda/4.6.1 requests/2.18.4 CPython/3.6.8 Linux/4.15.0-45-generic ubuntu/16.04.5 glibc/2.23
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False

@ChristopherHogan
Author

Still an issue after the BLAS migration, although now the crash (SIGILL) happens in libcblas.so.3 instead of libopenblas.so.0.

Thread 1 "python" received signal SIGILL, Illegal instruction.
0x00007ffff5cf25d8 in sdot_k_SKYLAKEX ()
   from /home/chris/miniconda3/envs/np/lib/python3.7/site-packages/numpy/core/../../../../libcblas.so.3

conda list

# Name                    Version                   Build  Channel
bzip2                     1.0.6             h14c3975_1002    conda-forge
ca-certificates           2019.3.9             hecc5488_0    conda-forge
certifi                   2019.3.9                 py37_0    conda-forge
libblas                   3.8.0                4_openblas    conda-forge
libcblas                  3.8.0                4_openblas    conda-forge
libffi                    3.2.1             he1b5a44_1006    conda-forge
libgcc-ng                 7.3.0                hdf63c60_0    conda-forge
libgfortran-ng            7.2.0                hdf63c60_3    conda-forge
liblapack                 3.8.0                4_openblas    conda-forge
libstdcxx-ng              7.3.0                hdf63c60_0    conda-forge
ncurses                   6.1               hf484d3e_1002    conda-forge
numpy                     1.16.2           py37h8b7e671_1    conda-forge
openblas                  0.3.5             h9ac9557_1001    conda-forge
openssl                   1.1.1b               h14c3975_1    conda-forge
pip                       19.0.3                   py37_0    conda-forge
python                    3.7.2                h381d211_0    conda-forge
readline                  7.0               hf8c457e_1001    conda-forge
setuptools                40.8.0                   py37_0    conda-forge
sqlite                    3.26.0            h67949de_1001    conda-forge
tk                        8.6.9             h84994c4_1000    conda-forge
wheel                     0.33.1                   py37_0    conda-forge
xz                        5.2.4             h14c3975_1001    conda-forge
zlib                      1.2.11            h14c3975_1004    conda-forge

@jschueller
Contributor

Runs fine here on Ubuntu, though I have a much older CPU than Skylake (something from around 2011).

@ChristopherHogan
Author

Yes, it seems specific to skylake.

@ChristopherHogan
Author

It works if I force openblas 0.3.4:

$ conda create -n np -c conda-forge openblas=0.3.4 numpy
$ conda activate np
$ python -c 'import numpy'

@jschueller
Contributor

Maybe we should update the openblas pin then: conda-forge/conda-forge-pinning-feedstock#201

@isuruf
Member

isuruf commented Mar 20, 2019

The new blas packages depend on openblas 0.3.5

@jschueller
Contributor

jschueller commented Mar 20, 2019

Oh, I thought the issue was with 0.3.3, but the log shows 0.3.5 as you say, so maybe this is a regression in 0.3.5.

@grlee77
Member

grlee77 commented Mar 21, 2019

It sounds like this is probably the same issue seen in OpenMathLib/OpenBLAS#2067. The cause there seems to have been that OpenBLAS did not account for the fact that VMs can disable some features of the underlying CPU. If this is the same issue, then it has already been fixed in OpenBLAS master.

You can set an environment variable as in that thread to work around the issue in the meantime.

@jschueller
Contributor

It may be OpenMathLib/OpenBLAS#1949; one could try to backport it here.

@ChristopherHogan
Author

Setting the environment variable works. The issue seems to be that VirtualBox incorrectly detects my CPU as an Intel i7-6700K (which has no AVX512), while OpenBLAS correctly detects an i7-7820X (which has AVX512).

@prusswan

Setting the environment variable works. The issue seems to be that VirtualBox incorrectly detects my CPU as an Intel i7-6700K (which has no AVX512), while OpenBLAS correctly detects an i7-7820X (which has AVX512).

How do you check this? Mine is a 7900X, so almost certainly the same issue

@ChristopherHogan
Author

Go to Machine->Show Log, and filter for "CPUM".
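
A complementary check from inside the guest, as a minimal sketch: on a Linux x86 guest, `/proc/cpuinfo` lists the feature flags the kernel sees, and `avx512f` is the flag for the AVX512 foundation instructions. If that flag is missing while the crash is in a `SKYLAKEX` kernel, that would be consistent with the diagnosis above. The function names here are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx512(cpuinfo_text):
    """True if the AVX512 foundation flag is visible in the given cpuinfo text."""
    return "avx512f" in cpu_flags(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read cpuinfo; return an empty string where the file does not exist (non-Linux)."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("avx512f visible to this OS:", has_avx512(read_cpuinfo()))
```

This shows what the guest kernel sees, not what the host CPU supports, so it does not replace looking at the VirtualBox log.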

@prusswan

prusswan commented Mar 29, 2019

Confirmed. Just for the record, this solution (overriding the environment variable) allows numpy to be loaded despite the wrong CPU detection:

>>> import os
>>> os.environ["OPENBLAS_CORETYPE"] = "nehalem"
>>> import numpy as np
>>>

I suppose the other solution is to downgrade to openblas < 0.3.5:

conda install openblas=0.3.4 

I am going with the second solution since it is cleaner and I don't need the latest openblas (I was on a much older version of openblas anyway, before whatever it was that triggered the upgrade).
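
If you stay on the affected openblas and use the environment-variable override, one way to avoid hard-coding it everywhere is to set it only when AVX512 is not visible, and before numpy is first imported. The following is a minimal sketch under those assumptions: the helper names are hypothetical, the flag lookup assumes a Linux x86 `/proc/cpuinfo`, and `"nehalem"` is simply the value used in the snippet above, not a recommendation.

```python
import os


def visible_cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU feature flags the OS reports, or an empty set if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                key, sep, value = line.partition(":")
                if sep and key.strip() == "flags":
                    return set(value.split())
    except OSError:
        pass
    return set()


def apply_openblas_workaround(flags, coretype="nehalem", environ=os.environ):
    """Set OPENBLAS_CORETYPE when avx512f is not visible.

    An existing value is left alone so an explicit user setting wins.
    Returns the value in effect afterwards, or None if the variable is unset.
    """
    if "avx512f" not in flags:
        environ.setdefault("OPENBLAS_CORETYPE", coretype)
    return environ.get("OPENBLAS_CORETYPE")


# Usage, at the very top of the entry-point script, before numpy is imported:
#   apply_openblas_workaround(visible_cpu_flags())
#   import numpy as np
```

The variable has to be in place before the OpenBLAS shared library is loaded, which is why the call goes ahead of the first numpy import.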

@1kastner

I added the environment variable to my Dockerfile in order to keep the code clean, but in the end I guess one has to decide case by case how to deal with this.

@isuruf
Member

isuruf commented Feb 20, 2020

Looks like we can't do anything here. Please open an issue upstream if the problem is still there.


6 participants