KG build failing in transform step due to numpy issues #155

Closed
caufieldjh opened this issue Sep 27, 2024 · 3 comments · Fixed by #156

Comments

@caufieldjh
Contributor

As of the most recent build (121), the KG build fails at the beginning of the transform step (long stack trace follows):

12:05:31  + python3.9 run.py transform
12:05:32  
12:05:32  A module that was compiled using NumPy 1.x cannot be run in
12:05:32  NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
12:05:32  versions of NumPy, modules must be compiled with NumPy 2.0.
12:05:32  Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
12:05:32  
12:05:32  If you are a user of the module, the easiest solution will be to
12:05:32  downgrade to 'numpy<2' or try to upgrade the affected module.
12:05:32  We expect that some modules will need time to support NumPy 2.
12:05:32  
12:05:32  Traceback (most recent call last):  File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/run.py", line 5, in <module>
12:05:32      from kg_phenio import download as kg_download
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/__init__.py", line 3, in <module>
12:05:32      from .transform import transform
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform.py", line 5, in <module>
12:05:32      from kg_phenio.transform_utils.phenio.phenio_transform import PhenioTransform
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform_utils/phenio/__init__.py", line 2, in <module>
12:05:32      from .phenio_transform import PhenioTransform
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform_utils/phenio/phenio_transform.py", line 8, in <module>
12:05:32      from kgx.cli.cli_utils import transform  # type: ignore
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/cli/__init__.py", line 7, in <module>
12:05:32      from kgx.cli.cli_utils import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/cli/cli_utils.py", line 11, in <module>
12:05:32      from kgx.validator import Validator
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/validator.py", line 14, in <module>
12:05:32      from kgx.utils.kgx_utils import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/utils/kgx_utils.py", line 22, in <module>
12:05:32      import pandas as pd
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pandas/__init__.py", line 26, in <module>
12:05:32      from pandas.compat import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pandas/compat/__init__.py", line 27, in <module>
12:05:32      from pandas.compat.pyarrow import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pandas/compat/pyarrow.py", line 8, in <module>
12:05:32      import pyarrow as pa
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pyarrow/__init__.py", line 65, in <module>
12:05:32      import pyarrow.lib as _lib
12:05:32  AttributeError: _ARRAY_API not found
12:05:32  
12:05:32  A module that was compiled using NumPy 1.x cannot be run in
12:05:32  NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
12:05:32  versions of NumPy, modules must be compiled with NumPy 2.0.
12:05:32  Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
12:05:32  
12:05:32  If you are a user of the module, the easiest solution will be to
12:05:32  downgrade to 'numpy<2' or try to upgrade the affected module.
12:05:32  We expect that some modules will need time to support NumPy 2.
12:05:32  
12:05:32  Traceback (most recent call last):  File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/run.py", line 5, in <module>
12:05:32      from kg_phenio import download as kg_download
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/__init__.py", line 3, in <module>
12:05:32      from .transform import transform
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform.py", line 5, in <module>
12:05:32      from kg_phenio.transform_utils.phenio.phenio_transform import PhenioTransform
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform_utils/phenio/__init__.py", line 2, in <module>
12:05:32      from .phenio_transform import PhenioTransform
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform_utils/phenio/phenio_transform.py", line 8, in <module>
12:05:32      from kgx.cli.cli_utils import transform  # type: ignore
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/cli/__init__.py", line 7, in <module>
12:05:32      from kgx.cli.cli_utils import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/cli/cli_utils.py", line 11, in <module>
12:05:32      from kgx.validator import Validator
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/validator.py", line 14, in <module>
12:05:32      from kgx.utils.kgx_utils import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/utils/kgx_utils.py", line 22, in <module>
12:05:32      import pandas as pd
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pandas/__init__.py", line 49, in <module>
12:05:32      from pandas.core.api import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pandas/core/api.py", line 9, in <module>
12:05:32      from pandas.core.dtypes.dtypes import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pandas/core/dtypes/dtypes.py", line 24, in <module>
12:05:32      from pandas._libs import (
12:05:32    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pyarrow/__init__.py", line 65, in <module>
12:05:32      import pyarrow.lib as _lib
12:05:32  AttributeError: _ARRAY_API not found
12:05:33  
12:05:33  A module that was compiled using NumPy 1.x cannot be run in
12:05:33  NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
12:05:33  versions of NumPy, modules must be compiled with NumPy 2.0.
12:05:33  Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
12:05:33  
12:05:33  If you are a user of the module, the easiest solution will be to
12:05:33  downgrade to 'numpy<2' or try to upgrade the affected module.
12:05:33  We expect that some modules will need time to support NumPy 2.
12:05:33  
12:05:33  Traceback (most recent call last):  File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/run.py", line 5, in <module>
12:05:33      from kg_phenio import download as kg_download
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/__init__.py", line 3, in <module>
12:05:33      from .transform import transform
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform.py", line 5, in <module>
12:05:33      from kg_phenio.transform_utils.phenio.phenio_transform import PhenioTransform
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform_utils/phenio/__init__.py", line 2, in <module>
12:05:33      from .phenio_transform import PhenioTransform
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform_utils/phenio/phenio_transform.py", line 8, in <module>
12:05:33      from kgx.cli.cli_utils import transform  # type: ignore
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/cli/__init__.py", line 7, in <module>
12:05:33      from kgx.cli.cli_utils import (
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/cli/cli_utils.py", line 12, in <module>
12:05:33      from kgx.sink import Sink
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/sink/__init__.py", line 7, in <module>
12:05:33      from .parquet_sink import ParquetSink
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/sink/parquet_sink.py", line 7, in <module>
12:05:33      from pyarrow import Table
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pyarrow/__init__.py", line 65, in <module>
12:05:33      import pyarrow.lib as _lib
12:05:33  AttributeError: _ARRAY_API not found
12:05:33  Traceback (most recent call last):
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/run.py", line 5, in <module>
12:05:33      from kg_phenio import download as kg_download
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/__init__.py", line 3, in <module>
12:05:33      from .transform import transform
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform.py", line 5, in <module>
12:05:33      from kg_phenio.transform_utils.phenio.phenio_transform import PhenioTransform
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform_utils/phenio/__init__.py", line 2, in <module>
12:05:33      from .phenio_transform import PhenioTransform
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/kg_phenio/transform_utils/phenio/phenio_transform.py", line 8, in <module>
12:05:33      from kgx.cli.cli_utils import transform  # type: ignore
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/cli/__init__.py", line 7, in <module>
12:05:33      from kgx.cli.cli_utils import (
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/cli/cli_utils.py", line 12, in <module>
12:05:33      from kgx.sink import Sink
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/sink/__init__.py", line 7, in <module>
12:05:33      from .parquet_sink import ParquetSink
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/kgx/sink/parquet_sink.py", line 7, in <module>
12:05:33      from pyarrow import Table
12:05:33    File "/var/lib/jenkins/workspace/ledge-graph-hub_kg-phenio_master/gitrepo/venv/lib/python3.9/site-packages/pyarrow/__init__.py", line 65, in <module>
12:05:33      import pyarrow.lib as _lib
12:05:33    File "pyarrow/lib.pyx", line 36, in init pyarrow.lib
12:05:33  ImportError: numpy.core.multiarray failed to import

NumPy is an upstream dependency here, but some dependencies already pin it to <2.
So the easiest option would be to pin numpy<2 here as well.
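
For anyone hitting the same trace, a quick check of which NumPy the build environment actually resolved can confirm the diagnosis before changing any pins. The following is only a minimal sketch: the helper names `major_version`, `numpy_is_pre_2`, and `check_installed_numpy` are made up for illustration and are not part of kg-phenio or NumPy. It reads the installed version from package metadata, so it does not import numpy or any compiled module.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '2.0.2'."""
    return int(version_string.split(".")[0])


def numpy_is_pre_2(version_string: str) -> bool:
    """True if the given NumPy version would satisfy a 'numpy<2' pin."""
    return major_version(version_string) < 2


def check_installed_numpy() -> str:
    """Describe the installed NumPy using metadata only (no numpy import)."""
    try:
        installed = version("numpy")
    except PackageNotFoundError:
        return "numpy is not installed in this environment"
    if numpy_is_pre_2(installed):
        return f"numpy {installed} satisfies numpy<2"
    return (
        f"numpy {installed} does not satisfy numpy<2; "
        "modules compiled against NumPy 1.x may fail to import"
    )


if __name__ == "__main__":
    print(check_installed_numpy())
```

Run inside the build's virtual environment, this would report NumPy 2.0.2 as not satisfying the pin, which matches the version named in the log above.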

@caufieldjh
Contributor Author

This issue did not appear with the Oct 8 build but it reappeared on the Nov 1 build. Strange!

@matentzn

matentzn commented Nov 4, 2024

I struggled a lot with this over the last week (see my post in #codestyle where I was desperately trying to ensure pandas 1.x compatibility with sssom), but I think since you are using Python 3.9 these might only be tangentially related.

@caufieldjh caufieldjh linked a pull request Nov 4, 2024 that will close this issue
@caufieldjh
Contributor Author

I fixed it by pinning numpy<2 for now - obviously that won't work forever
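
To judge when the pin can be lifted, it may help to list which installed packages still declare their own NumPy requirement. This is a rough sketch rather than anything the project ships: the function names are hypothetical, and it relies only on the standard-library `importlib.metadata` to read each distribution's declared requirements.

```python
import re
from importlib.metadata import distributions

# Match a requirement whose project name is exactly "numpy"
# (so "numpydoc" or "numpy-quaternion" are not picked up).
_NUMPY_REQ = re.compile(r"^\s*numpy(?![A-Za-z0-9._-])", re.IGNORECASE)


def numpy_requirements(requires):
    """Filter a list of requirement strings down to those on numpy itself."""
    return [req for req in (requires or []) if _NUMPY_REQ.match(req)]


def packages_constraining_numpy():
    """Map each installed distribution name to its declared numpy requirements."""
    found = {}
    for dist in distributions():
        reqs = numpy_requirements(dist.requires)
        if reqs:
            found[dist.metadata["Name"]] = reqs
    return found


if __name__ == "__main__":
    for name, reqs in sorted(packages_constraining_numpy().items(), key=lambda kv: str(kv[0])):
        print(f"{name}: {reqs}")
```

Once none of the listed requirements cap NumPy below 2 and the compiled packages in the trace (such as pyarrow) are at versions built for NumPy 2, the pin is a candidate for removal.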
