Disallow cuda-python 12.6.1 and 11.8.4 #1720

Merged
merged 8 commits into rapidsai:branch-24.12
Nov 6, 2024

Conversation

bdice
Contributor

@bdice bdice commented Nov 6, 2024

Due to a bug in cuda-python we must disallow cuda-python 12.6.1 and 11.8.4. See rapidsai/build-planning#116 for more information.

This PR disallows those versions and makes other changes that follow from that:

  • specifying python in both host: and run: dependencies for the rmm conda package
  • ignoring deprecation warnings raised by newer versions of cuda-python

@bdice bdice requested a review from a team as a code owner November 6, 2024 12:39
@bdice bdice added non-breaking Non-breaking change bug Something isn't working labels Nov 6, 2024
@bdice bdice requested a review from raydouglass November 6, 2024 12:39
@github-actions github-actions bot added Python Related to RMM Python API conda labels Nov 6, 2024
Member

@jameslamb jameslamb left a comment

Looks like cuda-python=12.6.1 is still making it into the test environment for rmm conda builds on CUDA 12.

TEST START: /tmp/conda-bld-output/linux-64/rmm-24.12.00a25-cuda12_py311_241106_gec071874_25.conda
...
The following NEW packages will be INSTALLED:
...        
 cuda-python:          12.6.1-py311h817de4b_0                       conda-forge
...
AttributeError: module 'cuda.ccudart' has no attribute '__pyx_capi__'

(build link)

I guess because cuda-python has a run export like this:

  run_exports:
    - {{ pin_subpackage('cuda-python', min_pin='x', max_pin='x') }}

(code link)

Think we probably need to ignore run exports from cuda-python and make the run: dependency explicit?

Or maybe we can just add the != pins explicitly in run: and leave the run export as-is? I'm not sure if those things can be mixed like that.
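To see why the run export alone does not keep 12.6.1 out, here is a rough Python sketch of what a major-only pin (`min_pin='x'`, `max_pin='x'`) amounts to. The helper names are hypothetical, and the exact constraint string conda-build emits may be formatted differently; the point is only that any 12.x release satisfies it.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a dotted version such as '12.6.1' into integer components."""
    return tuple(int(part) for part in version.split("."))


def major_only_pin(name: str, version: str) -> str:
    """Approximate the constraint a major-only pin yields (simplified sketch)."""
    major = parse(version)[0]
    # Lower bound keeps only the major component; upper bound is the next major.
    return f"{name} >={major},<{major + 1}.0a0"


def satisfies_major_pin(candidate: str, built_against: str) -> bool:
    """True if `candidate` shares the major version of `built_against`."""
    return parse(candidate)[0] == parse(built_against)[0]


if __name__ == "__main__":
    print(major_only_pin("cuda-python", "12.6.0"))
    # 12.6.1 is still allowed by a major-only pin built against 12.6.0.
    print(satisfies_major_pin("12.6.1", "12.6.0"))
```

Under this model, an explicit exclusion in `run:` is needed on top of the run export for the bad release to be rejected.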

@jameslamb jameslamb removed the request for review from raydouglass November 6, 2024 15:44
@leofang
Member

leofang commented Nov 6, 2024

> Think we probably need to ignore run exports from cuda-python and make the run: dependency explicit?
>
> Or maybe we can just add the != pins explicitly in run: and leave the run export as-is? I'm not sure if those things can be mixed like that.

My guess is you just need a wildcard, like cuda-python !=11.8.4.*
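As a rough illustration of how the two spellings could differ, the sketch below uses a simplified prefix-matching model. It is not conda's matcher, and `is_excluded` is a hypothetical helper; it only shows that a trailing `.*` would also reject versions that extend the excluded one.

```python
def is_excluded(version: str, spec: str) -> bool:
    """Return True if `version` is rejected by a single '!=' spec (simplified)."""
    if not spec.startswith("!="):
        raise ValueError("only '!=' specs are handled in this sketch")
    pattern = spec[2:]
    if pattern.endswith(".*"):
        prefix = pattern[:-2]
        # Wildcard form: reject the bare version and anything extending it.
        return version == prefix or version.startswith(prefix + ".")
    # Exact form: reject only an identical version string.
    return version == pattern


if __name__ == "__main__":
    for candidate in ("11.8.3", "11.8.4", "11.8.4.1"):
        print(
            candidate,
            is_excluded(candidate, "!=11.8.4"),
            is_excluded(candidate, "!=11.8.4.*"),
        )
```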

@jameslamb
Member

With the updated pins, conda build jobs are now getting cuda-python=12.6.0 at runtime, as we'd expect. But they're failing import tests:

Traceback (most recent call last):
  File "/opt/conda/conda-bld/test_tmp/run_test.py", line 2, in <module>
    import rmm
ModuleNotFoundError: No module named 'rmm'

(build link)

I strongly suspect that there's some other import error just not making its way to the logs. Investigating.
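One way to check that suspicion is to separate "the module is not on the path" from "the module was found but failed while importing". The sketch below uses only the standard library; `diagnose_import` is a hypothetical helper, not part of the rmm test suite, and it is meant for top-level module names only.

```python
import importlib
import importlib.util
import traceback


def diagnose_import(module_name: str) -> str:
    """Report whether a top-level module is missing or fails during import."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        # Nothing on sys.path provides this module at all.
        return f"{module_name}: not found on sys.path"
    try:
        importlib.import_module(module_name)
    except Exception:
        # The module exists, so show the real failure with its full traceback.
        return (
            f"{module_name}: found at {spec.origin} but failed to import\n"
            f"{traceback.format_exc()}"
        )
    return f"{module_name}: imported from {spec.origin}"


if __name__ == "__main__":
    print(diagnose_import("rmm"))
```

Run inside the failing test environment, the first branch would point at a packaging or path problem rather than a hidden import error.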


Wheel tests are also now failing: a new deprecation warning from cuda-python=11.8.5 (uploaded to PyPI about 10 hours ago) is treated as an error in CI.

E DeprecationWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead

(build link)

I'll push a fix to ignore that (for CI purposes) for now, and open an issue about updating if we don't already have one.
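A minimal sketch of that kind of filter, using the standard `warnings` module: warnings are errors by default, and one targeted `ignore` entry is layered on top. The `emit_cudart_deprecation` function is a stand-in for importing `cuda.cudart`, and this is not necessarily how the fix in this PR is written.

```python
import warnings

MESSAGE = (
    "The cuda.cudart module is deprecated and will be removed in a future "
    "release, please switch to use the cuda.bindings.runtime module instead"
)


def emit_cudart_deprecation() -> None:
    """Stand-in for importing cuda.cudart (assumption for illustration)."""
    warnings.warn(MESSAGE, DeprecationWarning, stacklevel=2)


def run_with_filters() -> bool:
    """Return True if the stand-in runs without the warning becoming an error."""
    with warnings.catch_warnings():
        # Mimic CI: every warning is promoted to an exception.
        warnings.simplefilter("error")
        # Later filters take precedence, so this ignore wins for matching messages.
        warnings.filterwarnings(
            "ignore",
            message=r"The cuda\.cudart module is deprecated",
            category=DeprecationWarning,
        )
        emit_cudart_deprecation()
    return True


if __name__ == "__main__":
    print(run_with_filters())
```

The `message` argument is a regular expression matched against the start of the warning text, which is why the dot is escaped.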

@jameslamb
Member

Looking more closely... looks like all Python 3.10 / 3.11 conda-python-build jobs are failing, but all Python 3.12 conda-python-build jobs are passing. They all appear to be getting the same versions of cuda-python (11.8.3=*_3 for the CUDA 11 jobs, 12.6.0=*_2 for the CUDA 12 jobs).

@jameslamb
Member

I can reproduce the conda build failure locally (on an x86_64 machine):

docker run \
    --rm \
    -v $(pwd):/opt/work \
    -w /opt/work \
    --env CMAKE_GENERATOR=Ninja \
    --env RAPIDS_PACKAGE_VERSION=24.12.00a24 \
    --env RAPIDS_BUILD_TYPE=nightly \
    --env RAPIDS_REPOSITORY=rapidsai/rmm \
    --env RAPIDS_REF_NAME=branch-24.12 \
    --env RAPIDS_SHA=dbae8c0 \
    --env RAPIDS_NIGHTLY_DATE=2024-11-05 \
    -it rapidsai/ci-conda:cuda11.8.0-rockylinux8-py3.10 \
    bash

source rapids-date-string
CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
conda mambabuild \
    --channel "${CPP_CHANNEL}" \
    conda/recipes/rmm

Tried instead running that build with --no-test and then installing the package.

conda mambabuild \
    --no-test \
    --channel "${CPP_CHANNEL}" \
    conda/recipes/rmm

conda install \
    --channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" \
    rmm="${RAPIDS_PACKAGE_VERSION}"

That's definitely picking up the package just built locally...

...
  rmm                conda-bld-output/linux-64::rmm-24.12.00a24-cuda11_py310_241106_g84765d34_100
...

... but I can't reproduce the import error.

python -c "import rmm; print(rmm.__git_commit__)"
# 84765d347813b0296ed66daf81cf33ad1639d46a

So I'm thinking it has to be something specific to the test environment conda-build is creating.
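To compare the two environments, a small report like the following could be run both in the manually created environment and in conda-build's test environment. It is only a sketch with a hypothetical `environment_report` helper; a difference in interpreter version or site-packages location between the two would be a lead.

```python
import sys
import sysconfig


def environment_report() -> dict:
    """Collect interpreter details that commonly differ between environments."""
    return {
        "executable": sys.executable,
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        "prefix": sys.prefix,
        # Directory where pure-Python packages are installed for this interpreter.
        "site_packages": sysconfig.get_paths()["purelib"],
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```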

@jameslamb
Member

blegh there are even more deprecation warnings causing the wheel-tests jobs to fail

E DeprecationWarning: The cuda.cuda module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.driver module instead.

(build link)

Notice that's about cuda.cuda; the one in #1720 (comment) was about cuda.cudart. Pushed a commit that broadens the warning suppression: 641a866
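A broader filter could look roughly like the sketch below, where one regular expression covers both module names while unrelated deprecation warnings still fail. The pattern and the `raises_under_strict_filters` helper are assumptions for illustration; the contents of 641a866 are not reproduced here.

```python
import warnings

# One pattern for both deprecated modules named in the CI logs.
BROAD_PATTERN = r"The cuda\.(cuda|cudart) module is deprecated"


def raises_under_strict_filters(message: str) -> bool:
    """Return True if a DeprecationWarning with `message` becomes an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        warnings.filterwarnings(
            "ignore", message=BROAD_PATTERN, category=DeprecationWarning
        )
        try:
            warnings.warn(message, DeprecationWarning)
        except DeprecationWarning:
            return True
    return False


if __name__ == "__main__":
    print(raises_under_strict_filters("The cuda.cuda module is deprecated"))
    print(raises_under_strict_filters("The cuda.cudart module is deprecated"))
    print(raises_under_strict_filters("some_other_api is deprecated"))
```

Keeping the pattern anchored to the two known messages avoids silencing deprecations that CI should still catch.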

@jameslamb jameslamb self-requested a review November 6, 2024 20:24
@jameslamb
Member

/merge

@rapids-bot rapids-bot bot merged commit d4c0635 into rapidsai:branch-24.12 Nov 6, 2024
58 checks passed
@jakirkham
Member

Thanks all! 🙏

@jameslamb jameslamb mentioned this pull request Nov 7, 2024
3 tasks
rapids-bot bot pushed a commit that referenced this pull request Nov 7, 2024
Follow-up to #1720

Contributes to rapidsai/build-planning#116

That PR used `!=` requirements to skip a particular version of `cuda-python` that `rmm` was incompatible with. A newer version of `cuda-python` (12.6.2 for CUDA 12, 11.8.5 for CUDA 11) was just released, and it also causes some build issues for RAPIDS libraries: rapidsai/cuvs#445 (comment)

To unblock CI across RAPIDS, this proposes **temporarily** switching to ceilings on `rmm`'s `cuda-python` dependency.
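The difference between the two strategies can be sketched in Python. The excluded versions come from #1720; the ceiling values below are assumptions for illustration only (chosen to match the versions the conda jobs were resolving, 12.6.0 and 11.8.3) and are not taken from #1723.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a dotted version such as '12.6.2' into integer components."""
    return tuple(int(part) for part in version.split("."))


# Versions disallowed with != requirements in #1720.
EXCLUDED = {"12.6.1", "11.8.4"}

# Hypothetical ceilings per CUDA major version (assumed, not from #1723).
CEILINGS = {12: "12.6.0", 11: "11.8.3"}


def allowed_by_exclusions(version: str) -> bool:
    """!= requirements only block the versions listed explicitly."""
    return version not in EXCLUDED


def allowed_by_ceiling(version: str) -> bool:
    """A ceiling blocks every release newer than the last known-good one."""
    parsed = parse(version)
    ceiling = CEILINGS.get(parsed[0])
    return ceiling is not None and parsed <= parse(ceiling)


if __name__ == "__main__":
    for candidate in ("12.6.0", "12.6.1", "12.6.2", "11.8.5"):
        print(candidate, allowed_by_exclusions(candidate), allowed_by_ceiling(candidate))
```

In this model a newly published release such as 12.6.2 passes the exclusion list but is held back by the ceiling, which is the behavior the follow-up is after.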

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #1723
Labels
bug Something isn't working conda non-breaking Non-breaking change Python Related to RMM Python API
Projects
Status: Done
4 participants