Add support for Python 3.12 #719

Merged on Sep 16, 2024 (7 commits)

@jameslamb jameslamb commented Sep 9, 2024

Description

Contributes to rapidsai/build-planning#40

This PR adds support for Python 3.12.

Notes for Reviewers

This is part of ongoing work to add Python 3.12 support across RAPIDS.
It temporarily introduces a build/test matrix including Python 3.12, from rapidsai/shared-workflows#213.

A follow-up PR will revert back to pointing at the branch-24.10 branch of shared-workflows once all
RAPIDS repos have added Python 3.12 support.

CI here is expected to fail until all of this project's upstream dependencies have been updated to support Python 3.12.

Blocked by:

This can be merged whenever all CI jobs are passing.

@bdice bdice marked this pull request as ready for review September 16, 2024 05:20
@bdice bdice requested a review from a team as a code owner September 16, 2024 05:20
@bdice bdice self-requested a review September 16, 2024 05:20
@bdice bdice changed the title WIP: Add support for Python 3.12 Add support for Python 3.12 Sep 16, 2024
bdice commented Sep 16, 2024

CI will fail here until we build and publish new rapids metapackages. I will request an admin merge once builds complete.

For the future we could do any of the following:

  • not change the shared-workflows tag on the environment test job when adding new platforms (leave it as branch-24.10)
  • consider making rapids metapackages noarch (or compatible with a range of Python versions)
  • not run the environment test job on PRs

@jameslamb

It's been 2.5 hours since I restarted the build jobs here: https://github.com/rapidsai/integration/actions/runs/10878016199/job/30200340481?pr=719

Those jobs typically take more like 15-45 minutes (example recent successful build).

I just pushed a new commit restarting all of them... I suspect there may be underlying issues on the runner, as some of those jobs did not have any logs visible in the UI (which in the past has been a sign that the runner was out of memory).

@jameslamb jameslamb requested a review from a team as a code owner September 16, 2024 19:58
# * https://github.com/rapidsai/build-planning/issues/56
# * https://github.com/rapidsai/cuspatial/pull/1453#issuecomment-2335527542
- pyogrio <0.8
- tiledb <2.19

@jameslamb jameslamb Sep 16, 2024

It looks to me like it's taking FOREVER (2+ hours) for conda to solve the runtime environment for the rapids package on Python 3.12.

I strongly suspect that the root cause is "RAPIDS is pinned to an old range of fmt / spdlog versions that causes the solver to have to backtrack a lot to find older versions of spatial dependencies".

This is the same issue we saw in rapidsai/cuspatial#1453 (comment).

I found that when I explicitly pin pyogrio and tiledb to versions that were successfully found in recent cuspatial conda test jobs (build link), I can get a successful solve in just a few minutes.

I think doing that here should be ok... this is just helping the solver get to a workable solution faster, and it could be removed once we update the fmt / spdlog pins across RAPIDS (which is in-progress in rapidsai/build-planning#56 and will hopefully be done within the 24.10 release).

Tried the following, pinning a bunch of RAPIDS libraries to their latest nightly versions (to avoid backtracking to other pre-Python-3.12 nightlies), on an x86_64 machine with CUDA 12.2.

conda create \
    --name delete-me \
    --dry-run \
    -v \
    --override-channels \
    --channel rapidsai \
    --channel rapidsai-nightly \
    --channel conda-forge \
    --channel nvidia \
        'cuda-version=12.2' \
        'cudf==24.10.00a320' \
        'cugraph==24.10.00a74' \
        'cuml==24.10.00a55' \
        'cuproj==24.10.00a41' \
        'cupy==13.3.0' \
        'cuspatial==24.10.00a41' \
        'cuxfilter==24.10.00a19' \
        'dask-cudf==24.10.00a320' \
        'libcudf==24.10.00a320' \
        'libraft==24.10.00a32' \
        'libraft-headers==24.10.00a32' \
        'librmm==24.10.00a38' \
        'nccl==2.22.3.1' \
        'numba==0.60.0' \
        'numpy==1.26.4' \
        'nvtx==0.2.10' \
        'pylibcudf==24.10.00a320' \
        'pylibraft==24.10.00a32' \
        'pynvjitlink==0.3.0' \
        'pyogrio<0.8' \
        'python=3.12' \
        'rmm==24.10.00a38' \
        'tiledb<2.19' \
        'conda-forge::ucx==1.17.0' \
        'ucx-py==0.40.00a12'
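
For illustration only (this is not from the PR): a minimal Python sketch of assembling a dry-run solve like the one above from a list of pins, so pins can be added or dropped while experimenting. The helper name `build_dry_run_command` and the shortened spec list are assumptions; the flags are the ones used in the command above.

```python
import shlex


def build_dry_run_command(env_name, channels, specs):
    """Return the argv for a `conda create --dry-run` solve with explicit pins.

    Hypothetical helper for illustration; it only builds the command and
    does not run conda.
    """
    cmd = ["conda", "create", "--name", env_name, "--dry-run", "--override-channels"]
    for channel in channels:
        cmd += ["--channel", channel]
    # Sort the specs so the generated command is stable across runs.
    cmd += sorted(specs)
    return cmd


channels = ["rapidsai", "rapidsai-nightly", "conda-forge", "nvidia"]
# A shortened subset of the pins from the command above.
specs = [
    "python=3.12",
    "cuda-version=12.2",
    "cudf==24.10.00a320",
    "pyogrio<0.8",
    "tiledb<2.19",
]

command = build_dry_run_command("delete-me", channels, specs)
# shlex.join quotes specs containing shell metacharacters such as '<'.
print(shlex.join(command))
```

The resulting list could be passed to `subprocess.run` and timed to compare solves with and without a given pin.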


@jameslamb jameslamb Sep 16, 2024

With just the tiledb + pyogrio pins alone here, I still saw 3 of the 4 Python 3.12 builds not making progress after 45 minutes. Just pushed 57edada putting floors on all the RAPIDS libraries, to see if preventing the solver from falling back to earlier nightlies helps.

It should cut the search space significantly, I hope... there have been 320 cudf releases within the 24.10 cycle so far, for example.
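
As a toy illustration of why a floor shrinks the candidate set (this is not how conda's solver works internally): the sketch below assumes, hypothetically, that one nightly was published for each alpha number from 1 to 320, and counts how many candidates survive a floor. The helper names are made up for this example.

```python
import re


def alpha_number(version):
    """Extract the alpha counter from a string like '24.10.00a320'."""
    match = re.fullmatch(r"24\.10\.00a(\d+)", version)
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    return int(match.group(1))


def candidates_at_or_above(versions, floor):
    """Keep only the versions whose alpha counter is >= the floor's."""
    floor_n = alpha_number(floor)
    return [v for v in versions if alpha_number(v) >= floor_n]


# Assumption for illustration: one nightly per alpha number, 1 through 320.
all_nightlies = [f"24.10.00a{n}" for n in range(1, 321)]
remaining = candidates_at_or_above(all_nightlies, "24.10.00a320")

print(len(all_nightlies), "->", len(remaining))  # 320 -> 1
```

With several libraries each offering many nightlies, the combinations a solver may need to consider multiply, which is why flooring each one can matter.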

@jameslamb

Looks like the combination of those two changes helped!

conda was able to solve the runtime environment for all the environments we test here: https://github.com/rapidsai/integration/actions/runs/10891738111/job/30223298780?pr=719.

In a manageable amount of time:

  • (amd64, 3.12, 11.8.0, ubuntu22.04) = 30m33s
  • (amd64, 3.12, 12.5.1, ubuntu22.04) = 17m36s
  • (arm64, 3.12, 11.8.0, ubuntu22.04) = 23m08s
  • (arm64, 3.12, 12.5.1, ubuntu22.04) = 19m39s
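
A small sketch (not part of the PR) that converts the durations listed above to seconds and checks them against the 15-45 minute range mentioned earlier in this thread as typical. The function and variable names are hypothetical.

```python
import re


def to_seconds(duration):
    """Convert a duration like '30m33s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", duration)
    if match is None:
        raise ValueError(f"unexpected duration format: {duration!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


# (architecture, CUDA version) -> duration, copied from the list above.
reported = {
    ("amd64", "11.8.0"): "30m33s",
    ("amd64", "12.5.1"): "17m36s",
    ("arm64", "11.8.0"): "23m08s",
    ("arm64", "12.5.1"): "19m39s",
}

seconds = {key: to_seconds(value) for key, value in reported.items()}
slowest = max(seconds, key=seconds.get)
mean_seconds = sum(seconds.values()) / len(seconds)
# 45 minutes is the upper end of the "typical" range quoted earlier.
all_within_typical = all(s <= 45 * 60 for s in seconds.values())

print(slowest, mean_seconds, all_within_typical)
```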


@bdice bdice left a comment

It surprises me that we need floors on specific alphas, but as described above, it seems necessary until we update fmt/spdlog pinnings. I would accept this for now, we just have to get it fixed up before code freeze.

@jameslamb

> I would accept this for now, we just have to get it fixed up before code freeze.

Thanks! I've added a tasklist item in rapidsai/build-planning#56 to be sure we remember to revert these pins as part of the fmt / spdlog effort.

I'll go ask for an admin merge, per #719 (comment).

@raydouglass raydouglass merged commit 77b6660 into rapidsai:branch-24.10 Sep 16, 2024
24 of 28 checks passed
@jameslamb jameslamb removed the 2 - In Progress Currenty a work in progress label Sep 16, 2024
@jameslamb jameslamb deleted the python-3.12 branch September 16, 2024 22:05
rapids-bot bot pushed a commit to rapidsai/docker that referenced this pull request Sep 17, 2024
…711)

Contributes to rapidsai/build-planning#40.

* adds Python 3.12 images
* defaults to latest Python (3.12) and CUDA (12.5[.1]) in docs and comments

## Notes for Reviewers

Builds here will fail until all RAPIDS libraries support Python 3.12, but I figured we don't need to wait on that to come to an agreement about the build and test matrices.

Blocked by:

* [x] rapidsai/cuml#6060
* [x] rapidsai/cugraph#4647
* [x] rapidsai/integration#719

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #711
rapids-bot bot pushed a commit that referenced this pull request Sep 24, 2024
Contributes to rapidsai/build-planning#56

With rapidsai/cuspatial#1441, it should be possible to revert some of the workarounds introduced in #719.

## Notes for Reviewers

### How to test this

If this is working, we should see the following in the conda solves:

* `fmt >=11.0.2`
* `spdlog >=1.14.1`

We won't see `numpy >=2` yet, because `cugraph` doesn't support it yet (rapidsai/cugraph#4615).
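
One way such a check could be sketched (an illustration, not part of the commit): given a mapping of package names to solved versions, report any package that is missing or below its floor. The `solved` mapping here is hypothetical sample data, not real solver output, and the version parsing assumes plain dotted integers.

```python
def parse_version(text):
    """Parse a dotted numeric version like '11.0.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def below_floor(solved, floors):
    """Return the names of packages that are missing or below their floor."""
    failures = []
    for name, floor in floors.items():
        version = solved.get(name)
        if version is None or parse_version(version) < parse_version(floor):
            failures.append(name)
    return failures


# Floors taken from the list above.
floors = {"fmt": "11.0.2", "spdlog": "1.14.1"}

# Hypothetical solve result, for illustration only.
solved = {"fmt": "11.0.2", "spdlog": "1.14.1", "numpy": "1.26.4"}

print(below_floor(solved, floors))  # []
```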

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #722