Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce libraft wheels #2531

Merged
merged 42 commits into from
Jan 16, 2025
Merged

Conversation

jameslamb
Copy link
Member

@jameslamb jameslamb commented Dec 17, 2024

Replaces #2306, contributes to rapidsai/build-planning#33.

Proposes packaging libraft as a wheel, which is then re-used by:

As part of this, also proposes:

  • introducing a new CMake option, RAFT_COMPILE_DYNAMIC_ONLY, to allow building/installing only the dynamic shared library (i.e. skipping the static library)
  • enforcing rapids-cmake's preferred CMake style (introduce libraft wheels #2531 (comment))
  • making wheel-building CI jobs always depend on other wheel-building CI jobs, not tests or *-publish (to reduce end-to-end CI time)

Notes for Reviewers

Benefits of these changes

Wheel contents

libraft:

  • libraft.so (shared library)
  • RAFT headers
  • vendored dependencies (fmt, CCCL, cuco, cute, cutlass)

pylibraft:

  • pylibraft Python / Cython code and compiled Cython extensions

raft-dask:

  • raft-dask Python / Cython code and compiled Cython extension

Dependency Flows

In short.... libraft contains a libraft.so dynamic library and the headers to link against it.

  • Anything that needs to link against RAFT at build time pulls in libraft wheels as a build dependency.
  • Anything that needs RAFT's symbols at runtime pulls it in as a runtime dependency, and calls libraft.load_library().

For more details and some flowcharts, see rapidsai/build-planning#33 (comment)

Size changes (CUDA 12, Python 3.12, x86_64)

wheel num files (before) num files (these PRs) size (before) size (these PRs)
libraft. --- 3169 --- 19M
pylibraft 64 63 11M 1M
raft-dask 29 28 188M 188M
libcugraph --- 1762 --- 903M
pylibcugraph 190 187 901M 2M
cugraph 315 313 899M 3.0M
libcuml --- 1766 --- 289M
cuml 442 --- 517M ---
TOTAL 1,040 7,268 2,516M 1,405M

NOTES: size = compressed, "before" = 2025-01-13 nightlies

how I calculated those (click me)
docker run \
    --rm \
    --network host \
    --env RAPIDS_NIGHTLY_DATE=2025-01-13 \
    --env CUGRAPH_NIGHTLY_SHA=8507cbf63db2f349136b266d3e6e787b189f45a0 \
    --env CUGRAPH_PR="pull-request/4804" \
    --env CUGRAPH_PR_SHA="2ef32eaa006a84c0bd16220bb8e8af34198fbee8" \
    --env CUML_NIGHTLY_SHA=7c715c494dff71274d0fdec774bdee12a7e78827 \
    --env CUML_PR="pull-request/6199" \
    --env CUML_PR_SHA="2ef32eaa006a84c0bd16220bb8e8af34198fbee8" \
    --env RAFT_NIGHTLY_SHA=1b62c4117a35b11ce3c830daae248e32ebf75e3f \
    --env RAFT_PR="pull-request/2531" \
    --env RAFT_PR_SHA="0d6597b08919f2aae8ac268f1a68d6a8fe5beb4e" \
    --env RAPIDS_PY_CUDA_SUFFIX=cu12 \
    --env WHEEL_DIR_BEFORE=/tmp/wheels-before \
    --env WHEEL_DIR_AFTER=/tmp/wheels-after \
    -it rapidsai/ci-wheel:cuda12.5.1-rockylinux8-py3.12 \
    bash

# --- nightly wheels --- #
mkdir -p ./wheels-before

export RAPIDS_BUILD_TYPE=branch
export RAPIDS_REF_NAME="branch-25.02"

# pylibraft
RAPIDS_PY_WHEEL_NAME="pylibraft_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/raft \
RAPIDS_SHA=${RAFT_NIGHTLY_SHA} \
    rapids-download-wheels-from-s3 python ./wheels-before

# raft-dask
RAPIDS_PY_WHEEL_NAME="raft_dask_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/raft \
RAPIDS_SHA=${RAFT_NIGHTLY_SHA} \
    rapids-download-wheels-from-s3 python ./wheels-before

# cugraph
RAPIDS_PY_WHEEL_NAME="cugraph_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cugraph \
RAPIDS_SHA=${CUGRAPH_NIGHTLY_SHA} \
    rapids-download-wheels-from-s3 python ./wheels-before

# pylibcugraph
RAPIDS_PY_WHEEL_NAME="pylibcugraph_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cugraph \
RAPIDS_SHA=${CUGRAPH_NIGHTLY_SHA} \
    rapids-download-wheels-from-s3 python ./wheels-before

# cuml
RAPIDS_PY_WHEEL_NAME="cuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_SHA=${CUML_NIGHTLY_SHA} \
    rapids-download-wheels-from-s3 python ./wheels-before

# --- wheels from CI --- #
mkdir -p ./wheels-after

export RAPIDS_BUILD_TYPE="pull-request"

# libraft
RAPIDS_PY_WHEEL_NAME="libraft_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/raft \
RAPIDS_REF_NAME="${RAFT_PR}" \
RAPIDS_SHA="${RAFT_PR_SHA}" \
    rapids-download-wheels-from-s3 cpp ./wheels-after

# pylibraft
RAPIDS_PY_WHEEL_NAME="pylibraft_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/raft \
RAPIDS_REF_NAME="${RAFT_PR}" \
RAPIDS_SHA="${RAFT_PR_SHA}" \
    rapids-download-wheels-from-s3 python ./wheels-after

# raft-dask
RAPIDS_PY_WHEEL_NAME="raft_dask_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/raft \
RAPIDS_REF_NAME="${RAFT_PR}" \
RAPIDS_SHA="${RAFT_PR_SHA}" \
    rapids-download-wheels-from-s3 python ./wheels-after

# libcugraph
RAPIDS_PY_WHEEL_NAME="libcugraph_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cugraph \
RAPIDS_REF_NAME="${CUGRAPH_PR}" \
RAPIDS_SHA="${CUGRAPH_PR_SHA}" \
    rapids-download-wheels-from-s3 cpp ./wheels-after

# pylibcugraph
RAPIDS_PY_WHEEL_NAME="pylibcugraph_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cugraph \
RAPIDS_REF_NAME="${CUGRAPH_PR}" \
RAPIDS_SHA="${CUGRAPH_PR_SHA}" \
    rapids-download-wheels-from-s3 python ./wheels-after

# cugraph
RAPIDS_PY_WHEEL_NAME="cugraph_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cugraph \
RAPIDS_REF_NAME="${CUGRAPH_PR}" \
RAPIDS_SHA="${CUGRAPH_PR_SHA}" \
    rapids-download-wheels-from-s3 python ./wheels-after

# libcuml
RAPIDS_PY_WHEEL_NAME="libcuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_REF_NAME="${CUML_PR}" \
RAPIDS_SHA="${CUML_PR_SHA}" \
    rapids-download-wheels-from-s3 cpp ./wheels-after

# cuml
RAPIDS_PY_WHEEL_NAME="cuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_REF_NAME="${CUML_PR}" \
RAPIDS_SHA="${CUML_PR_SHA}" \
    rapids-download-wheels-from-s3 python ./wheels-after

pip install pydistcheck
pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    ./wheels-before/*.whl \
| grep -E '^checking|files: | compressed' \
> ./before.txt

# get more exact sizes
du -sh ./wheels-before/*

pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    ./wheels-after/*.whl \
| grep -E '^checking|files: | compressed' \
> ./after.txt

# get more exact sizes
du -sh ./wheels-after/*

How I tested this

These other PRs:

@jameslamb jameslamb added 5 - DO NOT MERGE Hold off on merging; see PR for details improvement Improvement / enhancement to an existing function non-breaking Non-breaking change 2 - In Progress Currenty a work in progress labels Dec 17, 2024

This comment was marked as resolved.

@jameslamb jameslamb changed the title WIP: introduce libraft wheels WIP: [DO NOT MERGE] introduce libraft wheels Dec 17, 2024
Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good. Some things to look into briefly before we finalize. I would also cross-check with #2264 to make sure that nothing was missed here.

- wheel-build-pylibraft
- wheel-tests-pylibraft
- wheel-build-raft-dask
- wheel-tests-raft-dask
- devcontainer
# - devcontainer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Put this back before merging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed 93f39fb restoring the devcontainer builds.

They'll still fail because the manifest in devcontainers needs to be updated. Summarizing plan we agreed to offline:

  1. get devcontainers PR passing CI, pointed at this branch from my fork (add libraft wheels devcontainers#438)
  2. admin-merge this PR
  3. point that devcontainers PR back at rapidsai/raft, observe it passing CI, merge it normally

.github/workflows/pr.yaml Show resolved Hide resolved
;;
esac

export SKBUILD_CMAKE_ARGS="-DDETECT_CONDA_ENV=OFF${EXTRA_CMAKE_ARGS}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I don't think we need to be setting this explicitly almost ever. All our wheel builds occur in environments where we don't have a conda environment. I'd prefer to remove this altogether.

If we did keep it though, minor nit: I would prefer to have an explicit semicolon here rather than embedding it in each incantation of EXTRA_CMAKE_ARGS above.

Suggested change
export SKBUILD_CMAKE_ARGS="-DDETECT_CONDA_ENV=OFF${EXTRA_CMAKE_ARGS}"
export SKBUILD_CMAKE_ARGS="${EXTRA_CMAKE_ARGS}"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that's fine, I can try removing it. Both of those patterns (setting -DDETECT_CONDA_ENV and the placement of the semicolon) are used all over RAPIDS wheel-building CI scripts (not newly introduced in this PR), so if that works here and we likes it I'll make other PRs across RAPIDS changing that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool thanks. I'll defer to you on when and how you want to propagate these changes through the rest of RAPIDS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when and how you want to propagate these changes through the rest of RAPIDS

Put up this tracking issue: rapidsai/build-planning#136

I'm going to try to do that in the next day or 2, to try to get it into 25.02.

wheel-build-pylibraft:
needs: wheel-publish-libraft
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job downloads from s3, so really this can happen after the libraft build is complete and doesn't have to wait for publish, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point!!!

I'll change that here, and put up PRs across RAPIDS with changes like that.

Across RAPIDS, we have wheel-build-* depending on wheel-publish-* of its dependencies.

https://github.com/rapidsai/cudf/blob/e272f1e60978801809945c59240dfb02f0aa3b07/.github/workflows/build.yaml#L92-L93

https://github.com/rapidsai/cugraph/blob/6b5db94d9b6ebee41577222d230a36675d3e3ff4/.github/workflows/build.yaml#L90-L91

That extra parallelism should reduce the end-to-end time for publishing nightlies 😁

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice thanks! I'll let you handle doing this elsewhere as time permits.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No prob, no prob. I'm gonna wait until we've finished rapidsai/cugraph#4804, in case that uncovers other this-should-change-across-all-of-RAPIDS things, then do a sweep of 1 PR per repo with these changes where necessary.

I'm keeping a list, so far I have:

  • nightly / branch builds depending on *-build not *-publish
  • PR builds depending on *-build not *-tests
  • removing -DDETECT_CONDA_ENV=OFF from wheel-building scripts
  • enforcing rapids-cmake's preferred CMake formatting (introduce libraft wheels #2531 (comment))

ci/build_wheel_pylibraft.sh Outdated Show resolved Hide resolved
ci/validate_wheel.sh Outdated Show resolved Hide resolved
cpp/CMakeLists.txt Show resolved Hide resolved
python/libraft/CMakeLists.txt Show resolved Hide resolved
@jameslamb
Copy link
Member Author

😭 one of the earlier commits on this is unverified, so I'm gonna have to trigger CI with comments. Sorry for the notification noise, idk what happened there.

Screenshot 2025-01-15 at 11 11 08 AM

@jameslamb
Copy link
Member Author

/ok to test

@vyasr
Copy link
Contributor

vyasr commented Jan 15, 2025

😭 one of the earlier commits on this is unverified, so I'm gonna have to trigger CI with comments. Sorry for the notification noise, idk what happened there.

https://git-scm.com/docs/git-rebase#Documentation/git-rebase.txt---signoff

@bdice
Copy link
Contributor

bdice commented Jan 16, 2025

Approved additional changes since my last review.

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of threads unresolved that require some follow-up work from you, but otherwise this PR LGTM now pending the devcontainer updates.

@jameslamb
Copy link
Member Author

/ok to test

@jameslamb jameslamb removed 2 - In Progress Currenty a work in progress 5 - DO NOT MERGE Hold off on merging; see PR for details labels Jan 16, 2025
@jameslamb jameslamb removed the request for review from KyleFromNVIDIA January 16, 2025 22:28
Copy link
Member

@benfred benfred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Comment on lines +48 to +49
# To use a different ucxx locally, set the CMake variable
# CPM_ucxx_SOURCE=/path/to/local/ucxx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for fixing - looks like a cut and paste error

@raydouglass raydouglass merged commit 8299f17 into rapidsai:branch-25.02 Jan 16, 2025
76 of 78 checks passed
bdice added a commit to rapidsai/devcontainers that referenced this pull request Jan 17, 2025
Pulling just the RAFT-related changes off of #438

Complements rapidsai/raft#2531

## Notes for Reviewers

Proposing:

1. see this pass CI pointed at my fork (branch from
rapidsai/raft#2531)
2. admin-merge rapidsai/raft#2531
3. point this back at upstream `rapidsai/raft` and see it pass CI
4. merge

---------

Co-authored-by: Bradley Dice <[email protected]>
@jameslamb jameslamb deleted the libraft-wheels branch January 17, 2025 17:09
rapids-bot bot pushed a commit to rapidsai/cugraph that referenced this pull request Jan 18, 2025
Replaces #4340, contributes to rapidsai/build-planning#33.

Proposes packaging `libcugraph` as a wheel, which is then re-used by `cugraph-cu{11,12}` and `pylibcugraph-cu{11,12}` wheels.

## Notes for Reviewers

### Benefits of these changes

* smaller wheels (see "Size Changes" below)
  - *no more `pylibcugraph` and `cugraph` both holding copies of libcugraph.so*
* faster compile times
  - *no more re-compiling RAFT, thanks to rapidsai/raft#2531
  - *no more recompiling libcugraph.so in both `pylibcugraph` and `cugraph` wheel builds*
* other benefits mentioned in rapidsai/build-planning#33

### Wheel contents

`libcugraph`:

* `libcugraph.so` (shared library)
* cuGraph headers
* vendored dependencies (`fmt`, `spdlog`, CCCL, `cuco`)

`pylibcugraph`:

* `pylibcugraph` Python / Cython code and compiled Cython extensions

`cugraph`:

* `cugraph` Python / Cython code and compiled Cython extension

### Dependency Flows

In short.... `libcugraph` contains `libcugraph.so` and `libcugraph_c.so` dynamic libraries and the headers to link against it.

* Anything that needs to link against cuGraph at build time pulls in `libcugraph` wheels as a build dependency.
* Anything that needs cuGraph's symbols at runtime pulls it in as a runtime dependency, and calls `libcugraph.load_library()`.

For more details and some flowcharts, see rapidsai/build-planning#33 (comment)

### Size changes (CUDA 12, Python 3.12, x86_64)

| wheel                | num files (before) | num files (this PR) | size (before)  | size (this PR) |
|:---------------:|------------------:|-----------------:|--------------:|-------------:|
| `libcugraph`     |   ---                       |  1762                     | ---                   | 903M       |
| `pylibcugraph` |  190                      |   187                       | 901M              | 2M                 |
| `cugraph`         |  315                      |   313                      | 899M              | 3M                 |
|**TOTAL**          |   **505**         |   **2,262**               | **1,800M**                 | **908M**    |

*NOTES: size = compressed, "before" = 2025-01-13 nightlies*

*This is a cuGraph-specific slice of the table from rapidsai/raft#2531. See that PR for details.*

### How I tested this

These other PRs:

* rapidsai/devcontainers#435
* rapidsai/cugraph-gnn#110

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Ralph Liu (https://github.com/nv-rliu)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Bradley Dice (https://github.com/bdice)

URL: #4804
rapids-bot bot pushed a commit that referenced this pull request Jan 22, 2025
Contributes to rapidsai/build-planning#137

Follow-up to #2531 .

See the linked issue for many more details, but in short... using a dynamically-loaded libraft which has statically-linked cuBLAS causes issues for other libraries.

There are now aarch64 CUDA 11 wheels for cuBLAS and other CUDA libraries, so it's possible to have RAFT wheels dynamically link against them. This PR does that.

## Notes for Reviewers

This has other side benefits in addition to fixing runtime issues... it also simplifies the wheel-building scripts and CMake, and makes CUDA 11 wheels noticeably smaller 😊

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #2548
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci CMake cpp improvement Improvement / enhancement to an existing function non-breaking Non-breaking change python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants