-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introduce libraft wheels #2531
introduce libraft wheels #2531
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good. Some things to look into briefly before we finalize. I would also cross-check with #2264 to make sure that nothing was missed here.
.github/workflows/pr.yaml
Outdated
- wheel-build-pylibraft | ||
- wheel-tests-pylibraft | ||
- wheel-build-raft-dask | ||
- wheel-tests-raft-dask | ||
- devcontainer | ||
# - devcontainer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Put this back before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed 93f39fb restoring the devcontainer builds.
They'll still fail because the manifest in devcontainers needs to be updated. Summarizing plan we agreed to offline:
- get devcontainers PR passing CI, pointed at this branch from my fork (add libraft wheels devcontainers#438)
- admin-merge this PR
- point that devcontainers PR back at
rapidsai/raft
, observe it passing CI, merge it normally
ci/build_wheel_libraft.sh
Outdated
;; | ||
esac | ||
|
||
export SKBUILD_CMAKE_ARGS="-DDETECT_CONDA_ENV=OFF${EXTRA_CMAKE_ARGS}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly I don't think we need to be setting this explicitly almost ever. All our wheel builds occur in environments where we don't have a conda environment. I'd prefer to remove this altogether.
If we did keep it though, minor nit: I would prefer to have an explicit semicolon here rather than embedding it in each incantation of EXTRA_CMAKE_ARGS
above.
export SKBUILD_CMAKE_ARGS="-DDETECT_CONDA_ENV=OFF${EXTRA_CMAKE_ARGS}" | |
export SKBUILD_CMAKE_ARGS="${EXTRA_CMAKE_ARGS}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that's fine, I can try removing it. Both of those patterns (setting -DDETECT_CONDA_ENV
and the placement of the semicolon) are used all over RAPIDS wheel-building CI scripts (not newly introduced in this PR), so if that works here and we likes it I'll make other PRs across RAPIDS changing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing it worked for me locally. Did that in 1d9e1e9#diff-bf2a37f5233778a838d488772273c05401ae172253b19533bc07e7ca958ba271.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool thanks. I'll defer to you on when and how you want to propagate these changes through the rest of RAPIDS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when and how you want to propagate these changes through the rest of RAPIDS
Put up this tracking issue: rapidsai/build-planning#136
I'm going to try to do that in the next day or 2, to try to get it into 25.02.
.github/workflows/build.yaml
Outdated
wheel-build-pylibraft: | ||
needs: wheel-publish-libraft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The job downloads from s3, so really this can happen after the libraft build is complete and doesn't have to wait for publish, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point!!!
I'll change that here, and put up PRs across RAPIDS with changes like that.
Across RAPIDS, we have wheel-build-*
depending on wheel-publish-*
of its dependencies.
That extra parallelism should reduce the end-to-end time for publishing nightlies 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice thanks! I'll let you handle doing this elsewhere as time permits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No prob, no prob. I'm gonna wait until we've finished rapidsai/cugraph#4804, in case that uncovers other this-should-change-across-all-of-RAPIDS things, then do a sweep of 1 PR per repo with these changes where necessary.
I'm keeping a list, so far I have:
- nightly / branch builds depending on
*-build
not*-publish
- PR builds depending on
*-build
not*-tests
- removing
-DDETECT_CONDA_ENV=OFF
from wheel-building scripts - enforcing
rapids-cmake
's preferred CMake formatting (introduce libraft wheels #2531 (comment))
…-building scripts, add new deps to update-version.sh
/ok to test |
https://git-scm.com/docs/git-rebase#Documentation/git-rebase.txt---signoff |
Approved additional changes since my last review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of threads unresolved that require some follow-up work from you, but otherwise this PR LGTM now pending the devcontainer updates.
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
# To use a different ucxx locally, set the CMake variable | ||
# CPM_ucxx_SOURCE=/path/to/local/ucxx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for fixing - looks like a cut and paste error
Pulling just the RAFT-related changes off of #438 Complements rapidsai/raft#2531 ## Notes for Reviewers Proposing: 1. see this pass CI pointed at my fork (branch from rapidsai/raft#2531) 2. admin-merge rapidsai/raft#2531 3. point this back at upstream `rapidsai/raft` and see it pass CI 4. merge --------- Co-authored-by: Bradley Dice <[email protected]>
Replaces #4340, contributes to rapidsai/build-planning#33. Proposes packaging `libcugraph` as a wheel, which is then re-used by `cugraph-cu{11,12}` and `pylibcugraph-cu{11,12}` wheels. ## Notes for Reviewers ### Benefits of these changes * smaller wheels (see "Size Changes" below) - *no more `pylibcugraph` and `cugraph` both holding copies of libcugraph.so* * faster compile times - *no more re-compiling RAFT, thanks to rapidsai/raft#2531 - *no more recompiling libcugraph.so in both `pylibcugraph` and `cugraph` wheel builds* * other benefits mentioned in rapidsai/build-planning#33 ### Wheel contents `libcugraph`: * `libcugraph.so` (shared library) * cuGraph headers * vendored dependencies (`fmt`, `spdlog`, CCCL, `cuco`) `pylibcugraph`: * `pylibcugraph` Python / Cython code and compiled Cython extensions `cugraph`: * `cugraph` Python / Cython code and compiled Cython extension ### Dependency Flows In short.... `libcugraph` contains `libcugraph.so` and `libcugraph_c.so` dynamic libraries and the headers to link against it. * Anything that needs to link against cuGraph at build time pulls in `libcugraph` wheels as a build dependency. * Anything that needs cuGraph's symbols at runtime pulls it in as a runtime dependency, and calls `libcugraph.load_library()`. For more details and some flowcharts, see rapidsai/build-planning#33 (comment) ### Size changes (CUDA 12, Python 3.12, x86_64) | wheel | num files (before) | num files (this PR) | size (before) | size (this PR) | |:---------------:|------------------:|-----------------:|--------------:|-------------:| | `libcugraph` | --- | 1762 | --- | 903M | | `pylibcugraph` | 190 | 187 | 901M | 2M | | `cugraph` | 315 | 313 | 899M | 3M | |**TOTAL** | **505** | **2,262** | **1,800M** | **908M** | *NOTES: size = compressed, "before" = 2025-01-13 nightlies* *This is a cuGraph-specific slice of the table from rapidsai/raft#2531. See that PR for details.* ### How I tested this These other PRs: * rapidsai/devcontainers#435 * rapidsai/cugraph-gnn#110 Authors: - James Lamb (https://github.com/jameslamb) - Ralph Liu (https://github.com/nv-rliu) - Bradley Dice (https://github.com/bdice) Approvers: - Brad Rees (https://github.com/BradReesWork) - Bradley Dice (https://github.com/bdice) URL: #4804
Contributes to rapidsai/build-planning#137 Follow-up to #2531 . See the linked issue for many more details, but in short... using a dynamically-loaded libraft which has statically-linked cuBLAS causes issues for other libraries. There are now aarch64 CUDA 11 wheels for cuBLAS and other CUDA libraries, so it's possible to have RAFT wheels dynamically link against them. This PR does that. ## Notes for Reviewers This has other side benefits in addition to fixing runtime issues... it also simplifies the wheel-building scripts and CMake, and makes CUDA 11 wheels noticeably smaller 😊 Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #2548
Replaces #2306, contributes to rapidsai/build-planning#33.
Proposes packaging
libraft
as a wheel, which is then re-used by:pylibraft-cu{11,12}
andraft-cu{11,12}
(this PR)libcugraph-cu{11,12}
,pylibcugraph-cu{11,12}
, andcugraph-cu{11,12}
in introduce libcugraph wheels cugraph#4804libcuml-cu{11,12}
andcuml-cu{11,12}
in WIP: [DO NOT MERGE] introduce libcuml wheels cuml#6199As part of this, also proposes:
RAFT_COMPILE_DYNAMIC_ONLY
, to allow building/installing only the dynamic shared library (i.e. skipping the static library)rapids-cmake
's preferred CMake style (introduce libraft wheels #2531 (comment))*-publish
(to reduce end-to-end CI time)Notes for Reviewers
Benefits of these changes
Wheel contents
libraft
:libraft.so
(shared library)fmt
, CCCL,cuco
,cute
,cutlass
)pylibraft
:pylibraft
Python / Cython code and compiled Cython extensionsraft-dask
:raft-dask
Python / Cython code and compiled Cython extensionDependency Flows
In short....
libraft
contains alibraft.so
dynamic library and the headers to link against it.libraft
wheels as a build dependency.libraft.load_library()
.For more details and some flowcharts, see rapidsai/build-planning#33 (comment)
Size changes (CUDA 12, Python 3.12, x86_64)
libraft
.pylibraft
raft-dask
libcugraph
pylibcugraph
cugraph
libcuml
cuml
NOTES: size = compressed, "before" = 2025-01-13 nightlies
how I calculated those (click me)
cugraph
: nightly commit = rapidsai/cugraph@8507cbf, PR = introduce libcugraph wheels cugraph#4804cuml
: nightly commit = rapidsai/cuml@7c715c4, PR = WIP: [DO NOT MERGE] introduce libcuml wheels cuml#6199raft
: nightly commit = 1b62c41, PR = this PRHow I tested this
These other PRs: