-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Forward-merge branch-24.12 to branch-25.02 #4785
Merged
raydouglass
merged 8 commits into
rapidsai:branch-25.02
from
bdice:branch-25.02-merge-24.12
Nov 26, 2024
Merged
Forward-merge branch-24.12 to branch-25.02 #4785
raydouglass
merged 8 commits into
rapidsai:branch-25.02
from
bdice:branch-25.02-merge-24.12
Nov 26, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Enables telemetry during cugraph's build process. This parses github job metadata to obtain timing information. It should have very little impact on overall build time, and should not interfere with any build tools. This implements emitting OpenTelemetry traces and spans, as described in rapidsai/build-infra#139 Authors: - Mike Sarahan (https://github.com/msarahan) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#4740
Increase max_iterations in MG HITS tests (with edge masking). With masking, we may end up with different graphs with different numbers of GPUs; this results in higher iteration counts for convergence for certain GPU counts. Increase the maximum iteration count to consider this. Authors: - Seunghwa Kang (https://github.com/seunghwak) Approvers: - Chuck Hastings (https://github.com/ChuckHastings) URL: rapidsai#4783
This PR includes multiple updates to cut peak memory usage in graph creation and improve performance of BFS on scale-free graphs. * Add a bitmap for non-zero local degree vertices in the hypersparse region; this information can be used to quickly filter out locally zero degree vertices which don't need to be processed in multiple instances. * Store (global-)degree offsets for vertices in the hypersparse region; this information can used to quickly identify the vertices with a certain global degree (e.g. for global degree 1 vertices, we can skip inter-GPU reduction as we know each vertex has only one neighbor). * Skip kernel invocations in computing edge counts if the vertex list is empty. * Add asynchronous functions to compute edge counts. This helps in preventing unnecessary serialization when we can process multiple such functions concurrently. * Replace rmm::exec_policy with rmm::exec_policy_nosync in multiple places; the former enforces stream synchronization at the end. The latter does not. * Enforce cache line alignment in NCCL communication in multiple places (NCCL communication performance is significantly affected by cache line alignment, often leading to 30-40% or more differences). * For primitives working on a subset of vertices, broadcast a vertex list using a bitmap if the vertex frontier size is large. If the vertex frontier size is small (in case vertex_t is 8B and the local vertex partition range can fit into 4B), use vertex offsets instead of vertices to cut communication volume. * Merge multiple host scalar communication function calls to a single one. * Increase multi-stream concurrency in detail::extract_transform_e & detail::per_v_transform_reduce_e * Multiple optimizations in template specialization (for update_major == true && reduce_op == any && key type is vertex && working on a subset of vertices) in detail::per_v_transform_reduce_e (this includes pre-processing vertices with non-zero local degrees; so we don't need to process such vertices using multiple GPUs, pre-filtering of zero local degree vertices, allreduce communication to reduce shuffle communication volumes, and special treatment of global degree 1 vertices, and so on). * Multiple optimizations & specializations in detail::fill_edge_minor_property that works on a subset of vertices (this includes kernel fusion, specialization for bitmap properties including direct broadcast to the property buffer and special treatments for vertex partition boundaries, and so on). * Added multiple optimizations & specializations in transform_reduce_v_frontier_outgoing_e (especially for reduce_op::any and to cut communication volumes and to filter out (key, value) pairs that won't contribute to the final results). * Multiple low-level optimizations in direction optimizing BFS (including approximations in determining between bottom -up and top-down). * Multiple optimizations to cut peak memory usage in graph creation. Authors: - Seunghwa Kang (https://github.com/seunghwak) Approvers: - Chuck Hastings (https://github.com/ChuckHastings) URL: rapidsai#4751
cugraph is no longer dependent on cugraph_ops and no longer pulls files from the cugrpahops repo. But we have two `assert` statements still assuming cugrpahops files are available. These assert statements are compiled only in the debug mode. This PR fixes build errors due to these assert statements in the debug build. Closes rapidsai#4763 Authors: - Seunghwa Kang (https://github.com/seunghwak) - Chuck Hastings (https://github.com/ChuckHastings) Approvers: - Chuck Hastings (https://github.com/ChuckHastings) - Joseph Nke (https://github.com/jnke2016) URL: rapidsai#4774
…e`, minor cleanup (rapidsai#4776) * updates READMEs to remove outdated nx-cugraph text * updates `core_number` docs, APIs, tests to properly ignore `degree_type` due to `core_number` not supporting directed graphs which `degree_type` is intended for - `degree_type` settings will be honored when directed graphs are supported. * renames test helper function for clarity * fixes issue with datasets API to properly recreate the edgelist for MG (dask) if previously created for SG. Authors: - Rick Ratzel (https://github.com/rlratzel) Approvers: - Don Acosta (https://github.com/acostadon) - Alex Barghi (https://github.com/alexbarghi-nv) - Brad Rees (https://github.com/BradReesWork) URL: rapidsai#4776
This updates docs. Code changes are here: rapidsai/nx-cugraph#27 See: rapidsai#4760 Authors: - Erik Welch (https://github.com/eriknw) Approvers: - Rick Ratzel (https://github.com/rlratzel) URL: rapidsai#4766
I noticed the `Traversals` table is not showing up in the [nx-cugraph docs](https://docs.rapids.ai/api/cugraph/stable/nx_cugraph/supported-algorithms/). `sphinx-lint` catches the underlying issue, and it also caught a couple other minor issues that this PR fixes. `sphinx-lint` is used by other notable repos such as `pandas` ([here](https://github.com/pandas-dev/pandas/blob/7fe270c8e7656c0c187260677b3b16a17a1281dc/.pre-commit-config.yaml#L92-L96)] and CPython ([here](https://github.com/python/cpython/blob/8fe1926164932f868e6e907ad72a74c2f2372b07/.pre-commit-config.yaml#L68-L73)) Authors: - Erik Welch (https://github.com/eriknw) Approvers: - Jake Awe (https://github.com/AyodeAwe) - Ralph Liu (https://github.com/nv-rliu) - Rick Ratzel (https://github.com/rlratzel) URL: rapidsai#4771
bdice
added
improvement
Improvement / enhancement to an existing function
non-breaking
Non-breaking change
labels
Nov 26, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
CMake
cuGraph
improvement
Improvement / enhancement to an existing function
non-breaking
Non-breaking change
python
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Manual forward merge from 24.12 to 25.02. This PR should not be squashed.
Closes #4782.