-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Support Seed Retention for Sampling with Renumbering #4355
[FEA] Support Seed Retention for Sampling with Renumbering #4355
Conversation
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…to bug_post_processing
…i-nv/cugraph into support-seed-retention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Shouldn't we add C++ test for the added option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (C/C++ part)
This is not an added C++ option, it already existed. It was never exposed/used through the C API. I believe (didn't exhaustively look) that it is already tested in |
/merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just reviewed the python changes. I had one question and one optional suggestion, but neither should hold up approval.
label_offsets: integer sequence, optional (default=None) | ||
Offsets of each label within the start vertex list. | ||
Only used if retain_seeds is True. Required if retain_seeds | ||
is True. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a suggestion: maybe change this so that if this is specified, the retain_seeds=True
behavior is assumed and then drop the retain_seeds
arg completely? If you decide to do that, then be sure to also include the description of the retain_seeds
behavior in this description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a reasonable suggestion. I'll add that to the new PR.
@@ -342,13 +353,15 @@ def uniform_neighbor_sample( | |||
else None, | |||
h_fan_out=fanout_vals, | |||
with_replacement=with_replacement, | |||
do_expensive_check=False, | |||
do_expensive_check=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not intentional, whoops. We need to fix that.
45371cb
into
rapidsai:branch-24.06
…ample (#4421) #4355 accidentally added an expensive check to SG uniform neighbor sample. This PR reverts that addition. Authors: - Alex Barghi (https://github.com/alexbarghi-nv) Approvers: - Rick Ratzel (https://github.com/rlratzel) URL: #4421
Distributed sampling in cuGraph-PyG. Also renames the existing API to clarify that it is dask based. Adds a dependency on `tensordict` for `cuGraph-PyG` which supports the new `TensorDictFeatureStore`. Also no longer installs `torch-cluster` and `torch-spline-conv` in CI for testing since that results in an `ImportError` and neither of those packages are needed. Requires PyG 2.5. Should be merged after #4335 Merge after #4355 Closes #4248 Closes #4249 Closes #3383 Closes #3942 Closes #3836 Closes #4202 Closes #4051 Closes #4326 Closes #4252 Partially addresses #3805 Authors: - Alex Barghi (https://github.com/alexbarghi-nv) - Seunghwa Kang (https://github.com/seunghwak) - Tingyu Wang (https://github.com/tingyu66) - Ralph Liu (https://github.com/nv-rliu) Approvers: - Tingyu Wang (https://github.com/tingyu66) - Brad Rees (https://github.com/BradReesWork) - Jake Awe (https://github.com/AyodeAwe) URL: #4384
Exposes the ability to retain seeds even if they have no outgoing edges (and therefore are not sampled). Required to fix the current bug in cuGraph-PyG involving batch size and dropping seeds.
Currently, this functionality can't be exposed through the MG Python API (#4358) but exposing it through the pylibcugraph API is sufficient to resolve this issue. This PR does expose it through the SG Python API.