[FEA] Set index to _EDGE_ID_ and _VERTEX_ for _vertex_prop_dataframe and _edge_prop_dataframe to make sampling faster #2401
Comments
I'm seeing 10x speedup in my tests by setting the index and using |
Wow, nice! Yup, I expect to start work on this today or Monday.
Currently, this only does SG version for rapidsai#2401. MG is still TODO. This also doesn't change anything user-facing (yet).
~Currently, this only does SG version for #2401. MG is still TODO.~
Closes #2401
This also doesn't change anything user-facing (yet).
Authors:
- Erik Welch (https://github.com/eriknw)
- Alex Barghi (https://github.com/alexbarghi-nv)
Approvers:
- Rick Ratzel (https://github.com/rlratzel)
URL: #2523
Describe the solution you'd like and any additional context
We should set the index to _VERTEX_ for _vertex_prop_dataframe and to _EDGE_ID_ for _edge_prop_dataframe so that fetching rows by ids for sampling is fast.
Motivating example, where we see a 3x speedup for fetching a batch of 50k:
Without Index:
With Index (3x faster):
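A minimal sketch of the two lookup styles being compared, not the snippets from the original report. It uses pandas as a CPU stand-in for cuDF (an assumption, so it runs without a GPU); the `feat` column, the table size, and the variable names are hypothetical, and no timing is claimed here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for PropertyGraph's _vertex_prop_dataframe.
# The "feat" column and the sizes are made up for illustration.
rng = np.random.default_rng(0)
n_vertices = 200_000
vertex_df = pd.DataFrame({
    "_VERTEX_": np.arange(n_vertices),
    "feat": rng.random(n_vertices),
})

# A batch of 50k distinct vertex ids, as a sampler might request.
batch_ids = rng.choice(n_vertices, size=50_000, replace=False)

# Without index: build a boolean mask over the whole _VERTEX_ column.
# Rows come back in the dataframe's original order.
no_index = vertex_df[vertex_df["_VERTEX_"].isin(batch_ids)]

# With index: set the index once, then do a label-based lookup.
# Rows come back in the order of batch_ids.
indexed_df = vertex_df.set_index("_VERTEX_")
with_index = indexed_df.loc[batch_ids]
```

Both lookups return the same 50k rows; only the row order differs. To compare speed on a real setup, wrap each lookup in `timeit` or `%timeit` and substitute cuDF dataframes.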