-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize K-Truss #4375
Optimize K-Truss #4375
Conversation
…6_optimize-k-truss
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So much simpler. A few cleanup comments, otherwise I think it's looking pretty good.
@@ -136,7 +136,7 @@ edge_property_t<graph_view_t<vertex_t, edge_t, false, multi_gpu>, edge_t> edge_t | |||
auto edge_first = thrust::make_zip_iterator(edgelist_srcs.begin(), edgelist_dsts.begin()); | |||
|
|||
size_t edges_to_intersect_per_iteration = | |||
static_cast<size_t>(handle.get_device_properties().multiProcessorCount) * (1 << 17); | |||
static_cast<size_t>(handle.get_device_properties().multiProcessorCount) * (1 << 13); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for 13? Should this be some sort of named constant so it's easier to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a good idea to expose it in order to easily control the chunk size. I temporarily set a lower value so that I can process a larger graph when running at scale
@@ -0,0 +1,41 @@ | |||
/* | |||
* Copyright (c) 2023-2024, NVIDIA CORPORATION. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just 2024
@@ -0,0 +1,41 @@ | |||
/* | |||
* Copyright (c) 2023-2024, NVIDIA CORPORATION. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just 2024
cpp/src/community/k_truss_impl.cuh
Outdated
|
||
std::tie(edgelist_srcs, edgelist_dsts, edgelist_wgts, num_triangles, std::ignore) = | ||
decompress_to_edgelist( | ||
std::chrono::seconds s (0); // 1 second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two thoughts...
- Have you looked at https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/include/cugraph/utilities/high_res_timer.hpp which I think you could just include and use rather than reinventing. There are features to manage multiple timers inside that as well
- Probably should remove these before we merge the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only wanted to time the triangle count part and I got a compiler error when copying the high_res_timer
timing code from our tests so I didn't spend much time understanding what was wrong. But I will look into it again.
Probably should remove these before we merge the PR
I only have them for benchmarking on Draco. I will remove them in my next commit today along with all the unused functors
struct KTruss_Usecase { | ||
int32_t k_{3}; | ||
bool test_weighted_{false}; | ||
// FIXME: test edge mask |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we are skipping this? (edge mask)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I am not. I just forgot to remove this outdated comment
mg_graph_view.vertex_partition_range_lasts()); | ||
|
||
auto global_d_cugraph_srcs = cugraph::test::device_gatherv( | ||
*handle_, raft::device_span<vertex_t const>(d_cugraph_srcs.data(), d_cugraph_srcs.size())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for additional nitpicking, but it might be better to follow naming convention of K-core tests (https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/tests/cores/mg_k_core_test.cpp#L136). Later, if this pattern occurs frequent enough, we can create another utility funciton.
@@ -497,7 +132,7 @@ k_truss(raft::handle_t const& handle, | |||
exclude_self_loop_t<vertex_t>{}); | |||
|
|||
if constexpr (multi_gpu) { | |||
std::tie(srcs, dsts, std::ignore, std::ignore, std::ignore) = | |||
std::tie(srcs, dsts, std::ignore, std::ignore, std::ignore, std::ignore) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can defer this to the next MG optimization PR, but you can use masking here instead of creating a new graph for excluding self-loops (see the example here. https://github.com/rapidsai/cugraph/blob/branch-24.08/cpp/src/community/triangle_count_impl.cuh#L351).
cpp/src/community/k_truss_impl.cuh
Outdated
@@ -656,265 +291,53 @@ k_truss(raft::handle_t const& handle, | |||
*vertex_partition_range_lasts); | |||
} | |||
renumber_map = std::move(tmp_renumber_map); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use masking instead of creating a graph except for the final DODG graph creation (this DODG graph will be used again & again so better create a new graph than using masking as masking has its own overhead).
while (true) { | ||
|
||
auto edge_triangle_counts = | ||
edge_triangle_count<vertex_t, edge_t, multi_gpu>(handle, cur_graph_view); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a FIXME statement here.
This approach will miserably fail if we need to go through many iterations and only a small number of edges are invalidated in each iteration. No need to address this if this doesn't happen for any practical graphs but if we find one, we can work on optimization focusing on that graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…6_optimize-k-truss
d96081c
to
3743208
Compare
/merge |
This PR leverages our existing edge triangle count to implement both SG and MG K-Truss as our initial version.
This PR also:
rx count
in several as our shuffling functionsCloses #4500