Implement eigenvector centrality #2287

Merged
Changes from 2 commits
1 change: 1 addition & 0 deletions .gitignore
@@ -11,6 +11,7 @@ __pycache__
.lock
*.swp
*.pytest_cache
*~
DartConfiguration.tcl
.DS_Store

2 changes: 2 additions & 0 deletions cpp/CMakeLists.txt
@@ -230,6 +230,8 @@ add_library(cugraph
src/link_analysis/pagerank_mg.cu
src/centrality/katz_centrality_sg.cu
src/centrality/katz_centrality_mg.cu
src/centrality/eigenvector_centrality_sg.cu
src/centrality/eigenvector_centrality_mg.cu
src/serialization/serializer.cu
src/tree/mst.cu
src/components/weakly_connected_components_sg.cu
32 changes: 32 additions & 0 deletions cpp/include/cugraph/algorithms.hpp
@@ -1221,6 +1221,38 @@ void pagerank(raft::handle_t const& handle,
bool has_initial_guess = false,
bool do_expensive_check = false);

/**
* @brief Compute Eigenvector Centrality scores.
*
* This function computes eigenvector centrality scores using the power method.
*
* @throws cugraph::logic_error on erroneous input arguments or if the algorithm fails to converge
* before @p max_iterations.
*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam weight_t Type of edge weights. Needs to be a floating point type.
* @tparam multi_gpu Flag indicating whether template instantiation should target single-GPU (false)
* or multi-GPU (true).
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param graph_view Graph view object.
* @param centralities Device span where we should store the eigenvector centralities
Contributor

Can we pass initial values?

Collaborator Author

I will add that support. Missed that.

* @param epsilon Error tolerance to check convergence. Convergence is assumed if the sum of the
* absolute differences in eigenvector centrality values between two consecutive iterations is less
* than the number of vertices in the graph multiplied by @p epsilon.
* @param max_iterations Maximum number of power iterations.
* @param do_expensive_check A flag to run expensive checks for input arguments (if set to `true`).
*/
template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, weight_t, true, multi_gpu> const& graph_view,
raft::device_span<weight_t> centralities,
Contributor

Just for the sake of discussion: what do you think about passing raft::device_span<weight_t> centralities as an argument vs. returning an rmm::device_uvector<weight_t> holding the centrality values?

The former might be more natural when we're passing initial values, and it may reduce memory allocations (e.g. when running PageRank repeatedly with different personalization vectors), although with the rmm pool allocator the allocation overhead might be insignificant. The latter might be more functional.

Collaborator Author

I got the idea of using the span from looking at your new triangle_count implementation. The [in/out] of centralities is more consistent with what we have been doing. Our paradigm thus far has been to specify the output storage a priori if we can know it, and to allocate it dynamically if we can't know it.

What you are suggesting would be a paradigm shift for the API. I'm not opposed to changing the paradigm.

It seems to me the current paradigm has the following advantages:

  • Less memory allocation. The new strategy would require temporarily having an extra vector of length V.
  • The caller can use any memory allocator that they choose to allocate the device memory

The new paradigm would have the following advantages:

  • More functional in nature
  • More consistency (all algorithms would return results the same way, whether the size is predictable or not)

In the grand scheme of memory things, I'm not all that concerned over allocating an extra result array temporarily. It seems to me that the functional feel of the proposed paradigm is useful and consistency in how algorithms behave across the interface is always better.

Collaborator Author

In this case I can certainly change raft::device_span<weight_t> centralities to std::optional<raft::device_span<weight_t>> centralities to support an optional input, and make the return value an rmm::device_uvector<weight_t>.

weight_t epsilon,
size_t max_iterations = 500,
bool do_expensive_check = false);

/**
* @brief Compute HITS scores.
*
13 changes: 0 additions & 13 deletions cpp/include/cugraph/detail/utility_wrappers.hpp
@@ -45,19 +45,6 @@ void uniform_random_fill(rmm::cuda_stream_view const& stream_view,
value_t max_value,
uint64_t seed);

/**
* @brief Normalize the values in an array
*
* @tparam value_t type of the value to operate on
*
* @param[in] stream_view stream view
* @param[out] d_value device array to reduce
* @param[in] size number of elements in array
*
*/
template <typename value_t>
void normalize(rmm::cuda_stream_view const& stream_view, value_t* d_value, size_t size);

/**
* @brief Fill a buffer with a sequence of values
*
11 changes: 2 additions & 9 deletions cpp/src/c_api/eigenvector_centrality.cpp
@@ -81,19 +81,12 @@ struct eigenvector_centrality_functor : public cugraph::c_api::abstract_functor
rmm::device_uvector<weight_t> centralities(graph_view.local_vertex_partition_range_size(),
handle_.get_stream());

// FIXME: For now we'll call pagerank which returns a similarly formatted thing
cugraph::pagerank<vertex_t, edge_t, weight_t, weight_t, multi_gpu>(
cugraph::eigenvector_centrality<vertex_t, edge_t, weight_t, multi_gpu>(
handle_,
graph_view,
std::nullopt,
std::nullopt,
std::nullopt,
std::nullopt,
centralities.data(),
weight_t{0.95},
raft::device_span<weight_t>{centralities.data(), centralities.size()},
static_cast<weight_t>(epsilon_),
max_iterations_,
false,
do_expensive_check_);

rmm::device_uvector<vertex_t> vertex_ids(graph_view.local_vertex_partition_range_size(),
155 changes: 155 additions & 0 deletions cpp/src/centrality/eigenvector_centrality_impl.cuh
@@ -0,0 +1,155 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <cugraph/algorithms.hpp>
#include <cugraph/detail/utility_wrappers.hpp>
#include <cugraph/graph_view.hpp>
#include <cugraph/prims/count_if_e.cuh>
#include <cugraph/prims/count_if_v.cuh>
#include <cugraph/prims/edge_partition_src_dst_property.cuh>
#include <cugraph/prims/per_v_transform_reduce_incoming_outgoing_e.cuh>
#include <cugraph/prims/reduce_v.cuh>
#include <cugraph/prims/transform_reduce_v.cuh>
#include <cugraph/prims/update_edge_partition_src_dst_property.cuh>
#include <cugraph/utilities/error.hpp>

#include <raft/handle.hpp>
#include <rmm/exec_policy.hpp>

#include <thrust/fill.h>
#include <thrust/for_each.h>
Contributor

Is this necessary?

Collaborator Author

Probably not, copy/paste. I'll check all the headers.

Contributor

Don't forget to delete this.

Collaborator Author

Done

#include <thrust/iterator/constant_iterator.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/transform.h>
#include <thrust/tuple.h>

namespace cugraph {
namespace detail {

template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, weight_t, true, multi_gpu> const& pull_graph_view,
raft::device_span<weight_t> centralities,
weight_t epsilon,
size_t max_iterations,
bool do_expensive_check)
{
using GraphViewType = graph_view_t<vertex_t, edge_t, weight_t, true, multi_gpu>;
auto const num_vertices = pull_graph_view.number_of_vertices();
if (num_vertices == 0) { return; }

if (do_expensive_check) {
if (pull_graph_view.is_weighted()) {
auto num_nonpositive_edge_weights =
count_if_e(handle,
pull_graph_view,
dummy_property_t<vertex_t>{}.device_view(),
dummy_property_t<vertex_t>{}.device_view(),
[] __device__(vertex_t, vertex_t, weight_t w, auto, auto) { return w <= 0.0; });
CUGRAPH_EXPECTS(num_nonpositive_edge_weights == 0,
"Invalid input argument: input graph should have positive edge weights.");
}
}

thrust::fill(handle.get_thrust_policy(),
centralities.begin(),
centralities.end(),
weight_t{1.0} / static_cast<weight_t>(num_vertices));
Contributor

NetworkX supports passing initial values (https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.centrality.eigenvector_centrality.html). Shouldn't we support the same? We already support initial values for PageRank.

Collaborator Author

Will add, missed that.


// Power iteration
rmm::device_uvector<weight_t> old_centralities(centralities.size(), handle.get_stream());

edge_partition_src_property_t<GraphViewType, weight_t> edge_partition_src_centralities(
handle, pull_graph_view);

size_t iter{0};
while (true) {
thrust::copy(handle.get_thrust_policy(),
centralities.begin(),
centralities.end(),
old_centralities.data());

update_edge_partition_src_property(
handle, pull_graph_view, centralities.begin(), edge_partition_src_centralities);

per_v_transform_reduce_incoming_e(
handle,
pull_graph_view,
edge_partition_src_centralities.device_view(),
dummy_property_t<vertex_t>{}.device_view(),
[] __device__(vertex_t, vertex_t, weight_t w, auto src_val, auto) { return src_val * w; },
weight_t{0},
centralities.begin());

// Normalize the centralities
auto hypotenuse = sqrt(transform_reduce_v(
handle,
pull_graph_view,
centralities.begin(),
[] __device__(auto, auto val) { return val * val; },
weight_t{0.0}));

thrust::transform(handle.get_thrust_policy(),
centralities.begin(),
centralities.end(),
centralities.begin(),
[hypotenuse] __device__(auto val) { return val / hypotenuse; });

auto diff_sum = transform_reduce_v(
handle,
pull_graph_view,
thrust::make_zip_iterator(thrust::make_tuple(centralities.begin(), old_centralities.data())),
[] __device__(auto, auto val) { return std::abs(thrust::get<0>(val) - thrust::get<1>(val)); },
weight_t{0.0});

iter++;

if (diff_sum < (pull_graph_view.number_of_vertices() * epsilon)) {
break;
} else if (iter >= max_iterations) {
CUGRAPH_FAIL("Eigenvector Centrality failed to converge.");
}
}
}

} // namespace detail

template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<vertex_t, edge_t, weight_t, true, multi_gpu> const& graph_view,
raft::device_span<weight_t> centralities,
weight_t epsilon,
size_t max_iterations,
bool do_expensive_check)
{
static_assert(std::is_integral<vertex_t>::value,
"vertex_t should be an integral type.");
static_assert(std::is_floating_point<weight_t>::value,
"weight_t should be a floating-point type.");

CUGRAPH_EXPECTS(epsilon >= 0.0, "Invalid input argument: epsilon should be non-negative.");
CUGRAPH_EXPECTS(
centralities.size() == static_cast<size_t>(graph_view.local_vertex_partition_range_size()),
"Centralities should be the same size as the vertex range.");

detail::eigenvector_centrality(
handle, graph_view, centralities, epsilon, max_iterations, do_expensive_check);
}

} // namespace cugraph
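
For readers who want to check the loop above without a GPU, the following is a sequential host-side sketch of the same steps: pull over incoming edges, normalize to unit L2 norm, then compare the summed absolute change against the number of vertices multiplied by epsilon. The `Edge` struct and the function name `eigenvector_centrality_reference` are hypothetical, std::runtime_error stands in for CUGRAPH_FAIL, and the zero-norm guard is an addition for this sketch that the diff does not contain.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-side edge type (not a cuGraph type): a weighted edge src -> dst.
struct Edge {
  std::size_t src;
  std::size_t dst;
  double weight;
};

// Sequential reference for the power iteration; names and error handling are illustrative.
inline std::vector<double> eigenvector_centrality_reference(std::size_t num_vertices,
                                                            std::vector<Edge> const& edges,
                                                            double epsilon,
                                                            std::size_t max_iterations)
{
  if (num_vertices == 0) { return {}; }

  // Uniform initial guess of 1 / V, matching the thrust::fill call in the diff.
  std::vector<double> centralities(num_vertices, 1.0 / static_cast<double>(num_vertices));
  std::vector<double> old_centralities(num_vertices);

  std::size_t iter{0};
  while (true) {
    old_centralities = centralities;

    // Pull step: every vertex accumulates weight * centrality over its incoming edges.
    std::fill(centralities.begin(), centralities.end(), 0.0);
    for (auto const& e : edges) {
      centralities[e.dst] += e.weight * old_centralities[e.src];
    }

    // Normalize to unit L2 norm.
    double sum_of_squares{0.0};
    for (double c : centralities) { sum_of_squares += c * c; }
    double const norm = std::sqrt(sum_of_squares);
    // Guard added for this sketch only; the diff divides without this check.
    if (norm == 0.0) { throw std::runtime_error("All centralities are zero."); }
    for (double& c : centralities) { c /= norm; }

    // Sum of absolute changes since the previous iteration.
    double diff_sum{0.0};
    for (std::size_t v = 0; v < num_vertices; ++v) {
      diff_sum += std::abs(centralities[v] - old_centralities[v]);
    }

    ++iter;
    if (diff_sum < static_cast<double>(num_vertices) * epsilon) {
      break;
    } else if (iter >= max_iterations) {
      throw std::runtime_error("Eigenvector centrality failed to converge.");
    }
  }
  return centralities;
}
```

On a 3-vertex complete graph with unit weights in both directions, every score settles at 1/sqrt(3) after two iterations, which makes a convenient sanity check for the GPU path.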
69 changes: 69 additions & 0 deletions cpp/src/centrality/eigenvector_centrality_mg.cu
@@ -0,0 +1,69 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <centrality/eigenvector_centrality_impl.cuh>

namespace cugraph {

// MG instantiation
template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int32_t, int32_t, float, true, true> const& graph_view,
raft::device_span<float> centralities,
float epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int32_t, int64_t, float, true, true> const& graph_view,
raft::device_span<float> centralities,
float epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int64_t, int64_t, float, true, true> const& graph_view,
raft::device_span<float> centralities,
float epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int32_t, int32_t, double, true, true> const& graph_view,
raft::device_span<double> centralities,
double epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int32_t, int64_t, double, true, true> const& graph_view,
raft::device_span<double> centralities,
double epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int64_t, int64_t, double, true, true> const& graph_view,
raft::device_span<double> centralities,
double epsilon,
size_t max_iterations,
bool do_expensive_check);

} // namespace cugraph
69 changes: 69 additions & 0 deletions cpp/src/centrality/eigenvector_centrality_sg.cu
@@ -0,0 +1,69 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <centrality/eigenvector_centrality_impl.cuh>

namespace cugraph {

// SG instantiation
template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int32_t, int32_t, float, true, false> const& graph_view,
raft::device_span<float> centralities,
float epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int32_t, int64_t, float, true, false> const& graph_view,
raft::device_span<float> centralities,
float epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int64_t, int64_t, float, true, false> const& graph_view,
raft::device_span<float> centralities,
float epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int32_t, int32_t, double, true, false> const& graph_view,
raft::device_span<double> centralities,
double epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int32_t, int64_t, double, true, false> const& graph_view,
raft::device_span<double> centralities,
double epsilon,
size_t max_iterations,
bool do_expensive_check);

template void eigenvector_centrality(
raft::handle_t const& handle,
graph_view_t<int64_t, int64_t, double, true, false> const& graph_view,
raft::device_span<double> centralities,
double epsilon,
size_t max_iterations,
bool do_expensive_check);

} // namespace cugraph
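
The _sg.cu and _mg.cu files above contain only explicit instantiations of the template defined in eigenvector_centrality_impl.cuh. As a small illustration of that pattern in plain C++, the sketch below uses a hypothetical helper `l2_norm` (not part of cuGraph): the template body lives in one place, and each supported type combination is instantiated explicitly so that callers can link against it without seeing the definition.

```cpp
#include <cmath>
#include <vector>

// Generic implementation; in the PR this role is played by eigenvector_centrality_impl.cuh.
template <typename value_t>
value_t l2_norm(std::vector<value_t> const& values)
{
  value_t sum_of_squares{0};
  for (auto v : values) { sum_of_squares += v * v; }
  return std::sqrt(sum_of_squares);
}

// Explicit instantiation definitions, one per supported type, with the template
// argument deduced from the parameter list as in the .cu files above.
template float l2_norm(std::vector<float> const& values);
template double l2_norm(std::vector<double> const& values);
```

Keeping the single-GPU and multi-GPU instantiations in separate translation units, as the PR does, lets the two sets compile independently.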