ENH Refactored cython graph factory code to scale to additional data types (rapidsai#1178)

* Minor update to comment to describe array sizes.

* Changed graph container to use smart pointers, added arg for instantiating legacy types and switch statements for it to factory function.

* Added PR 1152 to CHANGELOG.md

* Removing unnecessary .get() call on unique_ptr instance

* Using make_unique() instead of new

* Updated to call drop() correctly after cudf API update.

* Added args to support calling get_vertex_identifiers().

* Style fixes, removed commented out code meant for a future change.

* Updated comment with description of new 'identifiers' arg.

* Safety commit, still WIP, does not compile - updates for 2D graph support and upcoming 2D shuffle support

* safety commit, does not pass tests: updated enough to be able to run the MG Louvain test.

* Updated call_louvain() to use the new graph_t types. Still WIP, needs louvain updates to compile.

* WIP: updates for incorporating new 2D shuffle data, still does not pass test.

* Adding updates from iroy30 for calling shuffle from louvain.py

* Updated to extract and pass the partition_t info and call the graph_t ctor. Now having a problem finding the right subcommunicator.

* Updates to set up subcomms - having a problem with something needed by subcomms not being initialized: "address not mapped to object at address (nil)"

* Added p2p flag to comms initialize() to enable initialization of UCX endpoints needed for MG test.

* some proposed cleanup

* safety commit: committing with debug prints to allow other team members to debug in parallel.

* new technique for factory

* safety commit: more updates to address problems instantiating graph_t (using num edges for partition instead of global for edgelist) and for debugging (print statements).

* Changing how row and col rank are obtained, added debug prints for edge lists info

* Fixes to partition_t get_matrix_partition_major/minor methods based on feedback.

* Update shuffle.py

* Integrating changes from iroy30 to produce "option 1" shuffle output by default, with an option to enable "option 2", temporarily enabled graph expensive checks for debugging.

* Addressed review feedback: made var names consistent, fixed weights=None bug in cython code, added copyright to shuffle.py, changed how ranks are retrieved from the raft handle.

* Removed debug prints.

* Added PR 1163 to CHANGELOG.md

* Removed extra newlines accidentally added to clean up diff in the PR, updated comment in cython code.

* Added specific newlines back so file does not differ unnecessarily.

* Disabled graph_t expensive check that was left enabled for debugging.

* Added code path in call_louvain to support legacy graph types, to be removed when migration to graph_t types is complete.

* Updates based on feedback from PR 1163: code cleanup/removed unused union members, consolidated legacy enum types, updated comments, initial support added for 64-bit vertex types (untested)

* plumbed bool set based on running renumbering to set sorted_by_degree flag in graph container.

* Added PR 1178 to CHANGELOG.md, C++ style fixes.

* Addressed PR review feedback: added support for proper edge_t in cython wrapper and removed unnecessary vertex_t/edge_t int64,int32 combinations.

Co-authored-by: Rick Ratzel <[email protected]>
Co-authored-by: Chuck Hastings <[email protected]>
Co-authored-by: Iroy30 <[email protected]>
4 people authored Oct 2, 2020
1 parent fdfa584 commit 60b9b85
Showing 12 changed files with 644 additions and 429 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@
- PR #1157 Louvain API update to use graph_container_t
- PR #1151 MNMG extension for pattern accelerator based PageRank, Katz Centrality, BFS, and SSSP implementations (C++ part)
- PR #1163 Integrated 2D shuffling and Louvain updates
- PR #1178 Refactored cython graph factory code to scale to additional data types

## Improvements
- PR 1081 MNMG Renumbering - sort partitions by degree
2 changes: 1 addition & 1 deletion cpp/CMakeLists.txt
@@ -277,7 +277,7 @@ add_library(cugraph SHARED
src/db/db_parser_integration_test.cu
src/db/db_operators.cu
src/utilities/spmv_1D.cu
src/utilities/cython.cpp
src/utilities/cython.cu
src/structure/graph.cu
src/link_analysis/pagerank.cu
src/link_analysis/pagerank_1D.cu
97 changes: 55 additions & 42 deletions cpp/include/utilities/cython.hpp
@@ -22,65 +22,55 @@
namespace cugraph {
namespace cython {

enum class numberTypeEnum : int { intType, floatType, doubleType };
enum class numberTypeEnum : int { int32Type, int64Type, floatType, doubleType };

// FIXME: The GraphC??View* types will not be used in the near future. Those are
// left in place as cython wrappers transition from the GraphC* classes to
// graph_* classes. Remove GraphC* classes once the transition is complete.
enum class graphTypeEnum : int {
// represents uninitialized or NULL ptr
null,
// represents some legacy Cxx type. This and other LegacyCxx values are not
// used for the unique_ptr in a graph_container_t, but instead for when this
// enum is used for determining high-level code paths to take to prevent
// needing to expose each legacy enum value to cython.
LegacyCSR,
LegacyCSC,
LegacyCOO,
// represents that a GraphCxxView* unique_ptr type is present in a
// graph_container_t.
GraphCSRViewFloat,
GraphCSRViewDouble,
GraphCSCViewFloat,
GraphCSCViewDouble,
GraphCOOViewFloat,
GraphCOOViewDouble,
graph_t_float,
graph_t_double,
graph_t_float_mg,
graph_t_double_mg,
graph_t_float_transposed,
graph_t_double_transposed,
graph_t_float_mg_transposed,
graph_t_double_mg_transposed
// represents values present in the graph_container_t to construct a graph_t,
// but unlike legacy classes does not mean a graph_t unique_ptr is present in
// the container.
graph_t,
};

// Enum for the high-level type of GraphC??View* class to instantiate.
enum class legacyGraphTypeEnum : int { CSR, CSC, COO };

// "container" for a graph type instance which insulates the owner from the
// specifics of the actual graph type. This is intended to be used in Cython
// code that only needs to pass a graph object to another wrapped C++ API. This
// greatly simplifies the Cython code since the Cython definition only needs to
// define the container and not the various individual graph types in Cython.
struct graph_container_t {
// FIXME: use std::variant (or a better alternative, ie. type erasure?) instead
// of a union if possible
// FIXME: This union is in place only to support legacy calls, remove when
// migration to graph_t types is complete, or when legacy graph objects are
// constructed in the call_<algo> wrappers instead of the
// populate_graph_container_legacy() function.
union graphPtrUnion {
~graphPtrUnion() {}

void* null;
std::unique_ptr<GraphCSRView<int, int, float>> GraphCSRViewFloatPtr;
std::unique_ptr<GraphCSRView<int, int, double>> GraphCSRViewDoublePtr;
std::unique_ptr<GraphCSCView<int, int, float>> GraphCSCViewFloatPtr;
std::unique_ptr<GraphCSCView<int, int, double>> GraphCSCViewDoublePtr;
std::unique_ptr<GraphCOOView<int, int, float>> GraphCOOViewFloatPtr;
std::unique_ptr<GraphCOOView<int, int, double>> GraphCOOViewDoublePtr;
std::unique_ptr<experimental::graph_t<int, int, float, false, false>> graph_t_float_ptr;
std::unique_ptr<experimental::graph_t<int, int, double, false, false>> graph_t_double_ptr;
std::unique_ptr<experimental::graph_t<int, int, float, false, true>> graph_t_float_mg_ptr;
std::unique_ptr<experimental::graph_t<int, int, double, false, true>> graph_t_double_mg_ptr;
std::unique_ptr<experimental::graph_t<int, int, float, true, false>>
graph_t_float_transposed_ptr;
std::unique_ptr<experimental::graph_t<int, int, double, true, false>>
graph_t_double_transposed_ptr;
std::unique_ptr<experimental::graph_t<int, int, float, true, true>>
graph_t_float_mg_transposed_ptr;
std::unique_ptr<experimental::graph_t<int, int, double, true, true>>
graph_t_double_mg_transposed_ptr;
std::unique_ptr<GraphCSRView<int32_t, int32_t, float>> GraphCSRViewFloatPtr;
std::unique_ptr<GraphCSRView<int32_t, int32_t, double>> GraphCSRViewDoublePtr;
std::unique_ptr<GraphCSCView<int32_t, int32_t, float>> GraphCSCViewFloatPtr;
std::unique_ptr<GraphCSCView<int32_t, int32_t, double>> GraphCSCViewDoublePtr;
std::unique_ptr<GraphCOOView<int32_t, int32_t, float>> GraphCOOViewFloatPtr;
std::unique_ptr<GraphCOOView<int32_t, int32_t, double>> GraphCOOViewDoublePtr;
};

graph_container_t() : graph_ptr_union{nullptr}, graph_ptr_type{graphTypeEnum::null} {}
graph_container_t() : graph_ptr_union{nullptr}, graph_type{graphTypeEnum::null} {}
~graph_container_t() {}

// The expected usage of a graph_container_t is for it to be created as part
@@ -93,7 +83,30 @@ struct graph_container_t {
graph_container_t& operator=(const graph_container_t&) = delete;

graphPtrUnion graph_ptr_union;
graphTypeEnum graph_ptr_type;
graphTypeEnum graph_type;

// primitive data used for constructing graph_t instances.
void* src_vertices;
void* dst_vertices;
void* weights;
void* vertex_partition_offsets;

size_t num_partition_edges;
size_t num_global_vertices;
size_t num_global_edges;
numberTypeEnum vertexType;
numberTypeEnum edgeType;
numberTypeEnum weightType;
bool transposed;
bool is_multi_gpu;
bool sorted_by_degree;
bool do_expensive_check;
bool hypergraph_partitioned;
int row_comm_size;
int col_comm_size;
int row_comm_rank;
int col_comm_rank;
experimental::graph_properties_t graph_props;
};

// FIXME: finish description for vertex_partition_offsets
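One of the FIXME comments in `graph_container_t` suggests replacing the union of `std::unique_ptr` members with `std::variant`. A minimal sketch of that idea follows, assuming C++17. The names `csr_view_sketch_t`, `graph_ptr_variant_t`, `variant_container_sketch_t`, and `number_of_edges_in` are hypothetical and exist only for this illustration; they are not part of cuGraph.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <variant>

// Hypothetical stand-in for a legacy view type such as
// GraphCSRView<int32_t, int32_t, weight_t>.
template <typename weight_t>
struct csr_view_sketch_t {
  std::size_t number_of_edges{0};
};

// std::monostate plays the role of graphTypeEnum::null: no graph is held.
using graph_ptr_variant_t = std::variant<std::monostate,
                                         std::unique_ptr<csr_view_sketch_t<float>>,
                                         std::unique_ptr<csr_view_sketch_t<double>>>;

// Hypothetical container that stores the variant instead of a union plus a tag.
struct variant_container_sketch_t {
  graph_ptr_variant_t graph_ptr{};
};

// Visits whichever alternative is active and reports its edge count.
std::size_t number_of_edges_in(variant_container_sketch_t const& container)
{
  return std::visit(
    [](auto const& ptr) -> std::size_t {
      using held_t = std::decay_t<decltype(ptr)>;
      if constexpr (std::is_same_v<held_t, std::monostate>) {
        return 0;  // empty container
      } else {
        return ptr ? ptr->number_of_edges : 0;  // guard against a null unique_ptr
      }
    },
    container.graph_ptr);
}
```

In this sketch the variant records which alternative is active and runs the matching `unique_ptr` destructor itself, so neither a separate type tag nor a hand-written empty destructor is needed.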
@@ -107,7 +120,7 @@ struct graph_container_t {
// container (ie. a container that has not been previously populated by
// populate_graph_container())
//
// legacyGraphTypeEnum legacyType
// graphTypeEnum legacyType
// Specifies the type of graph when instantiating a legacy graph type
// (GraphCSRViewFloat, etc.).
// NOTE: this parameter will be removed when the transition to exclusive use
@@ -144,8 +157,6 @@ struct graph_container_t {
// bool multi_gpu
// true if the resulting graph object is to be used for a multi-gpu
// application
//
// FIXME: Should local_* values be void* as well?
void populate_graph_container(graph_container_t& graph_container,
raft::handle_t& handle,
void* src_vertices,
@@ -155,17 +166,19 @@ void populate_graph_container(graph_container_t& graph_container,
numberTypeEnum vertexType,
numberTypeEnum edgeType,
numberTypeEnum weightType,
int num_partition_edges,
size_t num_partition_edges,
size_t num_global_vertices,
size_t num_global_edges,
size_t row_comm_size, // pcols
size_t col_comm_size, // prows
bool sorted_by_degree,
bool transposed,
bool multi_gpu);

// FIXME: comment this function
// FIXME: Should local_* values be void* as well?
void populate_graph_container_legacy(graph_container_t& graph_container,
legacyGraphTypeEnum legacyType,
graphTypeEnum legacyType,
raft::handle_t const& handle,
void* offsets,
void* indices,
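The header above stores type-erased `void*` arrays alongside `numberTypeEnum` tags (`vertexType`, `edgeType`, `weightType`), which implies that a wrapper somewhere has to map those runtime tags back to compile-time template arguments. The matching `cython.cu` changes are not shown on this page, so the following is only a sketch of one way such a dispatch could look. `container_sketch_t`, `out_weight_of`, `dispatch_weight_type`, and `call_out_weight_of` are hypothetical names, and the edge type is omitted for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Stand-in for cugraph::cython::numberTypeEnum, using the enumerators from the header.
enum class numberTypeEnum : int { int32Type, int64Type, floatType, doubleType };

// Hypothetical, trimmed-down container: only the fields this sketch reads.
struct container_sketch_t {
  void* src_vertices{nullptr};
  void* weights{nullptr};
  std::size_t num_partition_edges{0};
  numberTypeEnum vertexType{numberTypeEnum::int32Type};
  numberTypeEnum weightType{numberTypeEnum::floatType};
};

// Typed worker: sums the weights of all edges whose source is `vertex`.
template <typename vertex_t, typename weight_t>
double out_weight_of(container_sketch_t const& c, std::int64_t vertex)
{
  auto const* src = static_cast<vertex_t const*>(c.src_vertices);
  auto const* w   = static_cast<weight_t const*>(c.weights);
  double total    = 0.0;
  for (std::size_t i = 0; i < c.num_partition_edges; ++i) {
    if (static_cast<std::int64_t>(src[i]) == vertex) { total += static_cast<double>(w[i]); }
  }
  return total;
}

// Second dispatch level: the vertex type is fixed, choose the weight type.
template <typename vertex_t>
double dispatch_weight_type(container_sketch_t const& c, std::int64_t vertex)
{
  switch (c.weightType) {
    case numberTypeEnum::floatType: return out_weight_of<vertex_t, float>(c, vertex);
    case numberTypeEnum::doubleType: return out_weight_of<vertex_t, double>(c, vertex);
    default: throw std::invalid_argument("weightType must be floatType or doubleType");
  }
}

// First dispatch level: choose the vertex type from the runtime tag.
double call_out_weight_of(container_sketch_t const& c, std::int64_t vertex)
{
  switch (c.vertexType) {
    case numberTypeEnum::int32Type: return dispatch_weight_type<std::int32_t>(c, vertex);
    case numberTypeEnum::int64Type: return dispatch_weight_type<std::int64_t>(c, vertex);
    default: throw std::invalid_argument("vertexType must be int32Type or int64Type");
  }
}
```

Nesting one switch per type parameter keeps each level small, so supporting another vertex or weight type means adding one `case` rather than a new enum value for every combination.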
24 changes: 24 additions & 0 deletions cpp/src/community/louvain.cu
@@ -97,6 +97,18 @@ template std::pair<size_t, double> louvain(
int32_t *,
size_t,
double);
template std::pair<size_t, float> louvain(
raft::handle_t const &,
experimental::graph_view_t<int64_t, int32_t, float, false, false> const &,
int64_t *,
size_t,
float);
template std::pair<size_t, double> louvain(
raft::handle_t const &,
experimental::graph_view_t<int64_t, int32_t, double, false, false> const &,
int64_t *,
size_t,
double);
template std::pair<size_t, float> louvain(
raft::handle_t const &,
experimental::graph_view_t<int64_t, int64_t, float, false, false> const &,
@@ -135,6 +147,18 @@ template std::pair<size_t, double> louvain(
int32_t *,
size_t,
double);
template std::pair<size_t, float> louvain(
raft::handle_t const &,
experimental::graph_view_t<int64_t, int32_t, float, false, true> const &,
int64_t *,
size_t,
float);
template std::pair<size_t, double> louvain(
raft::handle_t const &,
experimental::graph_view_t<int64_t, int32_t, double, false, true> const &,
int64_t *,
size_t,
double);
template std::pair<size_t, float> louvain(
raft::handle_t const &,
experimental::graph_view_t<int64_t, int64_t, float, false, true> const &,
2 changes: 1 addition & 1 deletion cpp/src/experimental/graph.cu
@@ -522,7 +522,7 @@ template class graph_t<int64_t, int64_t, float, true, true>;
template class graph_t<int64_t, int64_t, float, false, true>;
template class graph_t<int64_t, int64_t, double, true, true>;
template class graph_t<int64_t, int64_t, double, false, true>;

//
template class graph_t<int32_t, int32_t, float, true, false>;
template class graph_t<int32_t, int32_t, float, false, false>;
template class graph_t<int32_t, int32_t, double, true, false>;
8 changes: 8 additions & 0 deletions cpp/src/experimental/graph_view.cu
@@ -295,6 +295,10 @@ template class graph_view_t<int64_t, int64_t, float, true, true>;
template class graph_view_t<int64_t, int64_t, float, false, true>;
template class graph_view_t<int64_t, int64_t, double, true, true>;
template class graph_view_t<int64_t, int64_t, double, false, true>;
template class graph_view_t<int64_t, int32_t, float, true, true>;
template class graph_view_t<int64_t, int32_t, float, false, true>;
template class graph_view_t<int64_t, int32_t, double, true, true>;
template class graph_view_t<int64_t, int32_t, double, false, true>;

template class graph_view_t<int32_t, int32_t, float, true, false>;
template class graph_view_t<int32_t, int32_t, float, false, false>;
@@ -308,6 +312,10 @@ template class graph_view_t<int64_t, int64_t, float, true, false>;
template class graph_view_t<int64_t, int64_t, float, false, false>;
template class graph_view_t<int64_t, int64_t, double, true, false>;
template class graph_view_t<int64_t, int64_t, double, false, false>;
template class graph_view_t<int64_t, int32_t, float, true, false>;
template class graph_view_t<int64_t, int32_t, float, false, false>;
template class graph_view_t<int64_t, int32_t, double, true, false>;
template class graph_view_t<int64_t, int32_t, double, false, false>;

} // namespace experimental
} // namespace cugraph
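The `louvain.cu` and `graph_view.cu` hunks add explicit instantiations for the `<int64_t, int32_t, ...>` combinations. When a template is defined in a source file instead of a header, each combination that other translation units use generally has to be instantiated explicitly there, or the link step fails with undefined symbols. A minimal single-file sketch of the syntax, with hypothetical names (`graph_view_sketch_t`, `edge_summary`) standing in for `graph_view_t` and `louvain`:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for experimental::graph_view_t (flags omitted).
template <typename vertex_t, typename edge_t, typename weight_t>
struct graph_view_sketch_t {
  std::vector<vertex_t> indices;
  std::vector<weight_t> weights;
};

// Hypothetical algorithm: returns {number of edges, sum of edge weights}.
template <typename vertex_t, typename edge_t, typename weight_t>
std::pair<std::size_t, weight_t> edge_summary(
  graph_view_sketch_t<vertex_t, edge_t, weight_t> const& view)
{
  weight_t total{0};
  for (auto const& w : view.weights) { total += w; }
  return {view.weights.size(), total};
}

// Explicit instantiation of the class template for one type combination.
template struct graph_view_sketch_t<std::int64_t, std::int32_t, float>;

// Explicit instantiations of the function template; the template arguments
// are deduced from the parameter type, as in the louvain.cu hunks.
template std::pair<std::size_t, float> edge_summary(
  graph_view_sketch_t<std::int64_t, std::int32_t, float> const&);
template std::pair<std::size_t, double> edge_summary(
  graph_view_sketch_t<std::int64_t, std::int32_t, double> const&);
```

In a real multi-file build the template definitions would sit in the `.cu` file and callers would see only the declarations, which is why each supported combination needs its own instantiation line.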