support heterogenous fanout type #4608

Merged
Changes from 16 commits
153 commits
0adb2fd
support heterogenous fanout type
jnke2016 Aug 13, 2024
bb5a3e2
remove unusued code
jnke2016 Aug 13, 2024
10fa86d
fix style
jnke2016 Aug 13, 2024
f904350
create one API for both uniform and biased neighborhood sampling
jnke2016 Aug 20, 2024
1fc32c3
use the same function for both uniform and biased nieghborhood sampling
jnke2016 Aug 20, 2024
8fc21f8
add support for heterogenous fanout support at the plc layer and cons…
jnke2016 Aug 20, 2024
01a57f3
remove outdated codes
jnke2016 Aug 20, 2024
3a6aeb2
add flag differentiating between biased and uniform sampling
jnke2016 Aug 21, 2024
d2f6467
update docstrings and rename variable
jnke2016 Aug 21, 2024
5d25155
rename variable
jnke2016 Aug 21, 2024
80f8b86
create new tuple type
jnke2016 Aug 21, 2024
50e0fc5
remove unnecessary check
jnke2016 Aug 21, 2024
9f455bf
add constructor converting from array_view_t to array_t
jnke2016 Aug 21, 2024
d114534
leverage new constructor and remove unnecessary code
jnke2016 Aug 21, 2024
cf4a3ae
ensure edge types are ordered in increasing order
jnke2016 Aug 21, 2024
bc87b50
update docstrings
jnke2016 Aug 21, 2024
3013684
update docstrings
jnke2016 Aug 21, 2024
d6b6234
undo changes to uniform neighbor sample
jnke2016 Aug 22, 2024
068b0a3
undo changes to uniform neighbor sample
jnke2016 Aug 22, 2024
6920f65
update docstrings
jnke2016 Aug 22, 2024
760c5cd
re-order arguments
jnke2016 Aug 22, 2024
1e0ef27
remove outdated comments
jnke2016 Aug 22, 2024
de79620
add arguments and type check
jnke2016 Aug 23, 2024
8c17009
rename variable for consistency
jnke2016 Aug 23, 2024
7b95c5e
update neighbor sample API
jnke2016 Aug 30, 2024
19fc765
remove outdated code
jnke2016 Aug 30, 2024
e30766c
remove outdated comment
jnke2016 Aug 30, 2024
5dd66f2
first cut at new sampling function definition to clean up things befo…
ChuckHastings Sep 4, 2024
4b2764c
updates to remove builder pattern, also rename functions and mark old…
ChuckHastings Sep 5, 2024
4c1c610
add implementation of heterogeneous neighborhood sampling
jnke2016 Sep 9, 2024
fe35c80
add exit condition
jnke2016 Sep 9, 2024
a658b29
remove comments
jnke2016 Sep 10, 2024
e52a38a
Add Implementation
ChuckHastings Sep 11, 2024
c416439
call heterogeneous renumbering
jnke2016 Sep 13, 2024
98d6c57
update branch and call heterogneous renumbering
jnke2016 Sep 13, 2024
d7165af
update heterogeneous renumbering call
jnke2016 Sep 17, 2024
579fd0a
create a csr data structure to efficiently store vertex and label
jnke2016 Sep 17, 2024
5cdf40a
update API and docstring
jnke2016 Sep 17, 2024
a8fbd9d
remove unsued variable
jnke2016 Sep 17, 2024
9d5b3dd
update C++ API for neighbor sampling
jnke2016 Sep 20, 2024
0358c6e
add fixme for deprecated flags
jnke2016 Sep 20, 2024
799c35d
update CAPI
jnke2016 Sep 20, 2024
ab8aa72
undo changes to k-truss
jnke2016 Sep 21, 2024
7d8b5ad
undo changes to tests
jnke2016 Sep 21, 2024
f2190ba
clean up code
jnke2016 Sep 21, 2024
1e96dcf
update docs
jnke2016 Sep 23, 2024
36c25ad
fix typo
jnke2016 Sep 23, 2024
4857b36
call scatter instead of gather and fix type bug
jnke2016 Sep 23, 2024
263b6ac
fix typo
jnke2016 Sep 23, 2024
9dff3ab
update neighbor sample API
jnke2016 Sep 24, 2024
33c8b3d
update CAPI
jnke2016 Sep 25, 2024
e357f42
remove unsued code
jnke2016 Sep 25, 2024
6081978
remove outdated comment
jnke2016 Sep 25, 2024
73b3ffe
remove unnecessary copy
jnke2016 Sep 25, 2024
ea972f3
remove outdate arguments
jnke2016 Sep 26, 2024
8822192
fix typo
jnke2016 Sep 27, 2024
e02a513
update plc API of heterogeneous neighbor sample
jnke2016 Sep 27, 2024
d6cb1d5
fix typo
jnke2016 Sep 27, 2024
54fa155
change back the fanout type from a sparse to a dense structure
jnke2016 Sep 27, 2024
499e041
fix typo
jnke2016 Sep 27, 2024
b571deb
add implementation of heterogeneous/homogeneous biased/uniform neighb…
jnke2016 Sep 27, 2024
f6c4ce3
properly handle edge types
jnke2016 Sep 27, 2024
e71660d
add tests for 'homogeneous_uniform_neighbor_sampling'
jnke2016 Sep 27, 2024
4e2c8cf
add tests for homogeneous_biased_neighbor_sampling.cpp
jnke2016 Sep 27, 2024
2458149
update type combination
jnke2016 Sep 27, 2024
df3e4ff
add tests for heterogeneous uniform/biased neighborhood sampling
jnke2016 Sep 28, 2024
d4847e4
properly sample with edge types
jnke2016 Sep 28, 2024
dc2c9ba
remove outdated tests
jnke2016 Sep 28, 2024
c01f4e4
add SG python implementation of neighborhood sampling both homogeneou…
jnke2016 Sep 30, 2024
dabd0c8
remove unused argument
jnke2016 Sep 30, 2024
95ca286
add tests for homogeneous uniform neighborhood sampling
jnke2016 Oct 16, 2024
383bfc4
add method to fill a buffer array with a scalar
jnke2016 Oct 21, 2024
68fa2f1
add method to sort and count unique elements in buffer array
jnke2016 Oct 21, 2024
18899ed
add method to sort and count unique elements in buffer array
jnke2016 Oct 21, 2024
57e6f96
update computation of map from label to comm rank
jnke2016 Oct 21, 2024
d34d85c
perform allgatherv of the local mapping from label to comm rank
jnke2016 Oct 22, 2024
2aa0903
udpate tests for 'mg_homogeneous_uniform_neighbor_sampling'
jnke2016 Oct 22, 2024
fa0cb88
update neighbor sampling call for 'NO_CUGRAPH_OPS'
jnke2016 Oct 22, 2024
990d2ed
remove outdated code
jnke2016 Oct 22, 2024
d44a46f
udpate tests for 'mg_homogeneous_biased_neighbor_sampling'
jnke2016 Oct 22, 2024
a1a6180
add mg tests for heterogeneous uniform and biased neighborhood sampling
jnke2016 Oct 22, 2024
36ce4fc
add new tests to CMakeLists
jnke2016 Oct 22, 2024
c0a618f
remove unsued variable
jnke2016 Oct 22, 2024
965001b
fix illegal memory access
jnke2016 Oct 22, 2024
a005a19
update branch with the latest changes
jnke2016 Oct 22, 2024
11f9c40
fix style
jnke2016 Oct 22, 2024
6b547fb
fix style
jnke2016 Oct 22, 2024
14e9a99
update cmakelist
jnke2016 Oct 22, 2024
aebfd08
update type combination
jnke2016 Oct 22, 2024
b159e1b
fix symbol lookup error
jnke2016 Oct 30, 2024
3b0c016
leverage raft span instead of raw pointers
jnke2016 Oct 30, 2024
da82567
remove python implementation of heterogeneous neighborhood sampling a…
jnke2016 Oct 30, 2024
5b1bbb4
undo changes to uniform neighbor sampling
jnke2016 Oct 30, 2024
6d69f88
remove o utdated fixme
jnke2016 Oct 30, 2024
79d4527
remove unnecessary call
jnke2016 Oct 30, 2024
2a66928
remove unnecessary blank line
jnke2016 Oct 30, 2024
8c3d871
add comments for deprecated functions
jnke2016 Oct 30, 2024
ae92c9f
remove obsolete instantiation
jnke2016 Oct 30, 2024
d7d6109
remove unnecessary parenthesis
jnke2016 Oct 30, 2024
6b3ffbd
remove obsolete instantiation
jnke2016 Oct 30, 2024
67d7d0a
fix style
jnke2016 Oct 30, 2024
3386230
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Oct 30, 2024
413a577
fix type error
jnke2016 Oct 30, 2024
22db98d
fix import error
jnke2016 Oct 30, 2024
6a95852
remove redundant tests
jnke2016 Oct 31, 2024
4f5dc3e
remove hardcoded path
jnke2016 Oct 31, 2024
334ce6d
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Oct 31, 2024
9b4683a
fix style
jnke2016 Oct 31, 2024
8cb0c94
rename 'd_value' to 'd_span'
jnke2016 Nov 1, 2024
77303d5
use 'label_list' as a map form 'comm_rank' to 'label_map'
jnke2016 Nov 5, 2024
117a4ec
use 'label_list' as a map form 'comm_rank' to 'label_map'
jnke2016 Nov 5, 2024
ac66f13
add module biased_neighbor_sample
jnke2016 Nov 6, 2024
114bf56
rename variable
jnke2016 Nov 7, 2024
63a59ca
avoid creating function that compile all types and be more explicit w…
jnke2016 Nov 7, 2024
84face3
remove unsued function
jnke2016 Nov 8, 2024
ab853e5
declare homogeneous functions first
jnke2016 Nov 8, 2024
802d9b0
rename variable
jnke2016 Nov 8, 2024
c5bec5f
remove duplicated functions
jnke2016 Nov 8, 2024
4bd09f6
reorder variable declaration
jnke2016 Nov 8, 2024
5d6cb34
add fixme for not testing edge masking
jnke2016 Nov 8, 2024
24e31cb
remove outdated fixme
jnke2016 Nov 8, 2024
7ca5d59
fix style
jnke2016 Nov 8, 2024
64b7edc
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Nov 8, 2024
4a1d3f9
list homogeneous sampling functions first
jnke2016 Nov 8, 2024
86c819c
update docstrings
jnke2016 Nov 8, 2024
c011a7a
update docstrings
jnke2016 Nov 8, 2024
013ccbd
fix typo
jnke2016 Nov 8, 2024
b5d0505
detach mask
jnke2016 Nov 8, 2024
a759293
fix style
jnke2016 Nov 8, 2024
4dc00d1
update docstrings
jnke2016 Nov 12, 2024
de0c66f
update docstrings
jnke2016 Nov 12, 2024
a2dcc6f
update docstrings
jnke2016 Nov 12, 2024
3e03324
update docstrings
jnke2016 Nov 12, 2024
6d50df8
fix typo and remove check
jnke2016 Nov 14, 2024
ed6d532
reorder instructions
jnke2016 Nov 14, 2024
8a3a774
add docstring examples
jnke2016 Nov 14, 2024
57ba1e8
fix style
jnke2016 Nov 14, 2024
1011095
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Nov 14, 2024
b827ac7
update docstrings
jnke2016 Nov 14, 2024
27eb500
add more docstring example
jnke2016 Nov 14, 2024
c6ea067
add type chec, remove outdated docstrings
jnke2016 Nov 14, 2024
db468b6
fix style
jnke2016 Nov 14, 2024
d866235
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Nov 14, 2024
978281e
add functions exposing the edge renumber map along with its offsets a…
jnke2016 Nov 15, 2024
cd68019
add FIXME
jnke2016 Nov 15, 2024
ccfadc9
expose edge renumber map along with its offsets to the PLC API
jnke2016 Nov 15, 2024
dc203dd
fix style
jnke2016 Nov 15, 2024
69afe17
update docstrings example
jnke2016 Nov 15, 2024
fa44832
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Nov 15, 2024
3d9b526
remove outdated arguments
jnke2016 Nov 15, 2024
9edd3ae
fix style
jnke2016 Nov 15, 2024
c8b5875
rename methods
jnke2016 Nov 15, 2024
65a0225
fix style
jnke2016 Nov 15, 2024
116 changes: 115 additions & 1 deletion cpp/include/cugraph/sampling_functions.hpp
@@ -42,6 +42,8 @@ enum class prior_sources_behavior_t { DEFAULT = 0, CARRY_OVER, EXCLUDE };

/**
* @brief Uniform Neighborhood Sampling.
*
* @deprecated This API will be deleted, use neighbor_sample instead
*
* This function traverses from a set of starting vertices, traversing outgoing edges and
* randomly selects from these outgoing neighbors to extract a subgraph.
@@ -129,7 +131,7 @@ uniform_neighbor_sample(
std::optional<raft::device_span<label_t const>> starting_vertex_labels,
std::optional<std::tuple<raft::device_span<label_t const>, raft::device_span<int32_t const>>>
label_to_output_comm_rank,
raft::host_span<int32_t const> fan_out,
std::optional<raft::host_span<int32_t const>> fan_out,
raft::random::RngState& rng_state,
bool return_hops,
bool with_replacement = true,
@@ -139,6 +141,8 @@ uniform_neighbor_sample(

/**
* @brief Biased Neighborhood Sampling.
*
* @deprecated This API will be deleted, use neighbor_sample instead
*
* This function traverses from a set of starting vertices, traversing outgoing edges and
* randomly selects (with edge biases) from these outgoing neighbors to extract a subgraph.
@@ -240,6 +244,116 @@ biased_neighbor_sample(
bool dedupe_sources = false,
bool do_expensive_check = false);
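
Note: the bias rules stated for `edge_bias_view` in this header (non-negative values; a 0 bias means the edge can never be selected) can be illustrated with a small host-side sketch. This is not the cuGraph implementation; `pick_biased` is a hypothetical helper that maps a uniform draw `u` in [0, 1) to a neighbor index with probability proportional to its bias.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of cuGraph): select an index with probability
// proportional to its bias, given a uniform draw u in [0, 1).
// Assumes every bias is >= 0 and at least one bias is > 0.
std::size_t pick_biased(std::vector<double> const& biases, double u)
{
  double total = 0.0;
  for (double b : biases) { total += b; }

  double const target       = u * total;
  double running            = 0.0;
  std::size_t last_positive = 0;
  for (std::size_t i = 0; i < biases.size(); ++i) {
    if (biases[i] > 0.0) { last_positive = i; }
    running += biases[i];
    // A zero bias leaves `running` unchanged, so that index is never returned here.
    if (target < running) { return i; }
  }
  // Only reached through floating-point round-off; fall back to a valid index.
  return last_positive;
}
```

With biases `{0.0, 1.0, 3.0}`, index 0 is never returned, index 1 covers draws below 0.25, and index 2 covers the rest.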


/**
* @brief Neighborhood Sampling.
*
* This function traverses from a set of starting vertices, traversing outgoing edges and
* randomly selects (with edge biases or not) from these outgoing neighbors to extract a subgraph.
*
* Output from this function is a tuple of vectors (src, dst, weight, edge_id, edge_type, hop,
* label, offsets), identifying the randomly selected edges. src is the source vertex, dst is the
* destination vertex, weight (optional) is the edge weight, edge_id (optional) identifies the edge
* id, edge_type (optional) identifies the edge type, hop identifies which hop the edge was
* encountered in. The label output (optional) identifies the vertex label. The offsets array
* (optional) will be described below and is dependent upon the input parameters.
*
* If @p starting_vertex_labels is not specified then no organization is applied to the output, and
* the label and offsets values in the return set will be std::nullopt.
*
* If @p starting_vertex_labels is specified and @p label_to_output_comm_rank is not specified then
* the label output has values. This will also result in the output being sorted by vertex label.
* The offsets array in the return will be a CSR-style offsets array to identify the beginning of
* each label range in the data. `labels.size() == (offsets.size() - 1)`.
*
* If @p starting_vertex_labels is specified and @p label_to_output_comm_rank is specified then the
* label output has values. This will also result in the output being sorted by vertex label. The
* offsets array in the return will be a CSR-style offsets array to identify the beginning of each
* label range in the data. `labels.size() == (offsets.size() - 1)`. Additionally, the data will
* be shuffled so that all data with a particular label will be on the specified rank.
*
* @tparam vertex_t Type of vertex identifiers. Needs to be an integral type.
* @tparam edge_t Type of edge identifiers. Needs to be an integral type.
* @tparam weight_t Type of edge weights. Needs to be a floating point type.
* @tparam edge_type_t Type of edge type. Needs to be an integral type.
* @tparam label_t Type of label. Needs to be an integral type.
* @tparam store_transposed Flag indicating whether sources (if false) or destinations (if
* true) are major indices
* @tparam multi_gpu Flag indicating whether template instantiation should target single-GPU (false)
* or multi-GPU (true).
* @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
* handles to various CUDA libraries) to run graph algorithms.
* @param rng_state A pre-initialized raft::RngState object for generating random numbers
* @param graph_view Graph View object to generate NBR Sampling on.
* @param edge_weight_view Optional view object holding edge weights for @p graph_view.
* @param edge_id_view Optional view object holding edge ids for @p graph_view.
* @param edge_type_view Optional view object holding edge types for @p graph_view.
* @param edge_bias_view Optional view object holding edge biases (to be used in biased sampling) for @p
* graph_view. Bias values should be non-negative and the sum of edge bias values from any vertex
* should not exceed std::numeric_limits<bias_t>::max(). 0 bias value indicates that the
* corresponding edge can never be selected.
* @param starting_vertices Device span of starting vertex IDs for the sampling.
* In a multi-gpu context the starting vertices should be local to this GPU.
* @param starting_vertex_labels Optional device span of labels associated with each starting vertex
* for the sampling.
* @param label_to_output_comm_rank Optional tuple of device spans mapping label to a particular
Review comment (Collaborator): I have some further API changes I want to propose here. I'll point you to a PR when they're ready.

* output rank. Element 0 of the tuple identifies the label, Element 1 of the tuple identifies the
* output rank. The label span must be sorted in ascending order.
* @param fan_out Host span defining branching out (fan-out) degree per source vertex for each
* level
* @param heterogeneous_fan_out Tuple of host spans defining branching out (fan-out) degree per
* source vertex for each level in CSR style format. The first element of the tuple is the offset
* array per edge type id and the second element corresponds to the fanout values.
* @param return_hops boolean flag specifying if the hop information should be returned
* @param prior_sources_behavior Enum type defining how to handle prior sources, (defaults to
* DEFAULT)
* @param dedupe_sources boolean flag, if true then if a vertex v appears as a destination in hop X
* multiple times with the same label, it will only be passed once (for each label) as a source
* for the next hop. Default is false.
* @param with_replacement boolean flag specifying if random sampling is done with replacement
* (true); or, without replacement (false); default = true;
* @param do_expensive_check A flag to run expensive checks for input arguments (if set to `true`).
* @return tuple of device vectors (vertex_t source_vertex, vertex_t destination_vertex,
* optional weight_t weight, optional edge_t edge id, optional edge_type_t edge type,
* optional int32_t hop, optional label_t label, optional size_t offsets)
*/
// FIXME: Add flag for bias=True/False
template <typename vertex_t,
typename edge_t,
typename weight_t,
typename edge_type_t,
typename bias_t,
typename label_t,
bool store_transposed,
bool multi_gpu>
std::tuple<rmm::device_uvector<vertex_t>,
rmm::device_uvector<vertex_t>,
std::optional<rmm::device_uvector<weight_t>>,
std::optional<rmm::device_uvector<edge_t>>,
std::optional<rmm::device_uvector<edge_type_t>>,
std::optional<rmm::device_uvector<int32_t>>,
std::optional<rmm::device_uvector<label_t>>,
std::optional<rmm::device_uvector<size_t>>>
neighbor_sample(
raft::handle_t const& handle,
raft::random::RngState& rng_state,
graph_view_t<vertex_t, edge_t, store_transposed, multi_gpu> const& graph_view,
std::optional<edge_property_view_t<edge_t, weight_t const*>> edge_weight_view,
std::optional<edge_property_view_t<edge_t, edge_t const*>> edge_id_view,
std::optional<edge_property_view_t<edge_t, edge_type_t const*>> edge_type_view,
std::optional<edge_property_view_t<edge_t, bias_t const*>> edge_bias_view,
raft::device_span<vertex_t const> starting_vertices,
std::optional<raft::device_span<label_t const>> starting_vertex_labels,
std::optional<std::tuple<raft::device_span<label_t const>, raft::device_span<int32_t const>>>
label_to_output_comm_rank,
std::optional<raft::host_span<int32_t const>> fan_out,
std::optional<std::tuple<raft::host_span<int32_t const>, raft::host_span<int32_t const>>>
heterogeneous_fan_out,
bool return_hops,
bool with_replacement = true,
prior_sources_behavior_t prior_sources_behavior = prior_sources_behavior_t::DEFAULT,
bool dedupe_sources = false,
bool do_expensive_check = false);
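
Note: the CSR-style offsets array described in the comment above (`labels.size() == (offsets.size() - 1)`) can be sketched on the host as follows. This is an illustration only, not the cuGraph implementation; `label_offsets` is a hypothetical helper, and it assumes labels are dense integers in `[0, num_labels)` and that the sampled edges are already sorted by label.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of cuGraph): build CSR-style offsets from the
// per-edge labels of a sampled edge list that is sorted by label.
// Assumes labels are dense integers in [0, num_labels).
std::vector<std::size_t> label_offsets(std::vector<int32_t> const& sorted_edge_labels,
                                       std::size_t num_labels)
{
  std::vector<std::size_t> offsets(num_labels + 1, 0);
  // Count how many sampled edges carry each label.
  for (auto label : sorted_edge_labels) {
    ++offsets[static_cast<std::size_t>(label) + 1];
  }
  // An inclusive prefix sum turns the counts into range boundaries.
  for (std::size_t i = 1; i < offsets.size(); ++i) {
    offsets[i] += offsets[i - 1];
  }
  return offsets;
}
```

The edges for label `k` then occupy positions `[offsets[k], offsets[k + 1])` of the src/dst output; a label with no sampled edges yields an empty range.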

/*
* @brief renumber sampled edge list and compress to the (D)CSR|(D)CSC format.
*
119 changes: 119 additions & 0 deletions cpp/include/cugraph_c/sampling_algorithms.h
@@ -319,8 +319,59 @@ void cugraph_sampling_set_dedupe_sources(cugraph_sampling_options_t* options, bo
*/
void cugraph_sampling_options_free(cugraph_sampling_options_t* options);

/**
* @brief Opaque neighborhood sampling heterogeneous fanout type
*/
// FIXME: internal representation should be tuple instead of pairs - Make it more generic (tuple)
// cugraph_device_tuple_t, host_device_tuple_t,
// dictionary, key and array
// translate dictionary to a tuple. Add to the draft PR the PLC layer.
// Concatenate to build the 3 arrays from the PLC layer
/// mimic
typedef struct {
int32_t align_;
} cugraph_sample_heterogeneous_fanout_t;

/**
* @brief Create heterogeneous fanout
*
* Input data will be stored in the heterogeneous_fanout.
*
* The fanout is going to be a CSR structure, the edge_type_offsets will define which range
* of the fanout array is associated with each edge type, the fanout will be the values of
* fanout for that hop/type. So for edge type k, fanout[edge_type_offsets[k]] will identify
* the fanout for hop 0 for edge type k. fanout[edge_type_offsets[k] +1] will identify the
* fanout for hop 1, etc. edge_type_offsets[k+1] will mark the beginning of the fanout
* array for type k+1 (and the end of the fanout array for type k).
*
* @param [in] handle Handle for accessing resources
* @param [in] graph Pointer to graph
* @param [in] edge_type_offsets Type erased array of edge type offsets
* @param [in] fanout Type erased array of fanout values
* @param [out] heterogeneous_fanout Opaque pointer to fanout_t
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_create_heterogeneous_fanout(
const cugraph_resource_handle_t* handle,
cugraph_graph_t* graph,
const cugraph_type_erased_host_array_view_t* edge_type_offsets,
const cugraph_type_erased_host_array_view_t* fanout,
cugraph_sample_heterogeneous_fanout_t** heterogeneous_fanout,
cugraph_error_t** error);
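
Note: the CSR layout described in the comment above can be sketched with plain host vectors. This is an illustration under stated assumptions, not part of the C API: `fanout_for` is a hypothetical helper, and returning `std::nullopt` for a hop beyond an edge type's range is a choice made for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not part of cuGraph): look up the fanout for a given
// edge type and hop, where fanout[edge_type_offsets[k] + h] holds the fanout
// for edge type k at hop h.
std::optional<int32_t> fanout_for(std::vector<int32_t> const& edge_type_offsets,
                                  std::vector<int32_t> const& fanout,
                                  std::size_t edge_type,
                                  std::size_t hop)
{
  // edge_type_offsets has one more entry than there are edge types.
  if (edge_type + 1 >= edge_type_offsets.size()) { return std::nullopt; }

  auto const begin = static_cast<std::size_t>(edge_type_offsets[edge_type]);
  auto const end   = static_cast<std::size_t>(edge_type_offsets[edge_type + 1]);

  // The hop must fall inside this edge type's slice of the fanout array.
  if (begin + hop >= end) { return std::nullopt; }
  return fanout[begin + hop];
}
```

For example, with `edge_type_offsets = {0, 2, 4}` and `fanout = {10, 5, 25, 15}`, edge type 0 uses fanouts 10 then 5, and edge type 1 uses 25 then 15.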

/**
* @brief Free edge type and fanout pairs
*
* @param [in] heterogeneous_fanout The edge type size and fanout values
*/
void cugraph_heterogeneous_fanout_free(cugraph_sample_heterogeneous_fanout_t* heterogeneous_fanout);

/**
* @brief Uniform Neighborhood Sampling
*
* @deprecated This API will be deleted, use cugraph_neighbor_sample instead
*
* Returns a sample of the neighborhood around specified start vertices. Optionally, each
* start vertex can be associated with a label, allowing the caller to specify multiple batches
@@ -376,6 +427,8 @@ cugraph_error_code_t cugraph_uniform_neighbor_sample(

/**
* @brief Biased Neighborhood Sampling
*
* @deprecated This API will be deleted, use cugraph_neighbor_sample instead
*
* Returns a sample of the neighborhood around specified start vertices. Optionally, each
* start vertex can be associated with a label, allowing the caller to specify multiple batches
@@ -433,6 +486,71 @@ cugraph_error_code_t cugraph_biased_neighbor_sample(
cugraph_sample_result_t** result,
cugraph_error_t** error);

/**
* @brief Neighborhood Sampling
*
* Returns a sample of the neighborhood around specified start vertices with edge biases or not.
* Optionally, each start vertex can be associated with a label, allowing the caller to specify
* multiple batches of sampling requests in the same function call - which should improve GPU
* utilization.
*
* If label is NULL then all start vertices will be considered part of the same batch and the
* return value will not have a label column.
*
* @param [in] handle Handle for accessing resources
* @param [in,out] rng_state State of the random number generator, updated with each call
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage
* needs to be transposed
* @param [in] edge_biases Device array of edge biases to use for sampling. If NULL
* use the edge weight as the bias. If set to NULL, edges will be sampled uniformly.
* @param [in] start_vertices Device array of start vertices for the sampling
* @param [in] start_vertex_labels Device array of start vertex labels for the sampling. The
* labels associated with each start vertex will be included in the output associated with results
* that were derived from that start vertex. We only support label of type INT32. If label is
* NULL, the return data will not be labeled.
* @param [in] label_list Device array of the labels included in @p start_vertex_labels. If
* @p label_to_comm_rank is not specified this parameter is ignored. If specified, label_list
* must be sorted in ascending order.
* @param [in] label_to_comm_rank Device array identifying which comm rank the output for a
* particular label should be shuffled in the output. If not specified the data is not organized in
* output. If specified then all data from @p label_list[i] will be shuffled to rank
* @p label_to_comm_rank[i]. This cannot be specified unless @p start_vertex_labels is also
* specified. If not specified then the output data will not be shuffled between ranks.
* @param [in] label_offsets Device array of the offsets for each label in the seed list. This
* parameter is only used with the retain_seeds option.
* @param [in] fanout Host array defining the fan out at each step in the sampling algorithm.
* We only support fanout values of type INT32
* @param [in] heterogeneous_fanout Tuple of host arrays defining the fan out at each step in the
* sampling algorithm, in CSR style format. The first element of the tuple is the offset array per
* edge type id and the second element corresponds to the fanout values.
* We only support type INT32 for both the offsets and the fanout values array.
* @param [in] sampling_options
* Opaque pointer defining the sampling options.
* @param [in] do_expensive_check
* A flag to run expensive checks for input arguments (if set to true)
* @param [out] result Output from the neighbor_sample call
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_neighbor_sample(
const cugraph_resource_handle_t* handle,
cugraph_rng_state_t* rng_state,
cugraph_graph_t* graph,
bool_t is_biased,
const cugraph_edge_property_view_t* edge_biases,
const cugraph_type_erased_device_array_view_t* start_vertices,
const cugraph_type_erased_device_array_view_t* start_vertex_labels,
const cugraph_type_erased_device_array_view_t* label_list,
const cugraph_type_erased_device_array_view_t* label_to_comm_rank,
Review comment (Collaborator): Drop this one also.

const cugraph_type_erased_device_array_view_t* label_offsets,
const cugraph_type_erased_host_array_view_t* fan_out,
const cugraph_sample_heterogeneous_fanout_t* heterogeneous_fanout,
const cugraph_sampling_options_t* options,
bool_t do_expensive_check,
cugraph_sample_result_t** result,
cugraph_error_t** error);
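
Note: the relationship between `label_list` and `label_to_comm_rank` documented above (parallel arrays, with `label_list` sorted in ascending order) can be sketched as a binary-search lookup. This is a host-side illustration only; `output_rank_for` is a hypothetical helper and not part of the C API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not part of cuGraph): find the output rank for a label,
// where results for label_list[i] are shuffled to rank label_to_comm_rank[i].
// Relies on label_list being sorted in ascending order.
std::optional<int32_t> output_rank_for(std::vector<int32_t> const& label_list,
                                       std::vector<int32_t> const& label_to_comm_rank,
                                       int32_t label)
{
  auto const it = std::lower_bound(label_list.begin(), label_list.end(), label);
  if (it == label_list.end() || *it != label) { return std::nullopt; }
  return label_to_comm_rank[static_cast<std::size_t>(it - label_list.begin())];
}
```

The ascending-order requirement is what makes the binary search valid; an unsorted `label_list` would need a linear scan or a sort first.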

/**
* @deprecated This call should be replaced with cugraph_sample_result_get_majors
* @brief Get the source vertices from the sampling algorithm result
@@ -667,6 +785,7 @@ cugraph_error_code_t cugraph_test_uniform_neighborhood_sample_result_create(
* not CUGRAPH_SUCCESS
* @return error code
*/

Review comment (Collaborator): Unnecessary blank line

cugraph_error_code_t cugraph_select_random_vertices(const cugraph_resource_handle_t* handle,
const cugraph_graph_t* graph,
cugraph_rng_state_t* rng_state,
21 changes: 21 additions & 0 deletions cpp/src/c_api/array.hpp
@@ -125,6 +125,27 @@ struct cugraph_type_erased_host_array_t {
std::copy(vec.begin(), vec.end(), reinterpret_cast<T*>(data_.get()));
}

cugraph_type_erased_host_array_t(cugraph_type_erased_host_array_view_t const* view_p)
: data_(std::make_unique<std::byte[]>(view_p->num_bytes_)),
size_(view_p->size_),
num_bytes_(view_p->num_bytes_),
type_(view_p->type_)
{
std::copy(view_p->data_, view_p->data_ + num_bytes_, data_.get());
}

template <typename T>
T* as_type()
{
return reinterpret_cast<T*>(data_.get());
}

template <typename T>
T const* as_type() const
{
return reinterpret_cast<T const*>(data_.get());
}

auto view()
{
return new cugraph_type_erased_host_array_view_t{data_.get(), size_, num_bytes_, type_};
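
Note: the view-to-owning-array constructor and the `as_type<T>()` accessors added in this hunk follow a common type-erasure pattern: deep-copy the raw bytes, then reinterpret them as the element type on access. A minimal standalone sketch, using hypothetical stand-in types (`byte_buffer_view`, `byte_buffer`) rather than the cuGraph structs:

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>

// Hypothetical stand-ins for illustration only; these are not the cuGraph types.
struct byte_buffer_view {
  std::byte const* data;  // non-owning pointer to the raw bytes
  std::size_t size;       // number of elements
  std::size_t num_bytes;  // total size in bytes
};

struct byte_buffer {
  std::unique_ptr<std::byte[]> data;
  std::size_t size;
  std::size_t num_bytes;

  // Deep-copy the bytes referenced by a non-owning view.
  explicit byte_buffer(byte_buffer_view const& view)
    : data(std::make_unique<std::byte[]>(view.num_bytes)),
      size(view.size),
      num_bytes(view.num_bytes)
  {
    std::copy(view.data, view.data + view.num_bytes, data.get());
  }

  // Typed read access; the caller must request the type the bytes were written as.
  template <typename T>
  T const* as_type() const
  {
    return reinterpret_cast<T const*>(data.get());
  }
};
```

Because the constructor copies, the owning buffer stays valid after the memory behind the view is modified or released, which is the property an opaque fanout object would need in order to outlive the caller's input arrays.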
2 changes: 1 addition & 1 deletion cpp/src/c_api/graph_functions.cpp
@@ -84,7 +84,7 @@ struct create_vertex_pairs_functor : public cugraph::c_api::abstract_functor {
std::nullopt,
std::nullopt);
}

// FIXME: use std::tuple (template) instead.
result_ = new cugraph::c_api::cugraph_vertex_pairs_t{
new cugraph::c_api::cugraph_type_erased_device_array_t(first_copy, graph_->vertex_type_),
new cugraph::c_api::cugraph_type_erased_device_array_t(second_copy, graph_->vertex_type_)};