support heterogenous fanout type #4608

Merged
153 commits
0adb2fd
support heterogenous fanout type
jnke2016 Aug 13, 2024
bb5a3e2
remove unusued code
jnke2016 Aug 13, 2024
10fa86d
fix style
jnke2016 Aug 13, 2024
f904350
create one API for both uniform and biased neighborhood sampling
jnke2016 Aug 20, 2024
1fc32c3
use the same function for both uniform and biased nieghborhood sampling
jnke2016 Aug 20, 2024
8fc21f8
add support for heterogenous fanout support at the plc layer and cons…
jnke2016 Aug 20, 2024
01a57f3
remove outdated codes
jnke2016 Aug 20, 2024
3a6aeb2
add flag differentiating between biased and uniform sampling
jnke2016 Aug 21, 2024
d2f6467
update docstrings and rename variable
jnke2016 Aug 21, 2024
5d25155
rename variable
jnke2016 Aug 21, 2024
80f8b86
create new tuple type
jnke2016 Aug 21, 2024
50e0fc5
remove unnecessary check
jnke2016 Aug 21, 2024
9f455bf
add constructor converting from array_view_t to array_t
jnke2016 Aug 21, 2024
d114534
leverage new constructor and remove unnecessary code
jnke2016 Aug 21, 2024
cf4a3ae
ensure edge types are ordered in increasing order
jnke2016 Aug 21, 2024
bc87b50
update docstrings
jnke2016 Aug 21, 2024
3013684
update docstrings
jnke2016 Aug 21, 2024
d6b6234
undo changes to uniform neighbor sample
jnke2016 Aug 22, 2024
068b0a3
undo changes to uniform neighbor sample
jnke2016 Aug 22, 2024
6920f65
update docstrings
jnke2016 Aug 22, 2024
760c5cd
re-order arguments
jnke2016 Aug 22, 2024
1e0ef27
remove outdated comments
jnke2016 Aug 22, 2024
de79620
add arguments and type check
jnke2016 Aug 23, 2024
8c17009
rename variable for consistency
jnke2016 Aug 23, 2024
7b95c5e
update neighbor sample API
jnke2016 Aug 30, 2024
19fc765
remove outdated code
jnke2016 Aug 30, 2024
e30766c
remove outdated comment
jnke2016 Aug 30, 2024
5dd66f2
first cut at new sampling function definition to clean up things befo…
ChuckHastings Sep 4, 2024
4b2764c
updates to remove builder pattern, also rename functions and mark old…
ChuckHastings Sep 5, 2024
4c1c610
add implementation of heterogeneous neighborhood sampling
jnke2016 Sep 9, 2024
fe35c80
add exit condition
jnke2016 Sep 9, 2024
a658b29
remove comments
jnke2016 Sep 10, 2024
e52a38a
Add Implementation
ChuckHastings Sep 11, 2024
c416439
call heterogeneous renumbering
jnke2016 Sep 13, 2024
98d6c57
update branch and call heterogneous renumbering
jnke2016 Sep 13, 2024
d7165af
update heterogeneous renumbering call
jnke2016 Sep 17, 2024
579fd0a
create a csr data structure to efficiently store vertex and label
jnke2016 Sep 17, 2024
5cdf40a
update API and docstring
jnke2016 Sep 17, 2024
a8fbd9d
remove unsued variable
jnke2016 Sep 17, 2024
9d5b3dd
update C++ API for neighbor sampling
jnke2016 Sep 20, 2024
0358c6e
add fixme for deprecated flags
jnke2016 Sep 20, 2024
799c35d
update CAPI
jnke2016 Sep 20, 2024
ab8aa72
undo changes to k-truss
jnke2016 Sep 21, 2024
7d8b5ad
undo changes to tests
jnke2016 Sep 21, 2024
f2190ba
clean up code
jnke2016 Sep 21, 2024
1e96dcf
update docs
jnke2016 Sep 23, 2024
36c25ad
fix typo
jnke2016 Sep 23, 2024
4857b36
call scatter instead of gather and fix type bug
jnke2016 Sep 23, 2024
263b6ac
fix typo
jnke2016 Sep 23, 2024
9dff3ab
update neighbor sample API
jnke2016 Sep 24, 2024
33c8b3d
update CAPI
jnke2016 Sep 25, 2024
e357f42
remove unsued code
jnke2016 Sep 25, 2024
6081978
remove outdated comment
jnke2016 Sep 25, 2024
73b3ffe
remove unnecessary copy
jnke2016 Sep 25, 2024
ea972f3
remove outdate arguments
jnke2016 Sep 26, 2024
8822192
fix typo
jnke2016 Sep 27, 2024
e02a513
update plc API of heterogeneous neighbor sample
jnke2016 Sep 27, 2024
d6cb1d5
fix typo
jnke2016 Sep 27, 2024
54fa155
change back the fanout type from a sparse to a dense structure
jnke2016 Sep 27, 2024
499e041
fix typo
jnke2016 Sep 27, 2024
b571deb
add implementation of heterogeneous/homogeneous biased/uniform neighb…
jnke2016 Sep 27, 2024
f6c4ce3
properly handle edge types
jnke2016 Sep 27, 2024
e71660d
add tests for 'homogeneous_uniform_neighbor_sampling'
jnke2016 Sep 27, 2024
4e2c8cf
add tests for homogeneous_biased_neighbor_sampling.cpp
jnke2016 Sep 27, 2024
2458149
update type combination
jnke2016 Sep 27, 2024
df3e4ff
add tests for heterogeneous uniform/biased neighborhood sampling
jnke2016 Sep 28, 2024
d4847e4
properly sample with edge types
jnke2016 Sep 28, 2024
dc2c9ba
remove outdated tests
jnke2016 Sep 28, 2024
c01f4e4
add SG python implementation of neighborhood sampling both homogeneou…
jnke2016 Sep 30, 2024
dabd0c8
remove unused argument
jnke2016 Sep 30, 2024
95ca286
add tests for homogeneous uniform neighborhood sampling
jnke2016 Oct 16, 2024
383bfc4
add method to fill a buffer array with a scalar
jnke2016 Oct 21, 2024
68fa2f1
add method to sort and count unique elements in buffer array
jnke2016 Oct 21, 2024
18899ed
add method to sort and count unique elements in buffer array
jnke2016 Oct 21, 2024
57e6f96
update computation of map from label to comm rank
jnke2016 Oct 21, 2024
d34d85c
perform allgatherv of the local mapping from label to comm rank
jnke2016 Oct 22, 2024
2aa0903
udpate tests for 'mg_homogeneous_uniform_neighbor_sampling'
jnke2016 Oct 22, 2024
fa0cb88
update neighbor sampling call for 'NO_CUGRAPH_OPS'
jnke2016 Oct 22, 2024
990d2ed
remove outdated code
jnke2016 Oct 22, 2024
d44a46f
udpate tests for 'mg_homogeneous_biased_neighbor_sampling'
jnke2016 Oct 22, 2024
a1a6180
add mg tests for heterogeneous uniform and biased neighborhood sampling
jnke2016 Oct 22, 2024
36ce4fc
add new tests to CMakeLists
jnke2016 Oct 22, 2024
c0a618f
remove unsued variable
jnke2016 Oct 22, 2024
965001b
fix illegal memory access
jnke2016 Oct 22, 2024
a005a19
update branch with the latest changes
jnke2016 Oct 22, 2024
11f9c40
fix style
jnke2016 Oct 22, 2024
6b547fb
fix style
jnke2016 Oct 22, 2024
14e9a99
update cmakelist
jnke2016 Oct 22, 2024
aebfd08
update type combination
jnke2016 Oct 22, 2024
b159e1b
fix symbol lookup error
jnke2016 Oct 30, 2024
3b0c016
leverage raft span instead of raw pointers
jnke2016 Oct 30, 2024
da82567
remove python implementation of heterogeneous neighborhood sampling a…
jnke2016 Oct 30, 2024
5b1bbb4
undo changes to uniform neighbor sampling
jnke2016 Oct 30, 2024
6d69f88
remove o utdated fixme
jnke2016 Oct 30, 2024
79d4527
remove unnecessary call
jnke2016 Oct 30, 2024
2a66928
remove unnecessary blank line
jnke2016 Oct 30, 2024
8c3d871
add comments for deprecated functions
jnke2016 Oct 30, 2024
ae92c9f
remove obsolete instantiation
jnke2016 Oct 30, 2024
d7d6109
remove unnecessary parenthesis
jnke2016 Oct 30, 2024
6b3ffbd
remove obsolete instantiation
jnke2016 Oct 30, 2024
67d7d0a
fix style
jnke2016 Oct 30, 2024
3386230
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Oct 30, 2024
413a577
fix type error
jnke2016 Oct 30, 2024
22db98d
fix import error
jnke2016 Oct 30, 2024
6a95852
remove redundant tests
jnke2016 Oct 31, 2024
4f5dc3e
remove hardcoded path
jnke2016 Oct 31, 2024
334ce6d
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Oct 31, 2024
9b4683a
fix style
jnke2016 Oct 31, 2024
8cb0c94
rename 'd_value' to 'd_span'
jnke2016 Nov 1, 2024
77303d5
use 'label_list' as a map form 'comm_rank' to 'label_map'
jnke2016 Nov 5, 2024
117a4ec
use 'label_list' as a map form 'comm_rank' to 'label_map'
jnke2016 Nov 5, 2024
ac66f13
add module biased_neighbor_sample
jnke2016 Nov 6, 2024
114bf56
rename variable
jnke2016 Nov 7, 2024
63a59ca
avoid creating function that compile all types and be more explicit w…
jnke2016 Nov 7, 2024
84face3
remove unsued function
jnke2016 Nov 8, 2024
ab853e5
declare homogeneous functions first
jnke2016 Nov 8, 2024
802d9b0
rename variable
jnke2016 Nov 8, 2024
c5bec5f
remove duplicated functions
jnke2016 Nov 8, 2024
4bd09f6
reorder variable declaration
jnke2016 Nov 8, 2024
5d6cb34
add fixme for not testing edge masking
jnke2016 Nov 8, 2024
24e31cb
remove outdated fixme
jnke2016 Nov 8, 2024
7ca5d59
fix style
jnke2016 Nov 8, 2024
64b7edc
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Nov 8, 2024
4a1d3f9
list homogeneous sampling functions first
jnke2016 Nov 8, 2024
86c819c
update docstrings
jnke2016 Nov 8, 2024
c011a7a
update docstrings
jnke2016 Nov 8, 2024
013ccbd
fix typo
jnke2016 Nov 8, 2024
b5d0505
detach mask
jnke2016 Nov 8, 2024
a759293
fix style
jnke2016 Nov 8, 2024
4dc00d1
update docstrings
jnke2016 Nov 12, 2024
de0c66f
update docstrings
jnke2016 Nov 12, 2024
a2dcc6f
update docstrings
jnke2016 Nov 12, 2024
3e03324
update docstrings
jnke2016 Nov 12, 2024
6d50df8
fix typo and remove check
jnke2016 Nov 14, 2024
ed6d532
reorder instructions
jnke2016 Nov 14, 2024
8a3a774
add docstring examples
jnke2016 Nov 14, 2024
57ba1e8
fix style
jnke2016 Nov 14, 2024
1011095
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Nov 14, 2024
b827ac7
update docstrings
jnke2016 Nov 14, 2024
27eb500
add more docstring example
jnke2016 Nov 14, 2024
c6ea067
add type chec, remove outdated docstrings
jnke2016 Nov 14, 2024
db468b6
fix style
jnke2016 Nov 14, 2024
d866235
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Nov 14, 2024
978281e
add functions exposing the edge renumber map along with its offsets a…
jnke2016 Nov 15, 2024
cd68019
add FIXME
jnke2016 Nov 15, 2024
ccfadc9
expose edge renumber map along with its offsets to the PLC API
jnke2016 Nov 15, 2024
dc203dd
fix style
jnke2016 Nov 15, 2024
69afe17
update docstrings example
jnke2016 Nov 15, 2024
fa44832
Merge remote-tracking branch 'upstream/branch-24.12' into branch-24.1…
jnke2016 Nov 15, 2024
3d9b526
remove outdated arguments
jnke2016 Nov 15, 2024
9edd3ae
fix style
jnke2016 Nov 15, 2024
c8b5875
rename methods
jnke2016 Nov 15, 2024
65a0225
fix style
jnke2016 Nov 15, 2024
9 changes: 5 additions & 4 deletions cpp/CMakeLists.txt
@@ -167,6 +167,7 @@ set(CUGRAPH_SOURCES
 src/detail/groupby_and_count_mg_v64_e64.cu
 src/detail/collect_comm_wrapper_mg_v32_e32.cu
 src/detail/collect_comm_wrapper_mg_v64_e64.cu
+src/sampling/detail/conversion_utilities.cu
 src/sampling/random_walks_mg_v64_e64.cu
 src/sampling/random_walks_mg_v32_e32.cu
 src/community/detail/common_methods_mg_v64_e64.cu
@@ -264,10 +265,10 @@ set(CUGRAPH_SOURCES
 src/sampling/detail/sample_edges_mg_v32_e32.cu
 src/sampling/detail/shuffle_and_organize_output_mg_v64_e64.cu
 src/sampling/detail/shuffle_and_organize_output_mg_v32_e32.cu
-src/sampling/neighbor_sampling_mg_v32_e32.cpp
-src/sampling/neighbor_sampling_mg_v64_e64.cpp
-src/sampling/neighbor_sampling_sg_v32_e32.cpp
-src/sampling/neighbor_sampling_sg_v64_e64.cpp
+src/sampling/neighbor_sampling_mg_v32_e32.cu
+src/sampling/neighbor_sampling_mg_v64_e64.cu
+src/sampling/neighbor_sampling_sg_v32_e32.cu
+src/sampling/neighbor_sampling_sg_v64_e64.cu
 src/sampling/negative_sampling_sg_v32_e32.cu
 src/sampling/negative_sampling_sg_v64_e64.cu
 src/sampling/negative_sampling_mg_v32_e32.cu
44 changes: 43 additions & 1 deletion cpp/include/cugraph/detail/utility_wrappers.hpp
@@ -65,6 +65,48 @@ void uniform_random_fill(rmm::cuda_stream_view const& stream_view,
template <typename value_t>
void scalar_fill(raft::handle_t const& handle, value_t* d_value, size_t size, value_t value);

/**
* @brief Sort a device span
*
* @tparam value_t type of the value to operate on. Must be either int32_t or int64_t.
*
* @param [in] handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator,
* and handles to various CUDA libraries) to run graph algorithms.
* @param[in,out] values device span to sort in place
*
*/
template <typename value_t>
void sort_ints(raft::handle_t const& handle, raft::device_span<value_t> values);

/**
* @brief Keep the unique elements of a device span
*
* @tparam value_t type of the value to operate on. Must be either int32_t or int64_t.
*
* @param [in] handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator,
* and handles to various CUDA libraries) to run graph algorithms.
* @param[in,out] values device span to deduplicate.
* @return the number of unique elements.
*
*/
template <typename value_t>
size_t unique_ints(raft::handle_t const& handle, raft::device_span<value_t> values);

/**
* @brief Increment the values of a device span by a constant value
*
* @tparam value_t type of the value to operate on. Must be either int32_t or int64_t.
*
* @param[in,out] values device span to update
* @param[in] value value to be added to each element of the buffer
* @param[in] stream_view stream view
*
*/
template <typename value_t>
void transform_increment_ints(raft::device_span<value_t> values,
value_t value,
rmm::cuda_stream_view const& stream_view);

Contributor

Similar here. I am not sure whether creating a thrust wrapper for arbitrary types is a good idea.

For commonly used types, we can clearly cut compile time and binary size by doing this.

In that case, I am inclined to name the functions so that they indicate the supported types, or at least to document the supported types properly.

For example, for the sort function here,

  1. We may rename the function to sort_vertices or at least sort_ints to indicate that this works only for integers and document the supported integer types (e.g. int32_t, int64_t). If we explicitly instantiated this function for floating point numbers as well, then we may create sort_floats as well.
  2. Or at the very minimum, we need to document the supported types.
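
The naming suggestion above can be illustrated with a host-only sketch. It is not the PR's implementation: it assumes raw pointers with `std::sort` and `std::unique` in place of `raft::device_span` and thrust, and the helper `count_distinct_labels` is hypothetical. The point is only that explicit instantiation restricts the wrappers to `int32_t` and `int64_t`, which the `_ints` suffix then advertises.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host stand-in for the device wrapper (assumption: the real code runs on a device span).
template <typename value_t>
void sort_ints(value_t* values, std::size_t size)
{
  std::sort(values, values + size);
}

// Moves the unique elements of a sorted range to the front and returns their count.
template <typename value_t>
std::size_t unique_ints(value_t* values, std::size_t size)
{
  value_t* new_end = std::unique(values, values + size);
  return static_cast<std::size_t>(new_end - values);
}

// Explicit instantiations: in a split header/source layout, only these two
// types would be compiled into the library, which is what the name documents.
template void sort_ints<std::int32_t>(std::int32_t*, std::size_t);
template void sort_ints<std::int64_t>(std::int64_t*, std::size_t);
template std::size_t unique_ints<std::int32_t>(std::int32_t*, std::size_t);
template std::size_t unique_ints<std::int64_t>(std::int64_t*, std::size_t);

// Hypothetical helper: number of distinct labels in a batch (takes a copy on purpose).
inline std::size_t count_distinct_labels(std::vector<std::int32_t> labels)
{
  sort_ints(labels.data(), labels.size());
  return unique_ints(labels.data(), labels.size());
}
```

Calling `count_distinct_labels` on a small label vector sorts a private copy and counts what remains after deduplication, mirroring the sort-then-unique pairing of the two wrappers.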

Contributor

Also, our general convention is to pass the stream as the last parameter.

Here, we pass the handle to some functions and the stream to others, and when we do pass the stream, it is the last parameter.

We should be consistent. I think the functions defined in this header file should all take the stream as the last parameter, so that they can be called in multi-stream executions.

Contributor

@ChuckHastings Any thoughts on this?

Collaborator

Stream should be the last parameter, I think. We should do some review of the code and identify other places where we should be passing stream instead of handle. I think passing the handle into the algorithm is great, since it gives us access to everything. But I had to do some complex things in MTMG to get some of the lower level functions working in a multi-stream environment because we use the handle too much. I think we should look at many of the non-public functions and explore passing the comms object and stream instead of passing the handle.

Regarding these wrappers for thrust calls, I think we'll end up with higher quality code if we have function names that are more precise about what we're doing. I think sort_ints might be sufficiently precise... I imagine there are other integer data types that we would want to sort.
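
The stream-last convention discussed in this thread can be sketched on the host. This is only an illustration: `stream_view_t` is a hypothetical stand-in for `rmm::cuda_stream_view`, and the loop replaces the device-side transform. The function takes no handle, so a caller that manages several streams can choose one per call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for rmm::cuda_stream_view; a real stream orders GPU work.
struct stream_view_t {
  int id{0};
};

// The stream is the last parameter and no handle is needed, so the same
// function can be issued on different streams by the same caller.
template <typename value_t>
void transform_increment_ints(value_t* values,
                              std::size_t size,
                              value_t value,
                              stream_view_t const& /* stream_view */)
{
  // Host loop in place of the device transform (assumption for this sketch).
  for (std::size_t i = 0; i < size; ++i) {
    values[i] += value;
  }
}
```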

Contributor Author

I went for option 1

/**
* @brief Fill a buffer with a sequence of values
*
@@ -73,7 +115,7 @@ void scalar_fill(raft::handle_t const& handle, value_t* d_value, size_t size, va
*
* Similar to the function std::iota, wraps the function thrust::sequence
*
- * @tparam value_t type of the value to operate on
+ * @tparam value_t type of the value to operate on.
*
* @param[in] stream_view stream view
* @param[out] d_value device array to fill
406 changes: 378 additions & 28 deletions cpp/include/cugraph/sampling_functions.hpp

Large diffs are not rendered by default.

219 changes: 215 additions & 4 deletions cpp/include/cugraph_c/sampling_algorithms.h
@@ -199,6 +199,13 @@ typedef struct {
int32_t align_;
} cugraph_sampling_options_t;

/**
* @brief Opaque sampling flags type
*/
typedef struct {
int32_t align_;
} sampling_flags_t;

/**
* @brief Enumeration for prior sources behavior
*/
@@ -323,6 +330,8 @@ void cugraph_sampling_options_free(cugraph_sampling_options_t* options);
/**
* @brief Uniform Neighborhood Sampling
*
* @deprecated This API will be deleted, use cugraph_homogeneous_uniform_neighbor_sample
*
* Returns a sample of the neighborhood around specified start vertices. Optionally, each
* start vertex can be associated with a label, allowing the caller to specify multiple batches
* of sampling requests in the same function call - which should improve GPU utilization.
@@ -348,8 +357,8 @@ void cugraph_sampling_options_free(cugraph_sampling_options_t* options);
* label_to_comm_rank[i]. If not specified then the output data will not be shuffled between ranks.
* @param [in] label_offsets Device array of the offsets for each label in the seed list. This
* parameter is only used with the retain_seeds option.
- * @param [in] fanout Host array defining the fan out at each step in the sampling algorithm.
- * We only support fanout values of type INT32
+ * @param [in] fan_out Host array defining the fan out at each step in the sampling
+ * algorithm. We only support fan_out values of type INT32
* @param [in,out] rng_state State of the random number generator, updated with each call
* @param [in] sampling_options
* Opaque pointer defining the sampling options.
@@ -378,6 +387,8 @@ cugraph_error_code_t cugraph_uniform_neighbor_sample(
/**
* @brief Biased Neighborhood Sampling
*
* @deprecated This API will be deleted, use cugraph_homogeneous_biased_neighbor_sample.
*
* Returns a sample of the neighborhood around specified start vertices. Optionally, each
* start vertex can be associated with a label, allowing the caller to specify multiple batches
* of sampling requests in the same function call - which should improve GPU utilization.
@@ -406,8 +417,8 @@ cugraph_error_code_t cugraph_uniform_neighbor_sample(
* label_to_comm_rank[i]. If not specified then the output data will not be shuffled between ranks.
* @param [in] label_offsets Device array of the offsets for each label in the seed list. This
* parameter is only used with the retain_seeds option.
- * @param [in] fanout Host array defining the fan out at each step in the sampling algorithm.
- * We only support fanout values of type INT32
+ * @param [in] fan_out Host array defining the fan out at each step in the sampling
+ * algorithm. We only support fan_out values of type INT32
* @param [in,out] rng_state State of the random number generator, updated with each call
* @param [in] sampling_options
* Opaque pointer defining the sampling options.
@@ -434,6 +445,186 @@ cugraph_error_code_t cugraph_biased_neighbor_sample(
cugraph_sample_result_t** result,
cugraph_error_t** error);

/**
* @brief Homogeneous Uniform Neighborhood Sampling
*
* Returns a sample of the neighborhood around specified start vertices and fan_out.
* The neighborhood is sampled uniformly.
* Optionally, each start vertex can be associated with a label, allowing the caller to specify
* multiple batches of sampling requests in the same function call - which should improve GPU
* utilization.
*
* If starting_vertex_label_offsets is NULL then all start vertices will be considered part of the
* same batch and the return value will not have a label column.
*
* @param [in] handle Handle for accessing resources
* @param [in,out] rng_state State of the random number generator, updated with each call
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage
* needs to be transposed
* @param [in] start_vertices Device array of start vertices for the sampling
* @param [in] starting_vertex_label_offsets Device array of the offsets for each label in
* the seed list. This parameter is only used with the retain_seeds option.
* @param [in] fan_out Host array defining the fan out at each step in the sampling
* algorithm. We only support fan_out values of type INT32
* @param [in] options
* Opaque pointer defining the sampling options.
* @param [in] do_expensive_check
* A flag to run expensive checks for input arguments (if set to true)
* @param [out] result Output from the cugraph_homogeneous_uniform_neighbor_sample call
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_homogeneous_uniform_neighbor_sample(
const cugraph_resource_handle_t* handle,
cugraph_rng_state_t* rng_state,
cugraph_graph_t* graph,
const cugraph_type_erased_device_array_view_t* start_vertices,
const cugraph_type_erased_device_array_view_t* starting_vertex_label_offsets,
const cugraph_type_erased_host_array_view_t* fan_out,
const cugraph_sampling_options_t* options,
bool_t do_expensive_check,
cugraph_sample_result_t** result,
cugraph_error_t** error);

/**
* @brief Homogeneous Biased Neighborhood Sampling
*
* Returns a sample of the neighborhood around specified start vertices and fan_out.
* The neighborhood is sampled according to the edge biases.
* Optionally, each start vertex can be associated with a label, allowing the caller to specify
* multiple batches of sampling requests in the same function call - which should improve GPU
* utilization.
*
* If starting_vertex_label_offsets is NULL then all start vertices will be considered part of the
* same batch and the return value will not have a label column.
*
* @param [in] handle Handle for accessing resources
* @param [in,out] rng_state State of the random number generator, updated with each call
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage
* needs to be transposed
* @param [in] edge_biases Edge property view of the edge biases to use for sampling. If NULL,
* the edge weights are used as the biases.
* @param [in] start_vertices Device array of start vertices for the sampling
* @param [in] starting_vertex_label_offsets Device array of the offsets for each label in
* the seed list. This parameter is only used with the retain_seeds option.
* @param [in] fan_out Host array defining the fan out at each step in the sampling
* algorithm. We only support fan_out values of type INT32
* @param [in] options
* Opaque pointer defining the sampling options.
* @param [in] do_expensive_check
* A flag to run expensive checks for input arguments (if set to true)
* @param [out] result Output from the cugraph_homogeneous_biased_neighbor_sample call
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_homogeneous_biased_neighbor_sample(
const cugraph_resource_handle_t* handle,
cugraph_rng_state_t* rng_state,
cugraph_graph_t* graph,
const cugraph_edge_property_view_t* edge_biases,
const cugraph_type_erased_device_array_view_t* start_vertices,
const cugraph_type_erased_device_array_view_t* starting_vertex_label_offsets,
const cugraph_type_erased_host_array_view_t* fan_out,
const cugraph_sampling_options_t* options,
bool_t do_expensive_check,
cugraph_sample_result_t** result,
cugraph_error_t** error);

/**
* @brief Heterogeneous Uniform Neighborhood Sampling
*
* Returns a sample of the neighborhood around specified start vertices and fan_out.
* The neighborhood is sampled uniformly.
* Optionally, each start vertex can be associated with a label, allowing the caller to specify
* multiple batches of sampling requests in the same function call - which should improve GPU
* utilization.
*
* If starting_vertex_label_offsets is NULL then all start vertices will be considered part of the
* same batch and the return value will not have a label column.
*
* @param [in] handle Handle for accessing resources
* @param [in,out] rng_state State of the random number generator, updated with each call
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage
* needs to be transposed
* @param [in] start_vertices Device array of start vertices for the sampling
* @param [in] starting_vertex_label_offsets Device array of the offsets for each label in
* the seed list. This parameter is only used with the retain_seeds option.
* @param [in] fan_out Host array defining the fan out at each step in the sampling
* algorithm. We only support fan_out values of type INT32
* @param [in] num_edge_types Number of edge types where a value of 1 translates to homogeneous
* neighbor sample whereas a value greater than 1 translates to heterogeneous neighbor sample.
* @param [in] options
* Opaque pointer defining the sampling options.
* @param [in] do_expensive_check
* A flag to run expensive checks for input arguments (if set to true)
* @param [out] result Output from the cugraph_heterogeneous_uniform_neighbor_sample call
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_heterogeneous_uniform_neighbor_sample(
const cugraph_resource_handle_t* handle,
cugraph_rng_state_t* rng_state,
cugraph_graph_t* graph,
const cugraph_type_erased_device_array_view_t* start_vertices,
const cugraph_type_erased_device_array_view_t* starting_vertex_label_offsets,
const cugraph_type_erased_host_array_view_t* fan_out,
int num_edge_types,
const cugraph_sampling_options_t* options,
bool_t do_expensive_check,
cugraph_sample_result_t** result,
cugraph_error_t** error);

/**
* @brief Heterogeneous Biased Neighborhood Sampling
*
* Returns a sample of the neighborhood around specified start vertices and fan_out.
* The neighborhood is sampled according to the edge biases.
* Optionally, each start vertex can be associated with a label, allowing the caller to specify
* multiple batches of sampling requests in the same function call - which should improve GPU
* utilization.
*
* If starting_vertex_label_offsets is NULL then all start vertices will be considered part of the
* same batch and the return value will not have a label column.
*
* @param [in] handle Handle for accessing resources
* @param [in,out] rng_state State of the random number generator, updated with each call
* @param [in] graph Pointer to graph. NOTE: Graph might be modified if the storage
* needs to be transposed
* @param [in] edge_biases Edge property view of the edge biases to use for sampling. If NULL,
* the edge weights are used as the biases.
* @param [in] start_vertices Device array of start vertices for the sampling
* @param [in] starting_vertex_label_offsets Device array of the offsets for each label in
* the seed list. This parameter is only used with the retain_seeds option.
* @param [in] fan_out Host array defining the fan out at each step in the sampling
* algorithm. We only support fan_out values of type INT32
* @param [in] num_edge_types Number of edge types where a value of 1 translates to homogeneous
* neighbor sample whereas a value greater than 1 translates to heterogeneous neighbor sample.
* @param [in] options
* Opaque pointer defining the sampling options.
* @param [in] do_expensive_check
* A flag to run expensive checks for input arguments (if set to true)
* @param [out] result Output from the cugraph_heterogeneous_biased_neighbor_sample call
* @param [out] error Pointer to an error object storing details of any error. Will
* be populated if error code is not CUGRAPH_SUCCESS
* @return error code
*/
cugraph_error_code_t cugraph_heterogeneous_biased_neighbor_sample(
const cugraph_resource_handle_t* handle,
cugraph_rng_state_t* rng_state,
cugraph_graph_t* graph,
const cugraph_edge_property_view_t* edge_biases,
const cugraph_type_erased_device_array_view_t* start_vertices,
const cugraph_type_erased_device_array_view_t* starting_vertex_label_offsets,
const cugraph_type_erased_host_array_view_t* fan_out,
int num_edge_types,
const cugraph_sampling_options_t* options,
bool_t do_expensive_check,
cugraph_sample_result_t** result,
cugraph_error_t** error);
jnke2016 marked this conversation as resolved.
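
The heterogeneous entry points take a single `fan_out` host array together with `num_edge_types`, and commit 54fa155 describes the fanout as a dense structure. A minimal sketch of how such a dense array could be indexed follows. The hop-major layout used here is an assumption (the `sampling_functions.hpp` diff is not rendered on this page), and the helper names `fan_out_at` and `num_hops` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout (not confirmed by the rendered diff): hop-major, i.e. the
// entries for every edge type of hop 0 come first, then those of hop 1, etc.
inline std::int32_t fan_out_at(std::vector<std::int32_t> const& fan_out,
                               std::size_t num_edge_types,
                               std::size_t hop,
                               std::size_t edge_type)
{
  return fan_out[hop * num_edge_types + edge_type];
}

// Number of hops implied by a dense fan_out array; num_edge_types must be >= 1.
inline std::size_t num_hops(std::vector<std::int32_t> const& fan_out,
                            std::size_t num_edge_types)
{
  return fan_out.size() / num_edge_types;
}
```

With `num_edge_types` equal to 1 the indexing reduces to `fan_out[hop]`, which matches the docstring's remark that a value of 1 corresponds to homogeneous sampling.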

/**
* @deprecated This call should be replaced with cugraph_sample_result_get_majors
* @brief Get the source vertices from the sampling algorithm result
@@ -584,6 +775,26 @@ cugraph_type_erased_device_array_view_t* cugraph_sample_result_get_renumber_map(
cugraph_type_erased_device_array_view_t* cugraph_sample_result_get_renumber_map_offsets(
const cugraph_sample_result_t* result);

/**
* @ingroup samplingC
* @brief Get the edge renumber map
*
* @param [in] result The result from a sampling algorithm
* @return type erased array pointing to the edge renumber map
*/
cugraph_type_erased_device_array_view_t* cugraph_sample_result_get_edge_renumber_map(
const cugraph_sample_result_t* result);

/**
* @ingroup samplingC
* @brief Get the edge renumber map offsets
*
* @param [in] result The result from a sampling algorithm
* @return type erased array pointing to the edge renumber map offsets
*/
cugraph_type_erased_device_array_view_t* cugraph_sample_result_get_edge_renumber_map_offsets(
const cugraph_sample_result_t* result);

/**
* @ingroup samplingC
* @brief Free a sampling result
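
The two getters added above return a flat edge renumber map and a separate offsets array. A minimal host-side sketch of how the pair could be consumed follows, assuming CSR-style offsets where segment `i` spans `offsets[i]` to `offsets[i + 1]`. That convention, what a segment corresponds to, and the helper name `segment` are all assumptions; the rendered diff does not specify them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy out segment i of a flat map, assuming CSR-style
// offsets with offsets.size() == number_of_segments + 1.
template <typename T>
std::vector<T> segment(std::vector<T> const& renumber_map,
                       std::vector<std::size_t> const& offsets,
                       std::size_t i)
{
  auto first = renumber_map.begin() + static_cast<std::ptrdiff_t>(offsets[i]);
  auto last  = renumber_map.begin() + static_cast<std::ptrdiff_t>(offsets[i + 1]);
  return std::vector<T>(first, last);
}
```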
23 changes: 22 additions & 1 deletion cpp/src/c_api/array.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2021-2023, NVIDIA CORPORATION.
+ * Copyright (c) 2021-2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -125,6 +125,27 @@ struct cugraph_type_erased_host_array_t {
std::copy(vec.begin(), vec.end(), reinterpret_cast<T*>(data_.get()));
}

cugraph_type_erased_host_array_t(cugraph_type_erased_host_array_view_t const* view_p)
: data_(std::make_unique<std::byte[]>(view_p->num_bytes_)),
size_(view_p->size_),
num_bytes_(view_p->num_bytes_),
type_(view_p->type_)
{
std::copy(view_p->data_, view_p->data_ + num_bytes_, data_.get());
}

template <typename T>
T* as_type()
{
return reinterpret_cast<T*>(data_.get());
}

template <typename T>
T const* as_type() const
{
return reinterpret_cast<T const*>(data_.get());
}

auto view()
{
return new cugraph_type_erased_host_array_view_t{data_.get(), size_, num_bytes_, type_};
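
The new constructor deep-copies the bytes behind a view into storage the array owns. The sketch below reproduces that pattern with simplified stand-in types: `host_array_view_t` and `host_array_t` are hypothetical and carry only the fields needed here, not the full C API structs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Simplified stand-in for the non-owning view type (assumption: fields only).
struct host_array_view_t {
  std::byte* data_;
  std::size_t size_;
  std::size_t num_bytes_;
};

// Simplified stand-in for the owning array type.
struct host_array_t {
  std::unique_ptr<std::byte[]> data_;
  std::size_t size_;
  std::size_t num_bytes_;

  // Deep copy: allocate owned storage, then copy the viewed bytes into it.
  explicit host_array_t(host_array_view_t const* view_p)
    : data_(std::make_unique<std::byte[]>(view_p->num_bytes_)),
      size_(view_p->size_),
      num_bytes_(view_p->num_bytes_)
  {
    std::copy(view_p->data_, view_p->data_ + num_bytes_, data_.get());
  }

  // Typed access to the owned bytes, mirroring as_type() in the diff.
  template <typename T>
  T* as_type()
  {
    return reinterpret_cast<T*>(data_.get());
  }

  template <typename T>
  T const* as_type() const
  {
    return reinterpret_cast<T const*>(data_.get());
  }
};
```

Because the array owns its copy, later writes through the original buffer do not affect it, which is the property a caller relies on when the view's lifetime is shorter than the array's.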
2 changes: 1 addition & 1 deletion cpp/src/c_api/graph_functions.cpp
@@ -84,7 +84,7 @@ struct create_vertex_pairs_functor : public cugraph::c_api::abstract_functor {
std::nullopt,
std::nullopt);
}
-
+// FIXME: use std::tuple (template) instead.
result_ = new cugraph::c_api::cugraph_vertex_pairs_t{
new cugraph::c_api::cugraph_type_erased_device_array_t(first_copy, graph_->vertex_type_),
new cugraph::c_api::cugraph_type_erased_device_array_t(second_copy, graph_->vertex_type_)};