This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Commit
Add in-place guarantees for segmented reduce
gevtushenko committed Jun 25, 2022
1 parent ecf6e76 commit 5732130
Showing 1 changed file with 34 additions and 9 deletions.
43 changes: 34 additions & 9 deletions cub/device/device_segmented_reduce.cuh
@@ -77,6 +77,11 @@ struct DeviceSegmentedReduce
* `segment_offsets` (of length `num_segments + 1`) can be aliased
* for both the `d_begin_offsets` and `d_end_offsets` parameters (where
* the latter is specified as `segment_offsets + 1`).
+ * - Let `s` be in `[0, num_segments)`. The range
+ * `[d_out + d_begin_offsets[s], d_out + d_end_offsets[s])` shall not
+ * overlap `[d_in + d_begin_offsets[s], d_in + d_end_offsets[s])`,
+ * `[d_begin_offsets, d_begin_offsets + num_segments)` nor
+ * `[d_end_offsets, d_end_offsets + num_segments)`.
* - @devicestorage
*
* @par Snippet
@@ -245,6 +250,11 @@ struct DeviceSegmentedReduce
* for both the `d_begin_offsets` and `d_end_offsets` parameters (where
* the latter is specified as `segment_offsets + 1`).
* - Does not support `+` operators that are non-commutative.
+ * - Let `s` be in `[0, num_segments)`. The range
+ * `[d_out + d_begin_offsets[s], d_out + d_end_offsets[s])` shall not
+ * overlap `[d_in + d_begin_offsets[s], d_in + d_end_offsets[s])`,
+ * `[d_begin_offsets, d_begin_offsets + num_segments)` nor
+ * `[d_end_offsets, d_end_offsets + num_segments)`.
* - @devicestorage
*
* @par Snippet
@@ -390,6 +400,11 @@ struct DeviceSegmentedReduce
* the `d_begin_offsets` and `d_end_offsets` parameters (where the latter is
* specified as `segment_offsets + 1`).
* - Does not support `<` operators that are non-commutative.
+ * - Let `s` be in `[0, num_segments)`. The range
+ * `[d_out + d_begin_offsets[s], d_out + d_end_offsets[s])` shall not
+ * overlap `[d_in + d_begin_offsets[s], d_in + d_end_offsets[s])`,
+ * `[d_begin_offsets, d_begin_offsets + num_segments)` nor
+ * `[d_end_offsets, d_end_offsets + num_segments)`.
* - @devicestorage
*
* @par Snippet
@@ -542,6 +557,11 @@ struct DeviceSegmentedReduce
* the `d_begin_offsets` and `d_end_offsets` parameters (where the latter
* is specified as `segment_offsets + 1`).
* - Does not support `<` operators that are non-commutative.
+ * - Let `s` be in `[0, num_segments)`. The range
+ * `[d_out + d_begin_offsets[s], d_out + d_end_offsets[s])` shall not
+ * overlap `[d_in + d_begin_offsets[s], d_in + d_end_offsets[s])`,
+ * `[d_begin_offsets, d_begin_offsets + num_segments)` nor
+ * `[d_end_offsets, d_end_offsets + num_segments)`.
* - @devicestorage
*
* @par Snippet
@@ -705,6 +725,11 @@ struct DeviceSegmentedReduce
* for both the `d_begin_offsets` and `d_end_offsets` parameters (where
* the latter is specified as `segment_offsets + 1`).
* - Does not support `>` operators that are non-commutative.
+ * - Let `s` be in `[0, num_segments)`. The range
+ * `[d_out + d_begin_offsets[s], d_out + d_end_offsets[s])` shall not
+ * overlap `[d_in + d_begin_offsets[s], d_in + d_end_offsets[s])`,
+ * `[d_begin_offsets, d_begin_offsets + num_segments)` nor
+ * `[d_end_offsets, d_end_offsets + num_segments)`.
* - @devicestorage
*
* @par Snippet
@@ -845,7 +870,7 @@ struct DeviceSegmentedReduce
* that item
*
* @par
- * - The output value type of `d_out` is cub::KeyValuePair `<int, T>`
+ * - The output value type of `d_out` is `cub::KeyValuePair<int, T>`
* (assuming the value type of `d_in` is `T`)
* - The maximum of the *i*<sup>th</sup> segment is written to
* `d_out[i].value` and its offset in that segment is written to
@@ -857,6 +882,11 @@ struct DeviceSegmentedReduce
* for both the `d_begin_offsets` and `d_end_offsets` parameters (where
* the latter is specified as `segment_offsets + 1`).
* - Does not support `>` operators that are non-commutative.
+ * - Let `s` be in `[0, num_segments)`. The range
+ * `[d_out + d_begin_offsets[s], d_out + d_end_offsets[s])` shall not
+ * overlap `[d_in + d_begin_offsets[s], d_in + d_end_offsets[s])`,
+ * `[d_begin_offsets, d_begin_offsets + num_segments)` nor
+ * `[d_end_offsets, d_end_offsets + num_segments)`.
* - @devicestorage
*
* @par Snippet
@@ -983,14 +1013,9 @@ struct DeviceSegmentedReduce

ArgIndexInputIteratorT d_indexed_in(d_in);

- // Initial value
- OutputTupleT initial_value(1, Traits<InputValueT>::Lowest()); // replace
- // with
- // std::numeric_limits<T>::lowest()
- // when C++11
- // support is
- // more
- // prevalent
+ // Initial value, replace with std::numeric_limits<T>::lowest() when C++11
+ // support is more prevalent
+ OutputTupleT initial_value(1, Traits<InputValueT>::Lowest());

return DispatchSegmentedReduce<ArgIndexInputIteratorT,
OutputIteratorT,
Expand Down
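
The non-overlap requirement added to each doc comment can be checked on the host before a call. The following is only a minimal sketch that reads the new wording literally. The names `ByteRange`, `make_range`, `overlaps`, and `satisfies_documented_ranges` are hypothetical and not part of CUB, and the sketch assumes the offset arrays are readable from host code, which is usually not true of device pointers without a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Half-open address range [begin, end), stored as integers so that no
// out-of-bounds pointer is ever formed.
struct ByteRange
{
  std::uintptr_t begin;
  std::uintptr_t end;
};

// Address range covering elements [first, last) of the array starting at base.
template <typename T>
ByteRange make_range(const T* base, std::size_t first, std::size_t last)
{
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(base);
  return ByteRange{addr + first * sizeof(T), addr + last * sizeof(T)};
}

// Two half-open ranges overlap only if both are non-empty and intersect.
inline bool overlaps(ByteRange a, ByteRange b)
{
  return a.begin < a.end && b.begin < b.end && a.begin < b.end && b.begin < a.end;
}

// Literal reading of the documented rule: for every segment s, the range
// [d_out + begin[s], d_out + end[s]) must not overlap the matching input
// range nor either offsets array. Assumes host-readable offsets.
template <typename InT, typename OutT, typename OffsetT>
bool satisfies_documented_ranges(const InT* d_in,
                                 const OutT* d_out,
                                 const OffsetT* d_begin_offsets,
                                 const OffsetT* d_end_offsets,
                                 int num_segments)
{
  assert(num_segments >= 0);
  const std::size_t n = static_cast<std::size_t>(num_segments);
  const ByteRange begins = make_range(d_begin_offsets, 0, n);
  const ByteRange ends   = make_range(d_end_offsets, 0, n);

  for (std::size_t s = 0; s < n; ++s)
  {
    const std::size_t first = static_cast<std::size_t>(d_begin_offsets[s]);
    const std::size_t last  = static_cast<std::size_t>(d_end_offsets[s]);
    const ByteRange out = make_range(d_out, first, last);
    const ByteRange in  = make_range(d_in, first, last);
    if (overlaps(out, in) || overlaps(out, begins) || overlaps(out, ends))
    {
      return false;
    }
  }
  return true;
}
```

With a single `segment_offsets` array aliased as described in the comments, the call would look like `satisfies_documented_ranges(in, out, segment_offsets, segment_offsets + 1, num_segments)`. Passing the input buffer as the output makes the check fail for any non-empty segment.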

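
The last hunk initializes the ArgMax reduction with key `1` and the lowest representable value. A host-side reference can show what that implies for an empty segment. This is a sketch under stated assumptions, not CUB's implementation: `KeyValue` is a hypothetical stand-in for `cub::KeyValuePair<int, T>`, `segmented_arg_max_reference` is an invented name, and breaking ties toward the smaller offset is an assumption of this sketch.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for cub::KeyValuePair<int, T>:
// key is the offset inside the segment, value is the maximum found there.
template <typename T>
struct KeyValue
{
  int key;
  T value;
};

// Sequential reference for a segmented ArgMax. segment_offsets has
// num_segments + 1 entries; entry s is the begin offset of segment s and
// entry s + 1 is its end offset (the aliasing described in the doc comments).
template <typename T>
std::vector<KeyValue<T>> segmented_arg_max_reference(const std::vector<T>& in,
                                                     const std::vector<int>& segment_offsets)
{
  const std::size_t num_segments =
    segment_offsets.empty() ? 0 : segment_offsets.size() - 1;
  std::vector<KeyValue<T>> out(num_segments);

  for (std::size_t s = 0; s < num_segments; ++s)
  {
    const int begin = segment_offsets[s];
    const int end   = segment_offsets[s + 1];

    // Same initial value as in the diff: key 1 and the lowest value of T.
    KeyValue<T> best{1, std::numeric_limits<T>::lowest()};
    for (int i = begin; i < end; ++i)
    {
      const int offset = i - begin;
      const T v = in[static_cast<std::size_t>(i)];
      // Assumed tie rule: equal values keep the smaller offset.
      if (v > best.value || (v == best.value && offset < best.key))
      {
        best = KeyValue<T>{offset, v};
      }
    }
    out[s] = best; // an empty segment keeps the initial value unchanged
  }
  return out;
}
```

For the input `{8, 6, 7, 5, 3, 0, 9}` with offsets `{0, 3, 3, 7}`, the middle segment is empty, so its result stays at the initial pair of key `1` and `std::numeric_limits<int>::lowest()`.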