Refactor bit counting APIs, introduce valid/null count functions, and split host/device side code for segmented counts. #9588
Conversation
For clarity's sake, your suggestion is to change offsets to
I'm trying to understand why we'd need support for overlapping ranges. Supporting overlapping ranges might be useful to speed up certain operations.
I meant the opposite -- I hoped to change this specific API.
I bet `slice` could be changed as well, although that's a pretty delicate function to break :) For what it's worth, I'm all for changing the interface to use offsets in the more standard sense.
What we should really do is make the implementation take iterators for the start/stop points. You can generate those from either 2N or N+1 offsets with transform iterators.
To clarify, you mean that the implementation would take two independent iterators as arguments: an iterator of "start" indices and an iterator of "end" indices? Such that (very loosely speaking) the i-th segment would span [starts[i], ends[i])?
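For illustration, here is a minimal sketch of how such start/end iterators could be built with Thrust transform iterators from either offset layout. The pointer name `d_offsets` and the helper function are assumptions for this example (compiled with nvcc's `--extended-lambda`), not code from the PR:

```cpp
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <cstddef>
#include <utility>

// N+1 layout: segment i spans [d_offsets[i], d_offsets[i+1]),
// so starts = d_offsets and ends = d_offsets + 1 work directly.
//
// 2N layout: segment i spans [d_offsets[2*i], d_offsets[2*i+1]).
// Transform iterators over a counting iterator pick out the
// even (start) and odd (end) entries.
auto make_2n_iterators(int const* d_offsets)
{
  auto idx    = thrust::make_counting_iterator<std::size_t>(0);
  auto starts = thrust::make_transform_iterator(
    idx, [d_offsets] __device__(std::size_t i) { return d_offsets[2 * i]; });
  auto ends   = thrust::make_transform_iterator(
    idx, [d_offsets] __device__(std::size_t i) { return d_offsets[2 * i + 1]; });
  return std::make_pair(starts, ends);
}
```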
Codecov Report
@@ Coverage Diff @@
## branch-22.02 #9588 +/- ##
================================================
- Coverage 10.49% 10.42% -0.07%
================================================
Files 119 119
Lines 20305 20479 +174
================================================
+ Hits 2130 2134 +4
- Misses 18175 18345 +170
Continue to review full report at Codecov.
We definitely cannot change
Yes, that would match CUB segmented reduce.
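For reference, CUB's segmented reduce already takes independent begin/end offset iterators, so either layout would slot in once adapted as above. A rough sketch of the usual two-phase call pattern (the `segmented_sum` wrapper and its parameter names are placeholders of mine, not PR code):

```cpp
#include <cub/device/device_segmented_reduce.cuh>
#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_buffer.hpp>
#include <cstddef>

// Sum each segment of `d_in` into `d_out`. The begin/end offsets are
// independent iterators, so 2N and N+1 layouts both work once adapted.
template <typename InIt, typename OutIt, typename BeginIt, typename EndIt>
void segmented_sum(InIt d_in, OutIt d_out, int num_segments,
                   BeginIt d_begin, EndIt d_end)
{
  // Phase 1: query the required scratch space.
  std::size_t temp_bytes = 0;
  cub::DeviceSegmentedReduce::Sum(nullptr, temp_bytes, d_in, d_out,
                                  num_segments, d_begin, d_end);
  // Phase 2: allocate scratch space and run the reduction.
  rmm::device_buffer temp{temp_bytes, rmm::cuda_stream_default};
  cub::DeviceSegmentedReduce::Sum(temp.data(), temp_bytes, d_in, d_out,
                                  num_segments, d_begin, d_end);
}
```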
As is, this PR will preclude using iterators for the start/end indices by moving everything to a
@jrhemstad Good points. I'll think some more on this and try to come up with a better approach.
stream.synchronize();  // now ret is valid.
struct index_alternator {
There are at least three places in this PR where the code would benefit from a strided iterator (input validation in `validate_segmented_indices`, constructing first/last index iterators from the iterator of segmented indices in `segmented_count_bits`, and constructing a vector of segment lengths in `segmented_valid_count`). I would like to work on https://github.com/NVIDIA/thrust/issues/912 in the near future. I might introduce a strided iterator into `cudf/detail/iterator.cuh` to test the design before working on a Thrust implementation.
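To make the idea concrete, a strided iterator along those lines can already be composed from existing Thrust fancy iterators. A minimal sketch, with a helper name of my own choosing (not the eventual `cudf/detail/iterator.cuh` or Thrust design):

```cpp
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <cstddef>

// Visits elements first, first + stride, first + 2*stride, ... of `it`.
// With stride = 2, first = 0 this yields the "begin" entries of a 2N
// index array; stride = 2, first = 1 yields the "end" entries.
template <typename Iterator>
auto make_strided_iterator(Iterator it, std::size_t first, std::size_t stride)
{
  auto strided_index = thrust::make_transform_iterator(
    thrust::make_counting_iterator<std::size_t>(0),
    [first, stride] __device__(std::size_t i) { return first + i * stride; });
  return thrust::make_permutation_iterator(it, strided_index);
}
```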
Some minor nitpicks. LGTM!
@gpucibot merge
Fixes #9164.

### Prelude

`lists::contains()` (introduced in #7039) returns a `BOOL8` column, indicating whether the specified search_key(s) exist at all in each corresponding list row of an input LIST column. It does not return the actual position.

### `index_of()`

This commit introduces `lists::index_of()`, to return the INT32 positions of the specified search_key(s) in a LIST column. The search keys may be searched for using either `FIND_FIRST` (which finds the position of the first occurrence) or `FIND_LAST` (which finds the last occurrence). Both column_view and scalar search keys are supported. As with `lists::contains()`, nested types are not supported as search keys in `lists::index_of()`.

If the search_key cannot be found, that output row is set to `-1`. Additionally, the row `output[i]` is set to null if:

1. The `search_key` (scalar) or `search_keys[i]` (column_view) is null.
2. The list row `lists[i]` is null.

In all other cases, `output[i]` should contain a non-negative value.

### Semantic changes for `lists::contains()`

This commit also modifies the semantics of `lists::contains()`: it will now return nulls only for the following cases:

1. The `search_key` (scalar) or `search_keys[i]` (column_view) is null.
2. The list row `lists[i]` is null.

In all other cases, a non-null bool is returned. Specifically, `lists::contains()` no longer conforms to the SQL semantics of returning `NULL` for list rows that don't contain the search key while simultaneously containing nulls. In this case, `false` is returned.

### `lists::contains_null_elements()`

A new function has been introduced to check if each list row contains null elements. The semantics are similar to `lists::contains()`, in that the returned column is BOOL8 typed:

1. If even one element in a list row is null, the returned row is `true`.
2. If no element is null, the returned row is `false`.
3. If the list row is null, the returned row is `null`.
4. If the list row is empty, the returned row is `false`.

The current implementation is an inefficient placeholder, to be replaced once #9588 is available. It is included here to reconstruct the SQL semantics dropped from `lists::contains()`.

Authors:
- MithunR (https://github.com/mythrocks)

Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Jason Lowe (https://github.com/jlowe)
- Mark Harris (https://github.com/harrism)
- Conor Hoekstra (https://github.com/codereport)

URL: #9510
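To make the revised semantics concrete, a small worked example (the input values here are my own illustration, not taken from the PR's tests):

```cpp
// lists      = [ {1, 2, 1}, {4, 5}, null, {6, null} ]
// search_key = 1 (scalar)
//
// lists::index_of(lists, search_key, FIND_FIRST) -> [    0,    -1, null,    -1 ]
// lists::index_of(lists, search_key, FIND_LAST)  -> [    2,    -1, null,    -1 ]
// lists::contains(lists, search_key)             -> [ true, false, null, false ]
//   (the last row is false rather than null: the key is absent and the
//    row contains nulls, but SQL-style null propagation no longer applies)
// lists::contains_null_elements(lists)           -> [ false, false, null,  true ]
```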
This PR is a first step towards #9552. I split the host and device code for `segmented_count_set_bits`, so that APIs with existing indices on the device can call the device function directly. Along the way, I discussed some pain points in the current design of the bit counting functions with @jrhemstad and @mythrocks and made the following changes:

- Fixed inconsistencies in the behavior of the bit counting functions (`count_set_bits` returned zero when `bitmask == nullptr`, but `segmented_count_set_bits` returned segment lengths) by raising an error when `bitmask == nullptr` (breaking change).
- Moved all bit counting functions to detail namespaces (breaking change).
- Separated "validity" logic from pure bit counting through the introduction of new `valid_count`, `null_count`, `segmented_valid_count`, and `segmented_null_count` detail functions. These functions treat all elements as valid when `bitmask == nullptr` and otherwise return the same results as the corresponding bit counting functions. (See the sketch after this list.)
- Split device-side code from host-side code to enable future work on segmented null mask reductions over lists, with indices (offsets) already on the device.
- Updated all calling code (and added tests) to use the new valid/null count detail functions.
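As mentioned in the list above, here is a rough sketch of how the new validity functions relate to the bit counting primitives; the signatures are assumptions for illustration and may differ from the PR:

```cpp
#include <cudf/detail/null_mask.hpp>
#include <cudf/types.hpp>
#include <rmm/cuda_stream_view.hpp>

namespace sketch {

// All elements count as valid when there is no null mask; otherwise
// defer to the pure bit counting function, which now raises an error
// (rather than returning 0) when given a null bitmask.
cudf::size_type valid_count(cudf::bitmask_type const* bitmask,
                            cudf::size_type start,
                            cudf::size_type stop,
                            rmm::cuda_stream_view stream)
{
  if (bitmask == nullptr) { return stop - start; }
  return cudf::detail::count_set_bits(bitmask, start, stop, stream);
}

// The null count is the complement of the valid count over the range.
cudf::size_type null_count(cudf::bitmask_type const* bitmask,
                           cudf::size_type start,
                           cudf::size_type stop,
                           rmm::cuda_stream_view stream)
{
  return (stop - start) - valid_count(bitmask, start, stop, stream);
}

}  // namespace sketch
```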