-
Notifications
You must be signed in to change notification settings - Fork 847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specialize filter kernel for binary arrays (#2969) #2971
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐎 very nice @tustvold
Just to be clear, I think this adds the ability to filter Binary and LargeBinary arrays and we already had specialized kernels for Utf8 and
LargeUtf8`
@@ -626,17 +626,17 @@ where | |||
/// | |||
/// Note: NULLs with a non-zero slot length in `array` will have the corresponding | |||
/// data copied across. This allows handling the null mask separately from the data | |||
fn filter_string<OffsetSize>( | |||
array: &GenericStringArray<OffsetSize>, | |||
fn filter_bytes<T>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment a few lines above is now out of date:
///
filter
implementation for string arrays
Should be:
///
filter
implementation for byte arrays
@@ -626,17 +626,17 @@ where | |||
/// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not shown in the diff, but the comment is out-of-dated:
/// `filter` implementation for string arrays
Benchmark runs are scheduled for baseline = c7f97c2 and contender = 62e878e. 62e878e is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Part of #2969
Rationale for this change
Should make filtering byte arrays significantly faster, we actually lacked a specialized implementation for them
What changes are included in this PR?
Are there any user-facing changes?