-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POC Make BloomFilter application general, add PruningPredicate::contains
#8397
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Rewrite BloomFilterPruningPredicate in terms of BloomFilterPruningPredicate
alamb
force-pushed
the
alamb/general_bloom_filter
branch
from
December 4, 2023 21:28
298da7f
to
76ee618
Compare
This was referenced Dec 6, 2023
The pieces of this PR are in, so closing this PR and focusing on merging the parts |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
POC for #8376
Rationale for this change
See #8376
TLDR is I would like to use
PruningPredicate
for "bloom filter like" operations downstream in IOx, and thus I would like to refactor the code that applies Bloom Filters today into the PruningPredicate.What changes are included in this PR?
PruningStatistics::contains
API proposed in Support general pruning based on<col> = 'const'
inPruningPredicate
#8376LiteralGuarantee
for extractingcol = <const>
predicates, along with many new testsBloomFilter
code to use the new predicate APITODO
IN
andNOT IN
exprs. SupportIN
lists with more than three constants in predicates for bloom filters #8436Are these changes tested?
IN
andNOT IN
)Are there any user-facing changes?
The PruningPredicate has a new API
Plans for merging
I do not plan to propose merging this PR as is. Instead I plan a series of PRs:
contains
API with tests🤔 musing: At the moment, the row group pruning happens in two phases -- first the min/max statistics are applied, and then the bloom filters are applied which means the potentially bloom filters are only fetched for groups that passed the first predicate.