feat: Allow for more RG skipping by rewriting expr in planner #20828

Merged
coastalwhite merged 26 commits into pola-rs:main from feat/pred-to-skip-batches-expr
Jan 30, 2025

coastalwhite
Collaborator

This is a draft PR to show the idea of rewriting the predicate expression in the planner into an expression that can be used to skip record batches (row groups in Parquet).

At the moment this is probably already close to what the existing implementation can do, and it goes slightly further to show that the approach is better (by implementing `Expr.is_in`).
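To illustrate the idea, here is a minimal pure-Python sketch, not the Polars implementation. It assumes a predicate is a hypothetical `(column, op, value)` tuple and that each row group exposes per-column min/max statistics through a hypothetical `Stats` record. `to_skip_expr` rewrites the predicate once, at planning time, into a function over statistics that returns `True` when no row in the group can match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stats:
    """Hypothetical per-row-group statistics for a single column."""
    min: int
    max: int


def to_skip_expr(pred):
    """Rewrite a (column, op, value) predicate into a skip expression.

    The returned function takes a {column: Stats} mapping and returns True
    when the statistics prove that no row in the group satisfies `pred`.
    """
    col, op, value = pred
    if op == ">":
        # No row exceeds `value` if even the maximum does not.
        return lambda st: st[col].max <= value
    if op == "<":
        return lambda st: st[col].min >= value
    if op == "==":
        return lambda st: value < st[col].min or value > st[col].max
    if op == "is_in":
        # Skip only if every candidate lies outside [min, max].
        return lambda st: all(v < st[col].min or v > st[col].max for v in value)
    # Unknown operator: be conservative and never skip.
    return lambda st: False


# Three illustrative row groups with made-up statistics for column "a".
row_groups = [
    {"a": Stats(0, 9)},
    {"a": Stats(10, 19)},
    {"a": Stats(20, 29)},
]


def surviving(pred):
    """Indices of row groups that still have to be read for `pred`."""
    skip = to_skip_expr(pred)  # built once, evaluated per row group
    return [i for i, st in enumerate(row_groups) if not skip(st)]
```

With these made-up statistics, `surviving(("a", "is_in", [3, 25]))` keeps only the first and last row groups, since neither value can fall inside the middle group's range. The skip expression is deliberately conservative: it may fail to skip a group that contains no match, but it never skips a group that could contain one.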

@github-actions github-actions bot added the enhancement, python, and rust labels Jan 21, 2025
@coastalwhite
Collaborator Author

This is now basically done, only a few more things need to happen:

  • Port changes to the new and old streaming engines
  • Achieve full feature parity with the old statistics evaluator

@coastalwhite coastalwhite force-pushed the feat/pred-to-skip-batches-expr branch 2 times, most recently from 1716ad7 to 0e6f163 Compare January 23, 2025 15:05
@coastalwhite coastalwhite marked this pull request as ready for review January 24, 2025 11:05
@coastalwhite coastalwhite force-pushed the feat/pred-to-skip-batches-expr branch from 52aa4a4 to 6304cb1 Compare January 24, 2025 15:49
@coastalwhite coastalwhite force-pushed the feat/pred-to-skip-batches-expr branch from b60cbc8 to 1f79264 Compare January 27, 2025 08:55
codecov bot commented Jan 27, 2025

Codecov Report

Attention: Patch coverage is 62.42991% with 201 lines in your changes missing coverage. Please review.

Project coverage is 79.29%. Comparing base (176268e) to head (f67b6e4).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/polars-mem-engine/src/predicate.rs | 49.19% | 95 Missing ⚠️ |
| ...polars-mem-engine/src/executors/multi_file_scan.rs | 0.00% | 54 Missing ⚠️ |
| ...polars-stream/src/nodes/io_sources/parquet/init.rs | 0.00% | 7 Missing ⚠️ |
| crates/polars-io/src/parquet/read/read_impl.rs | 78.57% | 6 Missing ⚠️ |
| crates/polars-python/src/cloud.rs | 0.00% | 6 Missing ⚠️ |
| ...m/src/nodes/io_sources/parquet/row_group_decode.rs | 0.00% | 5 Missing ⚠️ |
| crates/polars-core/src/scalar/mod.rs | 42.85% | 4 Missing ⚠️ |
| crates/polars-mem-engine/src/executors/scan/csv.rs | 57.14% | 3 Missing ⚠️ |
| ...es/polars-mem-engine/src/executors/scan/parquet.rs | 84.21% | 3 Missing ⚠️ |
| crates/polars-plan/src/plans/lit.rs | 57.14% | 3 Missing ⚠️ |
| ... and 8 more | | |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #20828      +/-   ##
==========================================
- Coverage   79.30%   79.29%   -0.01%     
==========================================
  Files        1578     1581       +3     
  Lines      224109   224288     +179     
  Branches     2574     2574              
==========================================
+ Hits       177728   177856     +128     
- Misses      45790    45841      +51     
  Partials      591      591              


@coastalwhite coastalwhite merged commit 2eaee18 into pola-rs:main Jan 30, 2025
26 of 27 checks passed
@coastalwhite coastalwhite deleted the feat/pred-to-skip-batches-expr branch January 30, 2025 09:17
@c-peters c-peters added the accepted label Feb 3, 2025