Optimize nested unions #7481
Comments
Additionally, a union of 1 item could always just drop the Union. Before:
flowchart TD
A[union] --> B[scan]
After:
flowchart TD
B[scan]
|
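As a rough illustration of the single-input case, here is a minimal sketch on a toy plan type. `Plan` and `drop_unary_unions` are hypothetical names made up for this example; they are not DataFusion's `LogicalPlan` or any existing optimizer rule, and a real pass would also have to check that schemas are preserved.

```rust
// Toy plan type for illustration only (hypothetical, not DataFusion's API).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Union(Vec<Plan>),
}

/// Replace every `Union` that has exactly one input with that input.
fn drop_unary_unions(plan: Plan) -> Plan {
    match plan {
        Plan::Union(inputs) => {
            // Rewrite the children first so stacked unary unions collapse too.
            let mut inputs: Vec<Plan> =
                inputs.into_iter().map(drop_unary_unions).collect();
            if inputs.len() == 1 {
                // Exactly one input: the union node adds nothing, so drop it.
                inputs.pop().unwrap()
            } else {
                Plan::Union(inputs)
            }
        }
        // Leaf nodes are returned unchanged.
        other => other,
    }
}
```

With this sketch, `Union[scan]` becomes `scan`, while a union with two or more inputs is left as it is.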
@crepererum actually implemented these two physical optimizations in IOx: https://github.com/influxdata/influxdb_iox/tree/main/iox_query/src/physical_optimizer/union. It is probably a fairly straightforward exercise to copy them (and their tests) into DataFusion. Thus marking this as a good first issue, as the code already exists. |
@crepererum also notes that this could be a logical optimizer pass or a physical optimizer pass. IOx implemented it as a physical pass. |
Hi, I can do it. @universalmind303, can we create a unary union after parsing a SQL query? @alamb, I think that this logical optimization already exists in |
@maruschin -- I think that sounds like an excellent idea |
@alamb Everything is done, please look at the PR. |
Is your feature request related to a problem or challenge?
If your query contains many nested unions, it could result in an inefficient plan. A union of unions can easily be simplified to a single union node.
Describe the solution you'd like
Nested union nodes should be rewritten as a single union node.
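To make the proposed rewrite concrete, here is a minimal sketch on a toy plan type. `Plan` and `flatten_unions` are hypothetical names used only for illustration, not DataFusion's `LogicalPlan` or an existing rule. The sketch also assumes `UNION ALL` semantics (no deduplication between levels) and that all inputs share a compatible schema.

```rust
// Toy plan type for illustration only (hypothetical, not DataFusion's API).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Union(Vec<Plan>),
}

/// Inline every `Union` that is a direct input of another `Union`,
/// so a union of unions becomes one union over all the leaf inputs.
fn flatten_unions(plan: Plan) -> Plan {
    match plan {
        Plan::Union(inputs) => {
            let mut flat = Vec::with_capacity(inputs.len());
            for input in inputs {
                // Flatten the child first so arbitrarily deep nesting collapses.
                match flatten_unions(input) {
                    // Child is a union: splice its inputs into the parent.
                    Plan::Union(grandchildren) => flat.extend(grandchildren),
                    // Anything else is kept as a direct input.
                    other => flat.push(other),
                }
            }
            Plan::Union(flat)
        }
        // Leaf nodes are returned unchanged.
        other => other,
    }
}
```

For example, `Union[Union[a, Union[b]], c]` would be rewritten to `Union[a, b, c]`, with the input order preserved.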
Describe alternatives you've considered
None come to mind.
Additional context
For reference, polars optimizes these away.
pola-rs/polars#7855
pola-rs/polars#7861
https://github.com/influxdata/influxdb_iox/issues/7412