Keep track of common sub-expression across logical plan nodes #9576
Comments
I can do this one.
One can use the following query to generate the table:
Hey @mustafasrepo, the first plan gets a new Projection because the Rewriter marks `c3+c4` twice, so it decides the expression needs an extra Projection layer. However, I have run into two problems and hope you can give me an answer. Currently, it seems I have two ways to implement this feature:
The current approach is to use a projection to calculate a complex expression only if it is used at least twice (otherwise the projection is deemed unnecessary). Hence, the first option wouldn't work in this case. The other approach may work; however, it may place the projection in a sub-optimal spot. As an example, consider the following plan,
with the second approach you might produce the plan below (still better than the current behaviour, but sub-optimal)
where I used `a+b` to distinguish it from the binary expression.
Hence, I think the best approach is to traverse the plan from top to bottom, keeping cumulative complex-expression counts in the plan. For the plan below
This would produce
Then, after constructing the above tree, a bottom-up traversal can generate the following plan
by placing projections that calculate the common expressions used more than once by subsequent stages. However, I presume this would involve a lot of work.
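The two-pass idea above can be sketched on a toy model. This is a minimal illustration and not DataFusion's implementation: `Node`, `cumulative_counts`, `place_projections`, and `demo_plan` are hypothetical names, the plan is assumed to be a single linear chain stored root first, and expressions are plain strings rather than real `Expr` values.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a plan node: a label plus the complex
/// expressions that node evaluates.
struct Node {
    name: &'static str,
    exprs: Vec<&'static str>,
}

/// Top-down pass (`plan[0]` is the root): for every node, record how often
/// each expression is used by that node and by all nodes above it.
fn cumulative_counts(plan: &[Node]) -> Vec<BTreeMap<&'static str, usize>> {
    let mut running: BTreeMap<&'static str, usize> = BTreeMap::new();
    let mut out = Vec::with_capacity(plan.len());
    for node in plan {
        for e in &node.exprs {
            *running.entry(*e).or_insert(0) += 1;
        }
        out.push(running.clone());
    }
    out
}

/// Bottom-up pass: walk from the leaf to the root and put a projection
/// directly below the lowest node that uses an expression whose cumulative
/// count is at least 2. Returns the rewritten plan as labels, root first.
fn place_projections(plan: &[Node]) -> Vec<String> {
    let counts = cumulative_counts(plan);
    let mut cached: BTreeSet<&'static str> = BTreeSet::new();
    let mut leaf_first: Vec<String> = Vec::new();
    for (node, count) in plan.iter().zip(&counts).rev() {
        let mut shared: Vec<&'static str> = Vec::new();
        for e in &node.exprs {
            // `insert` returns false if the expression is already cached,
            // so each expression is projected at most once.
            if count[e] >= 2 && cached.insert(*e) {
                shared.push(*e);
            }
        }
        if !shared.is_empty() {
            leaf_first.push(format!("Projection: {}", shared.join(", ")));
        }
        leaf_first.push(node.name.to_string());
    }
    leaf_first.reverse();
    leaf_first
}

/// Made-up example plan: two window nodes share `c3+c4`; `c1*2` is used once.
fn demo_plan() -> Vec<Node> {
    vec![
        Node { name: "WindowAggr (outer)", exprs: vec!["c3+c4"] },
        Node { name: "WindowAggr (inner)", exprs: vec!["c3+c4", "c1*2"] },
        Node { name: "TableScan", exprs: vec![] },
    ]
}
```

For `demo_plan()`, `place_projections` returns `WindowAggr (outer)`, `WindowAggr (inner)`, `Projection: c3+c4`, `TableScan`: the shared expression is computed once below the lowest node that uses it, while `c1*2` gets no projection because it is used only once. A real rule would also have to handle branching plans and check that the projection's inputs are available at the chosen spot.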
Got it, thanks for your solutions. I plan to implement this today.
Is your feature request related to a problem or challenge?
No response
Describe the solution you'd like
Currently, the `CommonSubexprEliminate` `LogicalPlan` optimizer rule analyzes common sub-expressions in a query. It then caches each common sub-expression by adding a `LogicalPlan::Projection` if it thinks this is beneficial.
As an example, the following query
generates the following `LogicalPlan`:
where `t.c3+t.c4` is calculated once in the `Projection` and then referred to by the subsequent `WindowAggr` as a column.
However, the following query:
generates the following `LogicalPlan`:
instead we could generate the following plan:
If we were to keep track of common sub-expression counts globally across different nodes in the `LogicalPlan`, this would enable us to generate better `LogicalPlan`s.
Describe alternatives you've considered
No response
Additional context
No response