-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: struct field don't push down to TableScan #8774
Conversation
@@ -149,7 +149,7 @@ select count(*) from m1 where tag_id = '1000' and time < '2024-01-03T14:46:35+01 | |||
10 | |||
|
|||
query RRR | |||
select min(f5), max(f5), avg(f5) from m2 where tag_id = '1000' and time < '2024-01-03T14:46:35+01:00' group by type; | |||
select min(f5), max(f5), avg(f5) from m2 where tag_id = '1000' and time < '2024-01-03T14:46:35+01:00' group by type order by min(f5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just make result stable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A different fix is here: #8769
the root reason maybe in below code I'm not sure why we handle it this way. I tried commenting out this line, and all the tests still passed(the only test don't pass is not related to this, and I comment above in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @haohuaijin -- given this PR fixes the bug and includes good test coverage ❤️ I think we could merge it as is.
However, I had some suggestions to improve the code even more -- do you think we should try to do those as part of this PR or maybe as a follow on PR?
| Expr::IsNotUnknown(expr) | ||
| Expr::IsNotNull(expr) | ||
| Expr::IsNull(expr) | ||
| Expr::Negative(expr) => outer_columns_helper(expr, columns), | ||
Expr::Column(_) | Expr::Literal(_) | Expr::Wildcard { .. } => true, | ||
_ => false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would you think about removing this catchall (and and thus explicitly enumerating all Expr types so this type of bug can't happen again)?
_ => false, |
An alternate idea would be to remove this explicit walk down the tree entirely and use https://docs.rs/datafusion/latest/datafusion/logical_expr/utils/fn.expr_to_columns.html instead?
This pushdown code may predate the more generaic tree walks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Already removed the catchall.
An alternate idea would be to remove this explicit walk down the tree entirely and use https://docs.rs/datafusion/latest/datafusion/logical_expr/utils/fn.expr_to_columns.html instead?
the outer_columns_helper
function is used to collect all outer_ref_column
that expr_to_columns
function doesn't return these columns. Because expr_to_columns
use TreeNode
to walks the expr tree, but TreeNode don't walk the outer_ref_column
https://github.com/apache/arrow-datafusion/blob/dd4263f843e093c807d63edf73a571b1ba2669b5/datafusion/expr/src/tree_node/expr.rs#L69-L77
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the outer_columns_helper function is used to collect all outer_ref_column that expr_to_columns function doesn't return these columns. Because expr_to_columns use TreeNode to walks the expr tree, but TreeNode don't walk the outer_ref_column
I see -- thank you for the explanation, which makes sense. It fiddled a little with the code, and I think I have a cleanup here that uses the TreeNode
walking to avoid having to enumerate Expr
variants #8787
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @haohuaijin
| Expr::IsNotNull(expr) | ||
| Expr::IsNull(expr) | ||
| Expr::Negative(expr) => outer_columns_helper(expr, columns), | ||
Expr::Column(_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Which issue does this PR close?
Closes #8735
Rationale for this change
when debug #8735, I find not only the case in #8735 don't push down, we have other case don't push down
for example
What changes are included in this PR?
add more case to
outer_columns_helper
.Are these changes tested?
yes, add more tests
Are there any user-facing changes?