-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: common_subexpr_eliminate rule should not apply to short-circuit expression #8928
Conversation
@@ -1129,5 +1129,35 @@ FROM t AS A, (SELECT * FROM t WHERE x = 0) AS B; | |||
0 0 | |||
0 0 | |||
|
|||
query TT | |||
explain select coalesce(1, y/x), coalesce(2, y/x) from t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because of the reason describe in #8927, use plan to test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please add some comments here explaining the rationale and what the expected outputs are (so future readers know if changes are expected)
Something like this perhaps:
# Expressions that short circuit should not be refactored out as that may cause side effects (divide by zero)
# at plan time that would not actually happen during execution
Also, can you please add the tests that now pass (e.g. select coalesce(1, y/x), coalesce(2, y/x) from t;
) so if someone breaks this code by accident, those queries would start failing, which might be easier to quickly tell is incorrect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR @haohuaijin -- the code looks good to me 🙏
I also took a look at https://docs.rs/datafusion/latest/datafusion/logical_expr/enum.BuiltinScalarFunction.html# and https://docs.rs/datafusion/latest/datafusion/logical_expr/expr/enum.Expr.html and didn't see any additional functions that shoudl be marked as short circuiting.
I had a code structure suggestion that I don't think is needed (though I think it would improve this PR).
I think a few more tests that actually run (and don't error) should be added prior to merge.
Thanks again!
@@ -1129,5 +1129,35 @@ FROM t AS A, (SELECT * FROM t WHERE x = 0) AS B; | |||
0 0 | |||
0 0 | |||
|
|||
query TT | |||
explain select coalesce(1, y/x), coalesce(2, y/x) from t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please add some comments here explaining the rationale and what the expected outputs are (so future readers know if changes are expected)
Something like this perhaps:
# Expressions that short circuit should not be refactored out as that may cause side effects (divide by zero)
# at plan time that would not actually happen during execution
Also, can you please add the tests that now pass (e.g. select coalesce(1, y/x), coalesce(2, y/x) from t;
) so if someone breaks this code by accident, those queries would start failing, which might be easier to quickly tell is incorrect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @haohuaijin @alamb .
@alamb 's review suggestions are already very detailed, and I agree with his opinion. After those impromentent, we can merge this PR
Thanks @alamb and @jackwener's review and great suggestions. I apply those suggestions, but I'm unsure if the test I added is appropriate(560cb6c). I also add describe in #8928 (comment), to show what the change in this pr. |
ProjectionExec: expr=[y@1 = 0 OR 1 / CAST(y@1 AS Int64) < 1 as t.y = Int64(0) OR Int64(1) / t.y < Int64(1), x@0 = 0 OR y@1 = 0 OR 1 / CAST(y@1 AS Int64) < 1 / CAST(x@0 AS Int64) as t.x = Int64(0) OR t.y = Int64(0) OR Int64(1) / t.y < Int64(1) / t.x] | ||
--MemoryExec: partitions=1, partition_sizes=[1] | ||
|
||
# due to the reason describe in https://github.com/apache/arrow-datafusion/issues/8927, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure if those tests are appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's related with #8910
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @haohuaijin
matches!(op, Operator::And | Operator::Or) | ||
} | ||
Expr::Case { .. } => true, | ||
// Use explicit pattern match instead of a default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Thank you @haohuaijin and @jackwener -- let's merge this PR and we can work on #8910 as a follow on |
…expression (apache#8928) * fix: common_subexpr_eliminate rule should not apply to short-circuit expression * add more tests * format * minor * apply reviews * add some commont * fmt
… to v34 (#227) * fix: don't extract common sub expr in `CASE WHEN` clause (apache#8833) * fix: don't extract common sub expr in CASE WHEN clause * fix ci * fix * fix: common_subexpr_eliminate rule should not apply to short-circuit expression (apache#8928) * fix: common_subexpr_eliminate rule should not apply to short-circuit expression * add more tests * format * minor * apply reviews * add some commont * fmt --------- Co-authored-by: Huaijin <[email protected]>
Which issue does this PR close?
Closes #8874
Rationale for this change
see #8874
What changes are included in this PR?
before this pr
after this pr
But due to the reason describe in #8927, the above two query will get same result(get drive by zero error).
Are these changes tested?
yes, add some negative test
Are there any user-facing changes?