-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Enhance case expression type coercion #5820
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Jefffrey -- the code looks good to me
Prior to merge I think some additional tests are needed in this PR.
I also left some small code style suggestions, but I don't think they are required.
// ELSE b3 | ||
// END | ||
// | ||
// Then all aN (a1, a2, a3) must be converted to a common data type in the first example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
// prepare types | ||
let case_type = match &case.expr { | ||
None => Ok(None), | ||
Some(expr) => expr.get_type(&self.schema).map(Some), | ||
}?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can write this more concisely / functionally using transpose
like this if you want
// prepare types | |
let case_type = match &case.expr { | |
None => Ok(None), | |
Some(expr) => expr.get_type(&self.schema).map(Some), | |
}?; | |
// prepare types | |
let case_type = case.expr.as_ref() | |
.map(|expr| expr.get_type(&self.schema)) | |
.transpose()?; |
The same basic transformation applies to the other match
statements below -- I don't think it is a big deal I just figured I would point it out to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks that does look cleaner, will apply it
# select case when type coercion | ||
query I | ||
select CASE 10.5 WHEN 0 THEN 1 ELSE 10 END as col; | ||
---- | ||
10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some additional tests?
- Alternate Syntax:
select CASE
WHEN 10 = 5 THEN 1
WHEN 'true' THEN 1
ELSE 10
END as col;
Also negative cases like
select CASE 10.5 WHEN 'not a number' THEN 1 ELSE 10 END as col;
select CASE
WHEN 10 = 5 THEN 1
WHEN 'not boolean' THEN 1
ELSE 10
END as col;
select CASE
WHEN 10 = 5 THEN 1
WHEN 5 THEN 'not a number'
ELSE 10
END as col;
select CASE
WHEN 10 = 5 THEN 1
WHEN 5 THEN 0
ELSE 'goo'
END as col;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way in sqllogictests to change the config of DataFusion? Since for negative cases to produce error, would need to set datafusion.optimizer.skip_failed_rules
to false, unless you mean for the negative case to fail as in the original issue?
Otherwise can add the negative case as a unit test in actual code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you can use the SET
command to change config values. For example https://github.com/apache/arrow-datafusion/blob/7b90474adf6c1afbb4a5988a5677834b07347321/datafusion/core/tests/sqllogictests/test_files/information_schema.slt#L30
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing with sqllogictests turned out to be kinda finicky, as this PR is meant to enhance the type coercion, but not actually testing the actual casting itself (since some of those negative cases would pass the type coercion but fail at execution time due to actual contents of the data). So I figured it'd be easier to add more tests as unit tests for proper testing.
I've found another bug related to case expression casting, will raise issue and also fix as part of this PR since will be affecting similar code Edit: the bug #5828 Edit2: fix added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just think we need a few more tests and this PR will be ready to go
let when_value = as_boolean_array(&when_value) | ||
.expect("WHEN expression did not return a BooleanArray"); | ||
let when_value = as_boolean_array(&when_value).map_err(|e| { | ||
DataFusionError::Context( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah figured better to wrap in Context to hopefully provide more informative error message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me -- thank you @Jefffrey
.map(|expr| expr.get_type(&schema)) | ||
.transpose()?; | ||
|
||
// find common coercible types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this code really easy to read. Thank you @Jefffrey
Which issue does this PR close?
Closes #5538
Closes #5828
Rationale for this change
Given following:
Should attempt to coerce
a
,b
andc
to common type.And for:
Should coerce
a
andb
to boolean type.What changes are included in this PR?
Ensure
case
andwhen
expressions are coerced to common type, similar to how currentlythen
andelse
expressions are coerced to common type.Also ensure
when
expressions for whencase
expression isn't provided to coerce to boolean type.Are these changes tested?
Added to sqllogictest
Are there any user-facing changes?