-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Boolean Coercion #8331
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jayzhan211 -- this is looking pretty close. I left some suggestions -- let me know what you think
@@ -353,6 +354,20 @@ fn string_temporal_coercion( | |||
} | |||
} | |||
|
|||
/// Coerce `Boolean` to other larger types, like Numeric as `1` or String as "1" | |||
fn bool_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> { | |||
match (lhs_type, rhs_type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this also needs to check when rhs_type
is a DataType::Boolean
as well.
I would expect both of the following queries to work and return the same thing
sDataFusion CLI v33.0.0
❯ select 1 = true;
Error during planning: Cannot infer common argument type for comparison operation Int64 = Boolean
❯ select true = 1;
+--------------------------+
| Boolean(true) = Int64(1) |
+--------------------------+
| true |
+--------------------------+
1 row in set. Query took 0.006 seconds.
select make_array(1, 2, 3), make_array(1.0, 2.0, 3.0), make_array('h', 'e', 'l', 'l', 'o'); | ||
---- | ||
[1, 2, 3] [1.0, 2.0, 3.0] [h, e, l, l, o] | ||
query ?????? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add some tests for other uses of this logic (not just in make_array) such as comparisons
I notice that postgres doesn't handle boolean coercion
postgres=# select true = 1;
ERROR: operator does not exist: boolean = integer
LINE 1: select true = 1;
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
However, after this PR datafusion does:
DataFusion CLI v33.0.0
❯ select true = 1;
+--------------------------+
| Boolean(true) = Int64(1) |
+--------------------------+
| true |
+--------------------------+
1 row in set. Query took 0.006 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duckdb support boolean coercion, we can consider follow it. If it does not break the overall design, I think we can support it too.
@@ -828,7 +828,8 @@ mod tests { | |||
None, | |||
schema.as_ref(), | |||
); | |||
assert!(expr.is_err()); | |||
// expr is ok since boolean coercion is supported |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we lost the negative coverage here -- can. you please change the test back to checking that this is an error?
Perhaps you could change it to
CASE WHEN a = 'foo' THEN timestamp(500ns) WHEN a = 'bar' THEN true END
As bool --> timestamp coercion isn't supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jayzhan211 -- this is looking very close. I think the code looks 👨🍳 👌 and we just need a few more changes to the tests to avoid losing negative coverage
Thanks for pushing this
|
||
# bool_coercion | ||
query BBBB | ||
select 1 == true, false == 0, 1.0 == true, false == 'false'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
0c0a4be
to
c14820d
Compare
@@ -799,69 +799,6 @@ mod tests { | |||
Ok(batch) | |||
} | |||
|
|||
#[test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to slt
@@ -2190,18 +2190,6 @@ fn union_with_aliases() { | |||
quick_test(sql, expected); | |||
} | |||
|
|||
#[test] | |||
fn union_with_incompatible_data_types() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to slt
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @jayzhan211
@Dandandan / @andygrove / @viirya any concerns with supporting coercion from bool to other types? It seems like a good feature to me
such as:
I suggest to we'd better to keep same behavior with |
I agree it causes some confusing behavior. true is coerced to 1 or 'true' depends on other type.
Then, why do DuckDB introduce this 🤔 |
Per jakevin, let's not merge this until we sort out if it is a good idea
Which issue does this PR close?
Ref #8302
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?