-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add IS [NOT] DISTINCT FROM to SQL and join matchers. #14976
Conversation
Changes: 1) Add "isdistinctfrom" and "notdistinctfrom" native expressions. 2) Add "IS [NOT] DISTINCT FROM" to SQL. It uses the new native expressions when generating expressions, and is treated the same as equals and not-equals when generating native filters on literals. 3) Update join matchers to have an "includeNull" parameter that determines whether we are operating in "equals" mode or "is not distinct from" mode.
* SQL function "x IS NOT DISTINCT FROM y". Very similar to "x = y", i.e. {@link BinEqExpr}, except this function | ||
* never returns null, and this function considers NULL itself to be not-distinct-from NULL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function considers NULL itself to be not-distinct-from NULL
- this part is a bit unclear. How about
this function considers NULL as a value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, maybe adding an example helps too. I changed it to this:
/**
* SQL function "x IS NOT DISTINCT FROM y". Very similar to "x = y", i.e. {@link BinEqExpr}, except this function
* never returns null, and this function considers NULL as a value, so NULL itself is not-distinct-from NULL. For
* example: `x == null` returns `null` in SQL-compatible null handling mode, but `notdistinctfrom(x, null)` is
* true if `x` is null.
*/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good.
@Override | ||
public String name() | ||
{ | ||
return "notdistinctfrom"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return "notdistinctfrom"; | |
return "is_not_distinct_from"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eh, it is consistent with isnull
/notnull
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's consistent with the SQL expression though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was modeling the names after isnull
and notnull
. Like,
- SQL
IS NULL
-> nativeisnull
- SQL
IS NOT NULL
-> nativenotnull
(drop the "is")
I would prefer to keep this whole family of functions consistent with each other. IMO if we also want consistency with SQL then we should make new aliases for isnull
and notnull
too. That'd be fine by me, fwiw, although I don't personally feel it's necessary.
if (leftVal.isNumericNull() || rightVal.isNumericNull()) { | ||
return ExprEval.ofLongBoolean(leftVal.isNumericNull() && rightVal.isNumericNull()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious why this wasn't done if the comparison type was Long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it doesn't really matter because its covered by the leftVal.value() == null || rightVal.value == null check that happens before we get here.
This default case is also handling array and complex types, which .. is probably not super cool since the former always claims to be numeric null unless it contains only a single element which can be converted to or is a number, and the latter always claims to be numeric null, but maybe ok if we document the behavior doesn't work with these types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this part was weird too, but didn't change it (just copied and adapted the code for ==
). I added this comment to make it clear where the code comes from:
// Code copied and adapted from BinaryBooleanOpExprBase and BinEqExpr.
// The code isn't shared due to differences in code structure: BinaryBooleanOpExprBase + BinEqExpr have logic
// interleaved between parent and child class, but we can't use BinaryBooleanOpExprBase as a parent here, because
// (a) this is a function, not an expr; and (b) our logic for handling and returning nulls is different from most
// binary exprs, where null in means null out.
@Override | ||
public String name() | ||
{ | ||
return "isdistinctfrom"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return "isdistinctfrom"; | |
return "is_distinct_from"; |
@Override | ||
public String name() | ||
{ | ||
return "notdistinctfrom"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eh, it is consistent with isnull
/notnull
if (leftVal.isNumericNull() || rightVal.isNumericNull()) { | ||
return ExprEval.ofLongBoolean(leftVal.isNumericNull() && rightVal.isNumericNull()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it doesn't really matter because its covered by the leftVal.value() == null || rightVal.value == null check that happens before we get here.
This default case is also handling array and complex types, which .. is probably not super cool since the former always claims to be numeric null unless it contains only a single element which can be converted to or is a number, and the latter always claims to be numeric null, but maybe ok if we document the behavior doesn't work with these types.
case LONG: | ||
return ExprEval.ofLongBoolean(leftVal.asLong() == rightVal.asLong()); | ||
case DOUBLE: | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, the comparison expressions all handle ARRAY types correctly, should we add it here too? See evalArray
methods on the various BinaryBooleanOpExprBase
implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add.
- Add ARRAY handling to "notdistinctfrom" and "isdistinctfrom". - Include null in pushed-down filters when using "notdistinctfrom" in a join. Other changes: - Adjust join filter analyzer to more explicitly use InDimFilter's ValuesSets, relying less on remembering to get it right to avoid copies.
Pushed a new commit that does:
|
Changes:
Add "isdistinctfrom" and "notdistinctfrom" native expressions.
Add "IS [NOT] DISTINCT FROM" to SQL. It uses the new native expressions
when generating expressions, and is treated the same as equals and
not-equals when generating native filters on literals.
Update join matchers to have an "includeNull" parameter that determines
whether we are operating in "equals" mode or "is not distinct from"
mode.