-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for non-fallthrough null operations in map/flatMap/exists #1302
Conversation
It's interesting to think about the potential issues with AST replacement and simplification. Say that I have a statement like this: query[Holder].map(h => h.value.map(_ + "foo").map(_ + "bar")) Let's have a look at the particular statement: h.value.map(_ + "foo") Which should become: -- AST: if(h.value != null) h.value + "foo" else null
CASE WHEN h.value IS NOT NULL THEN h.value + 'foo' ELSE null (Note that SQL concat is typically not Then when you add the additional -- AST: if(
-- if(h.value != null) h.value + "foo" else null != null)
-- if(h.value != null) h.value + "foo" else null
-- + "bar"
-- else null
CASE WHEN
CASE WHEN h.value IS NOT NULL THEN h.value || 'foo' ELSE null END IS NOT NULL THEN
CASE WHEN h.value IS NOT NULL THEN h.value || 'foo' ELSE null END
+ 'bar'
ELSE null END It's merely the same pattern recursed. This is also how Slick behaves. Now, I introduced the "Right Hand Rule" in SELECT
CASE WHEN h.value IS NOT NULL AND (h.value || 'foo') IS NOT NULL THEN
CASE WHEN h.value IS NOT NULL THEN h.value || 'foo' ELSE null END
+ 'bar'
ELSE null END
FROM Holder h Unfortunately however, the repetition of if(
if(h.value != null) h.value + "foo" else null != null)
/*arbitrary ast*/ if(h.value != null) h.value + "foo" else null /*arbitrary ast*/
else null
// (transform to) =>
if(
if(h.value != null) h.value + "foo" else null != null)
h.value + "foo"
else null ... is very, very difficult to pull-off since multiple layers of nesting, binary operators, etc... can all be holding values of Due to all this, I will not implement "if statement simplification" beyond what I have done already unless there is significant user demand because of the potential complexity of going any further, and the potential brittleness that can result as a consequence. Just verified that Slick behaves exactly the same way as this, which seems to be all the more reason to move forward with this approach. Edit: Comment about Slick |
0780d34
to
27f033b
Compare
Hey @fwbrasil. This is ready to review now. The build is failing because of #1307 which is an unrelated build issue. #1310 is the same as this commit with some additional tests. I changed the way |
@@ -0,0 +1,74 @@ | |||
package io.getquill.ast | |||
|
|||
import io.getquill.ast.{ BinaryOperation => B } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to avoid these aliases, they make the code harder to follow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
835a2b1
to
9161729
Compare
.gitignore
Outdated
@@ -12,8 +12,11 @@ Bug.scala | |||
*.gz | |||
quill-jdbc/io/ | |||
quill-sql/io/ | |||
/io/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these changes necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah. Not really. I can take out.
case OptionIsEmpty(ast) => stmt"${ast.token}.isEmpty" | ||
case OptionNonEmpty(ast) => stmt"${ast.token}.nonEmpty" | ||
case OptionIsDefined(ast) => stmt"${ast.token}.isDefined" | ||
case UncheckedOptionFlatMap(ast, alias, body) => stmt"${ast.token}.flatMapUnchecked((${alias.token}) => ${body.token})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should have the *Unchecked
methods. A better way of doing this would be allowing users to use option.get
, but I think we should avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned with potential performance issues that might be introduced from forcing our users to have case statements in their where
clauses whereever there are nullable fields. Say that they just want to do:
case class Person(name:String, age:Option[Int])
query[Person].filter(p => p.age.map(_ + 1).exists(_ > 18))
The way it is now it's just:
select ... from Person p where p.age + 1 > 18
Now with this change it would be:
select ... from Person p where
((case when p.age is not null then p.age + 1 else null) is not null) &&
((case when p.age is not null then p.age + 1) > 18)
This is significantly more complex for a query analyzer to optimize and the query itself is much less readable (Btw, slick currently does something like this!).
Thus, in this particular query, using the old way (i.e. the way it is now) does not pose any risk of non-ANSI behavior in any database and as a user, in certain places I would definitely prefer it.
This is why I introduced the Unchecked*
methods.
So that our users can control what happens to their queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'd need only option.get
to achieve that, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would work but wouldn't users expect a NoSuchElementException
to be thrown in many of those cases? Say you have something like this:
case class Person(name:Option[String])
query[Person].map(p => Some(p.name.get + " M.D."))
Normally, if p.name
is None, we would expect a NoSuchElementException
to be thrown but in our case, the value returned from the query would be None. I think this is pretty inconsistent and might confuse users.
I think we forbade Option.get
for a good reason, it's too hard to guarentee consistency with the regular scala API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get
is as unsafe as the *Unsafe
methods, I'd prefer to have it instead of adding new methods to the DSL. Maybe we could call it unsafeGet
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, much better! I wonder if we should add it for now, though. I'd wait for users to push for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, well if we do it in the valueParser
like we do Some.apply
, Option.apply
etc... it's practically trivial. Just do:
case q"$o.orNull" if is[Option[Any]](o) => astParser(o)
We don't need additional AST elements to make it work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AST should match the original code as much as possible. It's the case that orNull
has this behavior for SQL but it might not be the case in other target languages. That's why we flatten the options in the sql module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. If really you want to go this direction, let's do it right. I suggest we replace this:
case q"scala.None" => NullValue
case q"scala.Option.empty[$t]" => NullValue
case q"scala.Some.apply[$t]($v)" => astParser(v)
case q"scala.Option.apply[$t]($v)" => astParser(v)
With this:
case q"scala.None" => OptionNone
case q"scala.Option.empty[$t]" => OptionNone // or maybe even have OptionEmpty?
case q"scala.Some.apply[$t]($v)" => OptionSome(astParser(v))
case q"scala.Option.apply[$t]($v)" => OptionApply(astParser(v))
case q"$o.orNull[$t]($v)" if is[Option[Any]](o) => OptionOrNull(astParser(o))
That way in the future, when we want a typed-AST, we have a decent chance of actually making optionals work properly. Wdyt?
I have an active issue for this actually: #1059. I think we should either solve this whole issue of option container-types here and knock out #1059, or we should just let orNull
be treated the same way as Some
and Option.apply
. I don't think we should just make a container for orNull
, that's a very disjointed approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shoot! orNull
does not work for Int
, Double
, etc... How about let's use orNullValue
in situations where orNull
won't work.
} | ||
} | ||
|
||
object Exist { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wdyt about renaming these to IsNotNullCheck
/isNullCheck
/etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NP. Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Come to think of it, how about just IsNotNull
/isNotNull
? Also Exist
would become IsNull
/isNull
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be confusing as well. If I see IsNull(ast)
, it gives me the impression that it checks if the ast is null or the AST contains a null literal constant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. The only other thing I can think of is IsNotNullPhrase
so it sounds more like a "kind of statement" instead of a "kind of action". Or do you like IsNotNullCheck
more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer IsNotNullCheck
but don't feel strongly about it. Few free to choose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel that strongly about it either. We can go with IsNotNullCheck
@@ -13,6 +13,13 @@ private[dsl] trait QueryDsl { | |||
@compileTimeOnly(NonQuotedException.message) | |||
def querySchema[T](entity: String, columns: (T => (Any, String))*): EntityQuery[T] = NonQuotedException() | |||
|
|||
implicit class NullableColumnExtensions[A](o: Option[A]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we keep these methods, they should be @compileTimeOnly
and throw NonQuotedException()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're all removed
import io.getquill.norm.BetaReduction | ||
|
||
object Implicits { | ||
implicit class AstOpsExt(body: Ast) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid these implicits and use the objects directly? I'm afraid people will see the method available and use it even though it doesn't work for all ast nodes. Also, we could restrict the object
s to support only BinaryOperation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we take them out, we would have statements like this:
apply(BinaryOperation((IsNotNullCheck(ast), BinaryOpertor.`||`, BinaryOperation((IsNotNullCheck(ast), BinaryOperator.`&&`, reduction)))): Ast)
That's really hard for me to parse unless I break it down into multiple lines and even then. At some point I might want to reduce some even more sophisticated simplification rules for null check statements which would make these even more complex.
I'm good with restricting to BinaryOperation
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm suggesting removing only the implicits and calling the objects in this file directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can certainly remove reduce
, that makes sense, especially because BetaReduction is a noun.
What about the +||+
etc.. methods though? Then it would turn into:
apply(`+||+`((IsNotNullCheck(ast), `+&&+`((IsNotNullCheck(ast), reduction)))): Ast)
That's still pretty confusing to me so I'd rather leave them. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, let's keep them 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't make +||+
and +&&+
limited to BinaryOperation
because of this:
case OptionExists(ast, alias, body) =>
val reduction = BetaReduction(body, alias -> ast)
apply((IsNotNullCheck(ast) +&&+ reduction): Ast)
Since the AST is untyped, there's no way to know that the AST inside OptionExists must yield a BinaryOperation
instance.
import io.getquill.ast._ | ||
import io.getquill.ast.Implicits._ | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for this documentation! :)
* rules as well. Note how we are force to null-check both `h.value` as well as `(h.value || 'foo')` because | ||
* a user may use `Option[T].flatMap` and explicitly transform a particular value to `null`. | ||
*/ | ||
object SimplifyNullChecks extends StatelessTransformer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a side note: we're fixing bugs and evolving the SQL generation in quill-sql
. Unfortunately quill-cassandra
doesn't share this code, so we probably should review the modules and try to share more code in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is in quill-core
I had in mind to add it to CqlNormalize
as well. Will do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! I assumed it was in quill-sql
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Added SimplifyNullChecks
to CqlNormalize
ast match { | ||
|
||
// Center rule | ||
case IfExist( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that these transformations could be implemented in a generic way using on If
nodes? We could move them to quill-core
if it's possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already in quill-core
:)
case q"$o.flatMap[$t]({($alias) => $body})" if is[Option[Any]](o) => | ||
OptionFlatMap(astParser(o), identParser(alias), astParser(body)) | ||
if (isOptionRowType(o) || isOptionEmbedded(o)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you clarify what this condition does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, if it's a row type (e.g. case class, tuple, or embedded), always use Unchecked since since you can't null-check these types in sql. I'll add a comment.
f4e561b
to
982bc47
Compare
* this is needed in order to be able to detect if there are conditionals inside of optional statements | ||
* since they could be nested in the body's subtree. | ||
*/ | ||
class SubtreeSearch(val state: Boolean) extends StatefulTransformer[Boolean] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use https://github.com/getquill/quill/blob/master/quill-core/src/main/scala/io/getquill/ast/CollectAst.scala and check the list size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, looks like I've re-implemented the wheel here. Will change.
a30814c
to
ea35c4b
Compare
Whoops, I forgot to change orNullValue to getOrNull. Will do that |
ea35c4b
to
880c270
Compare
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Feel free to merge once the build is fixed 👍
Thanks for addressing my concerns
a12613c
to
701d7ea
Compare
701d7ea
to
39539af
Compare
Merged. Added some documentation too. |
I occours to me that in cases where
|
Fixes for:
#1053
Required prior to:
#1295
Problem
As outlined in #1053, Quill's
Option[T].map
, andflatMap
operations are fundementally broken because they rely on ANSI null-fallthrough behavior (i.e.null || 'foo'
isnull
) which does not work for all cases e.g:will become:
Which will never be
'baz'
since whenstr
is null the conditional does not "fall through" and become null like the string concatonation calse.Oracle makes this problem from tolerable to intractable because it's concat operation is not ansi compliant. This is to say,
null || 'foo'
isfoo
as opposed tonull
. That means that if you translateInto:
Then
str
will never be the value'otherwise
' since in oraclenull || '-suffix'
becomes'-suffix'
. Unlike for conditionals where this kind of behavior is annoying but bearable, simple concatonation breaking fundemental assumptions ofmap
andflatMap
is too inconsistent to allow.These issues combined, force us to introduce explicit null checking into Quill SQL statements whnever
map
,flatMap
orexists
are used.Solution
Add conditional checking to
FlattenOptionOperation
. Also due to issues outlined in #1053, a new AST for table-specific optionalmap
andflatMap
has to be introduced. Also, aSimplifyNullChecks
normalization was introduced in order to reduce overly complex null checks that can occour.Checklist
README.md
if applicable[WIP]
to the pull request title if it's work in progresssbt scalariformFormat test:scalariformFormat
to make sure that the source files are formatted@getquill/maintainers