Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dotty with explicit nulls #6344

Closed
wants to merge 75 commits into from
Closed

Conversation

abeln
Copy link
Contributor

@abeln abeln commented Apr 18, 2019

This PR adds a -Yexplicit-nulls experimental flag which, if enabled, modifies the type system so that reference types are non-nullable. Nullability needs then to be expressed explicitly via unions (e.g. String|Null).

At a high level, the changes to the compiler are:

  1. a new type hierarchy for Null, as described
  2. a "translation layer" from Java types to Scala types, which tries to balance soundness and usability
  3. an implementation of flow-typing so we can support a more natural style of programming with explicit nulls (e.g. if (x != null && x.length < 10))

Internal doc: https://gist.github.com/abeln/573cd43ec062ca0ba2dbdae625d26d42
User-facing doc: https://gist.github.com/abeln/ca50cd5a3b94316c586db5c986049d2d

Copy link
Contributor

@odersky odersky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a very large delta, it would be good to have some overview document detailing what has changed and how everything hangs together. This could be either a comment on the PR itself, or a separate .md file in docs/docs/internal.

@abeln
Copy link
Contributor Author

abeln commented Apr 22, 2019

Can the design doc at https://github.com/abeln/explicit-null-docs/blob/master/design.md serve that purpose?

@odersky
Copy link
Contributor

odersky commented Apr 23, 2019

Can the design doc at https://github.com/abeln/explicit-null-docs/blob/master/design.md serve that purpose?

That would be a great start. What we should still do is:

  • include a variant of this as a doc page under docs/docs/reference. It should be in the same style as
    the other doc pages, i.e. oriented towards a user. As far as I can see, the current doc is already quite close to that.
  • include an implementation note somewhere that explains which parts of the compiler codebase are affected in what way. E.g. A typical question I would like to see answered by that note is: How is the change of the class hierarchy reflected in the compiler codebase? (I would suspect: changes in TypeComparer but it would be good to have that confirmed or else to know there was something else). That's just as an example.

Copy link
Contributor

@liufengyun liufengyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job 👍

I have the first bunch of small issues (typer and types changes are yet to be reviewed).

@abeln
Copy link
Contributor Author

abeln commented Apr 25, 2019

@odersky added the documentation. PTAL.

@abeln abeln force-pushed the dotty-explicit-nulls branch from 33d975d to 1694a4a Compare April 25, 2019 22:28
@abeln
Copy link
Contributor Author

abeln commented Apr 25, 2019

Rebased. Thanks for the comments. Keep 'em coming!

@liufengyun
Copy link
Contributor

test performance please

@dottybot
Copy link
Member

performance test scheduled: 2 job(s) in queue, 1 running.

@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/6344/ to see the changes.

Benchmarks is based on merging with master (1521576)

Copy link
Contributor

@liufengyun liufengyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, Look really good to me 🎉

compiler/src/dotty/tools/dotc/core/SymDenotations.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/core/Types.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/core/Types.scala Outdated Show resolved Hide resolved
compiler/src/dotty/tools/dotc/typer/Typer.scala Outdated Show resolved Hide resolved
case tree: ValDef if ctx.explicitNulls && ctx.owner.is(Method) && !tree.mods.is(Lazy | Implicit) =>
// Use a completer that completes itself with the completion text, so
// we can use flow typing for `ValDef`s.
new ValDefInBlockCompleter(tree)(cctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not for implicit and lazy ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment:
// Don't use this special kind of completer for lazy or implicit symbols,
// because those could be completed through a forward reference:
// e.g.
// val x: String|Null = ???
// y /* y is completed through a forward reference! */
// if (x == null) throw new NullPointerException()
// lazy val y: Int = x.length

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you know these are the only cases that could go wrong? I believe right fix is not to have a ValDefInBlockCompleter and to drop this part.

compiler/src/dotty/tools/dotc/core/Types.scala Outdated Show resolved Hide resolved
@abeln
Copy link
Contributor Author

abeln commented May 3, 2019

@liufengyun thanks for the comments. Will take a look today.

Copy link
Contributor

@abgruszecki abgruszecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the GADT perspective: the "special" TermRef approach in this PR works far better than what we currently get with GADT constraints, so we don't want to change anything in that regard.

In the future, instead of using GADT constraints for non-null types, we may actually want to create special TermRefs for terms affected by GADT constraints - among other things, this would likely fix the member lookup problem in #6323.

@abeln abeln force-pushed the dotty-explicit-nulls branch from 1694a4a to dc60abc Compare May 7, 2019 22:20
@abeln
Copy link
Contributor Author

abeln commented May 7, 2019

@odersky would you like to take a look at this (particularly the flow typing part)?

@odersky
Copy link
Contributor

odersky commented May 8, 2019

I'll get to this now.

Copy link
Contributor

@odersky odersky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far I reviewed only the architectural design, not so much the code in detail. What I saw was very nice, well explained and of high quality. I do have some feed back on the architecture, though:

The most important issues are:

  • There should be no RefEq trait
  • The approach to NonNullTermRefs and ValDefInBlockCompleter each violate hard design principles of the compiler and need to be rethought.

Details are in individual comments. I'll wait until these points are addressed and then give it another round.

compiler/src/dotty/tools/dotc/core/Definitions.scala Outdated Show resolved Hide resolved
docs/docs/reference/other-new-features/explicit-nulls.md Outdated Show resolved Hide resolved
docs/docs/internals/explicit-nulls.md Show resolved Hide resolved
docs/docs/internals/explicit-nulls.md Outdated Show resolved Hide resolved
final class NonNullTermRef(prefix: Type, designator: Designator) extends TermRef(prefix, designator) {
// There's nothing special about this class: it's just used as a marker to identify certain
// `TermRef`s in `computeDenot`.
}
Copy link
Contributor

@odersky odersky May 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I don't think this will work. Since TermRef is declared as cached, it has to satisfy that protocol. But NonNullTermRef is not cached. This will cause problems. Probably only in rare situations, but when they arise they will be impossibly hard to debug.

final class NonNullTermRef(prefix: Type, designator: Designator) extends TermRef(prefix, designator) {
// There's nothing special about this class: it's just used as a marker to identify certain
// `TermRef`s in `computeDenot`.
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to work on the denotation instead. Use a normal TermRef, but give it a special denotation that strips nulls.

docs/docs/internals/explicit-nulls.md Show resolved Hide resolved
compiler/src/dotty/tools/dotc/core/Contexts.scala Outdated Show resolved Hide resolved
docs/docs/internals/explicit-nulls.md Show resolved Hide resolved
docs/docs/internals/explicit-nulls.md Show resolved Hide resolved
@abeln
Copy link
Contributor Author

abeln commented May 9, 2019

@odersky I'm working through the requested changes, but I have a question about NonNullTermRef. You said

In fact, I don't think this will work. Since TermRef is declared as cached, it has to satisfy that protocol. But NonNullTermRef is not cached. This will cause problems. Probably only in rare situations, but when they arise they will be impossibly hard to debug

It might be better to work on the denotation instead. Use a normal TermRef, but give it a special denotation that strips nulls.

I tried using a normal TermRef and changing the denotation, but ran into problems with the following example:

class Test {
  def foo(): Unit = {
    val x: String|Null = ???
    if (x != null) {
      val y = x.length
    } else {
      val y = x
    }
}

The problem is that since TermRef is cached, both x.length and x reuse the TermRef for x. However, the denotations are different:String vs String|Null, respectively. That means that when val y = x is typed, the denotation for x gets set to String. This in turn triggers a -Ychecks assertion failure while re-typing val y = x.length:

  1 } of class class dotty.tools.dotc.ast.Trees$PackageDef # 1115
  2 *** error while checking kk.scala after phase frontend ***
  3 exception occurred while compiling kk.scala
  4 java.lang.AssertionError: assertion failed: symbols differ for x.length
  5 was                 : method length
  6 alternatives by type:  of types
  7 qualifier type      : (String | Null)(x)
  8 tree type           : ((): Int)(x.length) of class class dotty.tools.dotc.core.Types$CachedTermRef while     compiling kk.scala
  9 Exception in thread "main" java.lang.AssertionError: assertion failed: symbols differ for x.length
 10 was                 : method length
 11 alternatives by type:  of types
 12 qualifier type      : (String | Null)(x)
 13 tree type           : ((): Int)(x.length) of class class dotty.tools.dotc.core.Types$CachedTermRef
 14         at dotty.DottyPredef$.assertFail(DottyPredef.scala:16)
 15         at dotty.tools.dotc.transform.TreeChecker$Checker.typedSelect(TreeChecker.scala:331)
 16         at dotty.tools.dotc.typer.Typer.typedNamed$1(Typer.scala:2121)

More generally, there seems to be the assumption in Context that TermRefs can be cached based just on their name. This seems to be no longer true once we have flow-typing, so it seems to me like we have two options:

  1. modify the cache so that TermRefs are cached based on (prefix, name, isNull) instead of just (prefix, name). This keeps the invariant that all TermRefs are cached, but there is perhaps a performance penalty by caching the extra bit.
  2. we can try to follow the current approach and not cache these "special" non-null TermsRefs. This would require re-working the current implementation so that not all NamedTypes are cached.

I think 1) would probably be the less-invasive option, and it's nice to keep the invariant that all TermRefs are cached, but I don't have a good sense for the performance implications.

Any thoughts?

@odersky
Copy link
Contributor

odersky commented May 11, 2019

@abeln I think your reasons for not using Denotation make sense. So it seems have to come back to using NonNullTermRef. But that needs to be cached then.

Some info in caching:

  • there are three hash tables, one for NamedTypes, one for AppliedTypes, and one for everything else. This is done for performance, so that we can specialize the treatment of NamedTypes and Applied Types, which are more common than the others.
  • I believe NonNullTermRefs could actually be hashed in the "others" table. That would keep the treatment of normal termrefs as streamlined as possible. So you should check out the way, say, ThisTypes are cached, and imitate that, taking care we do get overriding problems with the treatment of normal TermRefs.

@abeln
Copy link
Contributor Author

abeln commented May 24, 2019

A (non) update on this. I'm still looking into how to cache NonNullTermRefs (as well as at Martin's other comments). I'm away on vacation for next week, and will get back to this when I'm back.

@odersky
Copy link
Contributor

odersky commented Jun 4, 2019

@abeln OK, we can wait a bit. The plan is to announce at Scala Days that we have fixed our "feature envelope" for 3.0. This is not yet a feature freeze, so some changes are still possible. But new features other than the ones we have now will have to wait for later versions. We can include explicit nulls provisionally in that envelope. They'd need to be fully worked out and tested by the time we go into real feature freeze later this year.

@abeln abeln force-pushed the dotty-explicit-nulls branch 3 times, most recently from 241f8d7 to 8fc0113 Compare June 15, 2019 01:19
@odersky
Copy link
Contributor

odersky commented Nov 12, 2019

Hi @abeln @noti0na1 @olhotak

#7520 is bascically done now. It implements flow analysis for nullability that works for vals as well as vars. The results of the analysis are contained in the notNullInfos member of Context. To find out whether a path or var reference TermRef p is known to be not null, use:

   ctx.notNullInfos.containsRef(p)

There are also some new developments which should simplify the treatment afterwards. There's a new trait NotNull, which is a supertrait of AnyVar and AnyRef. NotNull is treated specially in TypeComparer in that we make use of the fact that Null & NotNull = Nothing. That knowledge is exploited for subtype checks and glb operations. In particular, we can now define
a method

  def notNull(x: Any): x.type & NotNull = 
    assert(x != null)
    x

(right now we need a cast at the end, but this will go away one we have integrated flow analysis with the rest of explicit nulls). Using the method, we use normal typing to determine that if e: T | Null then notNull(e) has type T, for any T.

This means we don't need NotNullTermRef anymore. NotNullTermRef is problematic since it's
not clear we can re-do not null typing in later phases where we do not have the flow information available (and we need to do that to pass -Ycheck tests). I believe it's better to leave a trace in the code that explicitly asserts that a type is not null, backed by what we detected from flow analysis. The notNull method above fits the bill for this: Whenever we have a path or variable reference p that is known by flow analysis to be not null, we should replace it with notNull(p). That way we have left explicit evidence about the not-nullness of p.

So, I propose we rework the present PR to go with #7520. In particular

  • add an "official" notNull function somewhere, either in the library or as a special method in Definitions.
  • Maybe calls to notNull should be marked as pure expressions
  • implement the scheme to wrap references that are known to be null in notNull calls.
  • Drop `NotNullTermRef; it will be no longer needed.

WDYT?

@abeln
Copy link
Contributor Author

abeln commented Nov 12, 2019

@odersky this sounds great. We had trouble with NonNullTermRef before in the context of recovering trees from TASTy, because we could no longer recover the types specialized by flow typing (because the flow typing doesn't run when deserialization happens).

Does notNull interact well with path-dependent types, though?
e.g.

class Foo {
  type T = String
}

val x: Foo|Null = ???
if (x != null) {
  // x: String
  val y: x.T = "hello" // this should work, right?
}

However, if we wrap non null TermRefs in calls to notNull, then the above will break

val x: Foo|Null = ???
if (x != null) {
  val y: notNull(x).T = "hello" // error: notNull(x) is not stable
}

Also, what about interactions between flow typing and implicits?

def foo(implicit s: String) = ???

implicit val x: String|Null = ???
if (x != null) {
  foo
}

It seems like the call to foo will fail because we don't have any implicit definitions with type String (since x still has type String|Null).

I'll send you my comments for #7520 today, but the code is very nice and I'm glad you got the vars working.

@abeln
Copy link
Contributor Author

abeln commented Nov 12, 2019

@odersky we created #7546 which is a copy of this PR minus the flow typing part. That one should be reviewable while we figure out flow typing.

@olhotak
Copy link
Contributor

olhotak commented Nov 12, 2019

I like the code in #7520 and I agree that we should rebase the explicit nulls PR to use it. I read the code and did not have any low level comments, but I think @abeln has some.

I really like the idea of the notNull method, or more generally of making the non-nullness information explicit in the tree to avoid the need for a NotNullTermRef.

Is the eventual plan to remove or inline the notNull calls in a later pass? If so, do we need to prevent user code from explicitly calling notNull (since that would be unsound if the assertion gets removed)? Also, if we eventually remove the notNull calls, what are the advantages and disadvantages of using a call to notNull as the marker as compared to using a cast asInstanceOf[x.type & NotNull] as the marker?

@olhotak
Copy link
Contributor

olhotak commented Nov 12, 2019

@abeln inside a path-dependent type, we don't need the non-nullness information, do we? So can't we just not wrap the x with notNull when it's the prefix of a type?

Regarding implicits, are they not rewritten to explicit parameter passing in the typer? So that code would need to look in the context for the nullness information and rewrite the call foo to foo(notNull(x)) (instead of rewriting it to foo(x)).

@abeln
Copy link
Contributor Author

abeln commented Nov 12, 2019

@abeln inside a path-dependent type, we don't need the non-nullness information, do we? So can't we just not wrap the x with notNull when it's the prefix of a type?

If we don't wrap the x then x.T will fail to typecheck, because x has type Foo|Null, which does not have a type member T.

Regarding implicits, are they not rewritten to explicit parameter passing in the typer? So that code would need to look in the context for the nullness information and rewrite the call foo to foo(notNull(x)) (instead of rewriting it to foo(x)).

Ok, so the implicit resolution logic will have to take the new non-null information into account.

@Atry
Copy link
Contributor

Atry commented Nov 13, 2019

I wonder if we can make this PR an opt-out feature. A developer must import scala.language.uncheckedNull to suppress the checking (all A <: AnyRef types are considered as A | Null). The ability of per-file opt-out will help people to migrate from existing code fluently.

@abeln
Copy link
Contributor Author

abeln commented Nov 13, 2019

@Atry right now the changes are opt-in (you have to set '-Yexplicit-nulls' for the type system changes to kick in). Once we're more confident that things are working well (better testing, the compiler bootstraps with explicit nulls, etc), I imagine we'd turn it on by default (in which case we'd probably want opt-outs like you mention).

@Atry
Copy link
Contributor

Atry commented Nov 13, 2019 via email

@sjrd
Copy link
Member

sjrd commented Nov 13, 2019

You can't make this per file, because it changes subtyping, and subtyping is a global function, with caches, inter-file dependencies, etc. So unfortunately that's not going to happen, however desirable it can be.

@sjrd
Copy link
Member

sjrd commented Nov 13, 2019

An alternative definition of notNul that does not require the introduction of a NotNull trait:

def notNull[A](x: A | Null): x.type & A = {
  assert(x != null)
  x.asInstanceOf[x.type & A]
}

Besides not requiring NotNull, it can generalize to flow typing for other things than Null.

Really the introduction of NotNull is a huge smell to me. It means Null will forever be special in the language, whereas the whole point of this feature is to make it like all the other types!

@Atry
Copy link
Contributor

Atry commented Nov 13, 2019

You can't make this per file, because it changes subtyping, and subtyping is a global function, with caches, inter-file dependencies, etc. So unfortunately that's not going to happen, however desirable it can be.

We can use the technology similar to Java Interop. When type-checking an explicitly-declared type tree or a new expression tree inside a scope with import scala.language.uncheckedNull, a | Null is appended to the type if it is a subtype of AnyRef. We don't have to patch all other trees.

// LegacyClass.scala
import scala.language.uncheckedNull
case class LegacyClass()
object LegacyClass {
  // Inferred as LegacyClass | Null
  def newLegacyClass = new LegacyClass()

  // Inferred as LegacyClass | Null
  def legacyClass = LegacyClass()

  // Inferred as NullSafeClass | Null
  def newNullSafeClass = new NullSafeClass()

  // Inferred as NullSafeClass
  def nullSafeClass = NullSafeClass()

  // Patched as NullSafeClass | Null
  def explicitlyTypedNullSafeClass: NullSafeClass = NullSafeClass()

  // compiles because the type is patched as ((LegacyClass | Null) <:< Null) | Null
  implicitly[LegacyClass <:< Null]

  // compiles because the type is patched as ((NullSafeClass | Null) <:< Null) | Null
  implicitly[NullSafeClass <:< Null]

}
// NullSafeClass.scala
case class NullSafeClass()
object NullSafeClass {

  // Inferred as LegacyClass
  def newLegacyClass = new LegacyClass()

  // Inferred as LegacyClass | JavaNull
  def legacyClass = LegacyClass()

  // Does not compile
  // def legacyClass: LegacyClass = LegacyClass()


  // Inferred as NullSafeClass
  def newNullSafeClass = new NullSafeClass()


  // Inferred as NullSafeClass
  def nullSafeClass = NullSafeClass()


  // Does not compile
  // implicitly[LegacyClass <:< Null]


  // Does not compile
  // implicitly[NullSafeClass <:< Null]

}

@Atry
Copy link
Contributor

Atry commented Nov 13, 2019

Or we can create a deprecated implicit conversion for migration.

@deprecated(since="3.0", message="Unchecked null")
implicit def notNull[A](x: A | Null): x.type & A = {
  x.asInstanceOf[x.type & A]
}

@odersky
Copy link
Contributor

odersky commented Nov 13, 2019

I really like the idea of the notNull method, or more generally of making the non-nullness information explicit in the tree to avoid the need for a NotNullTermRef.

Is the eventual plan to remove or inline the notNull calls in a later pass? If so, do we need to prevent user code from explicitly calling notNull (since that would be unsound if the assertion gets removed)? Also, if we eventually remove the notNull calls, what are the advantages and disadvantages of using a call to notNull as the marker as compared to using a cast asInstanceOf[x.type & NotNull] as the marker?

Here's a refinement that also addresses the issue @abeln had with path types (I agree this is a real problem): Make notNull a value member of Any. I.e. like this:

  class Any
    final val $nn: this.type & NotNull

This means that x.T can be expanded to x.$nn.T because x.$nn is still a path. $nn selections are removed in Erasure.

At first, I was tempted to merge $nn and nn, but that would not work. nn is not a pure method, since it can throw an NPE. But one can define nn like this:

  def (x: Any) nn: x.type & NotNull = 
    assert(x != null)
    x.$nn

@srjd: In this design Null and NotNull are indeed special because we know that NotNull & Null = Nothing, and because we track not-nullness with flow typing. Tracking other types as well this way would be a big design change. But even if we would do that at some point in the future, we'd always need the notion of a complement to a tracked type. Using intersection with the complement is the most natural way to retract a union with the original tracked type.

One could discuss whether we should make NotNull part of the class hierarchy or treat it as an unrelated magical trait that gets treated specially in subtyping. Adding it to the class hierarchy is simpler. The subtype checker is already complicated, we should not add more special rules to it without having a good reason. On the other hand,this design is less easy to extend to new tracked types.

@odersky
Copy link
Contributor

odersky commented Nov 13, 2019

@sjrd A systematic way to generalize what we do for Null would be to put the notion of a type complement not[T] into the type system. But that would be a rather drastic change and it looks like it would need a lot of research on foundations. Has something like this been already studied?

@sjrd
Copy link
Member

sjrd commented Nov 13, 2019

You don't need a full generalization to not[T]. The only useful thing that NotNull adds in TypeComparer is that it can simplify Null & NotNull to Nothing. That assumes that something, at some point, from a type T, creates a T & NotNull. That something is compiler code, not user code. I can't figure out where that is done in the current PRs, but I assume it looks like

def notNull(tpe: Type): Type =
   tpe & TypeRef(NotNullClass)

The only reason this is useful is because TypeComparer will simplify Null & NotNull, and that really only happens at the top-level (i.e., T =:= Null) or in the branches of a union type (i.e., T is of the form U | Null).

Instead of inventing NotNull and then simplifying Null & NotNull in TypeComparer, you can simply prune away the Nulls right inside def notNull:

def notNull(tpe: Type): Type = tpe match {
  case _ if tpe.isRef(NullClass) => NothingClass
  case UnionType(tp1, tp2) => notNull(tp1) | notNull(tp2)
  case _ => tpe
}

This is not technically as precise as T & NotNull, but it is so for all practical purposes (i.e., for all subsequent typechecking decisions that happen in a reasonable program, and in particular refining A | Null to A in the else branch of an if (x == null)).

And that strategy you can actually generalize to arbitrary Us. You can define a function typeSubtract that removes Us from the branches of a T:

def typeSubtract(tp1: Type, tp2: Type): Type = tp1 match {
  case _ if tp1 <:< tp2 => NothingClass
  case UnionType(tp11, tp12) => typeSubtract(tp11, tp2) | typeSubtract(tp12, tp2)
  case _ => tp1
}

For the purposes of flow typing, this is really all you need to write useful programs. For a branch of the form

val x: T = ...
if (x.isInstanceOf[U]) {
  // here x: U
} else {
  // here x: typeSubtract(T, U)
}

When T is of the form A | U (in particular when U is Null), you can successfully assign

val y: A = x

in the else branch.

This design will be significantly simpler than what you are proposing with NotNull. Imagine removing all the code that #7520 adds just to handle NotNull, and instead add the def typeSubtract (or even just def notNull) from above.

@abeln
Copy link
Contributor Author

abeln commented Nov 13, 2019

@sjrd we already have a method like notNull, called stripNull: https://github.com/lampepfl/dotty/pull/6344/files#diff-1eb16c70e7f38d8e52d516b5e1151a56R99

It seems that if we only have notNull the method and not NotNull the type, then we can't handle the combination of flow typing and path-dependent types.

The example I gave above

class Foo {
  type T = String
}

val x: Foo|Null = ???
if (x != null) {
  // x: String
  val y: x.T = "hello" // this should work, right?
}

The solution that Martin suggested

 class Any
    final val $nn: this.type & NotNull

and then we expand x.T to x.$nn.T.

This requires having a NotNull type.
Maybe this is not a common-enough example to tilt the scales, but we should explicitly say what we'll do in this case, I think.

@abeln
Copy link
Contributor Author

abeln commented Nov 13, 2019

Also notice that @odersky's notNull is a regular library method:

  def notNull(x: Any): x.type & NotNull = 
    assert(x != null)
    x

while @sjrd's notNull (which is like our stripNull) is a method in the compiler:

def notNull(tpe: Type): Type = tpe match {
  case _ if tpe.isRef(NullClass) => NothingClass
  case UnionType(tp1, tp2) => notNull(tp1) | notNull(tp2)
  case _ => tpe
}

I suggest we use different names; otherwise it's gonna get confusing :)

You don't need a full generalization to not[T]. The only useful thing that NotNull adds in TypeComparer is that it can simplify Null & NotNull to Nothing. That assumes that something, at some point, from a type T, creates a T & NotNull. That something is compiler code, not user code. I can't figure out where that is done in the current PRs, but I assume it looks like

We are not adding the calls to (the user level) notNull yet. The current implementation of flow typing that we (Waterloo) have, uses a combination of stripNull plus a special type of TermRef that's always guaranteed to be non-nullable. Martin's suggestion was that replace the special TermRef by calls to (user-level) notNull inserted in user code by the compiler (when flow typing says it's safe to do so).

@sjrd
Copy link
Member

sjrd commented Nov 13, 2019

OK I'm going to use stripNull to refer to the compiler function that works on types, and notNull for the library function.

Using my version of the library notNull that I wrote somewhere:

def notNull[A](x: A | Null): x.type & A = {
  assert(x != null)
  x.asInstanceOf[x.type & A] // safe
}

then you have compiler code that can insert calls to notNull, by instantiating A to stripNull(tpe) where tpe is the known type of x. So for example if the type of x is B | Null, you would insert notNull[B](x).

@sjrd
Copy link
Member

sjrd commented Nov 13, 2019

The example with the path-dependent type that should be valid under flow typing is not supported by my idea, but you can make it work by writing the user program a bit differently:

class Foo {
  type T = String
}

val x: Foo|Null = ???
if (x != null) {
  val x2: x.type & Foo = x
  val y: x2.T = "hello" // this now works
}

I'd like to see real world examples of path-dependent types used under != null guards if that is the argument why NotNull should be needed. My gut feeling is that this is exceedingly rare, and in those rare instances you can rewrite as shown above.

@abeln
Copy link
Contributor Author

abeln commented Nov 13, 2019

It's nice that stripNull could give us the type argument that we'd pass to notNull. I hadn't thought of that.

What would we do with path-dependent types; i.e. #6344 (comment) ?

EDIT: I see your comment now :)

@sjrd
Copy link
Member

sjrd commented Nov 13, 2019

What would we do with path-dependent types; i.e. #6344 (comment) ?

Cross-post. See my comment above yours ;)

@abeln
Copy link
Contributor Author

abeln commented Nov 13, 2019

I'd like to see real world examples of path-dependent types used under != null guards if that is the argument why NotNull should be needed. My gut feeling is that this is exceedingly rare, and in those rare instances you can rewrite as shown above.

I agree that this seems rare. In fact, in our experiments so far, any use of flow typing seems relatively rate (we'll have results to show soon). Since there is a workaround, I would be ok with implementing the scheme without theNotNull type.

@odersky
Copy link
Contributor

odersky commented Nov 14, 2019

I think just because something is not yet used in our examples does not mean we can ignore it. We have just scratched the surface with examples; there will be many more use cases to come.

I believe the rule in the spec should be as follows:

If the compiler typechecks a path or variable reference that is known by flow typing to be not null it will append a .$nn. $nn is a method in Any with signature

  @stable def $nn: this.type & NotNull

This is at the same time elegant and powerful. I would not like to have to put in the spec that we do this only for expressions and for types or patterns you have to rewrite the program.

Having NotNull is also important for type propagation:

  def f[T](xs: Set[T]): T & NotNull = 
    val x = xs.head
    assert(x != null)
    x    // works for Set[String] as well as Set[String | Null]

and for typing a Java-like Map get:

  def [K, V <: NotNull](m: Map[K, V]) get (k: K): V | Null

In short, NotNull seems to lead to both a clean spec and more expressiveness. I do not think we should drop it.

Another way to put it is that, if we have a flow analysis that asserts a property X (in this case, being NotNull) then it is natural that X should itself be expressible as a type. Yes, maybe you can work around many of the cases, but why not tackle it directly? And how can we convince ourselves that the workarounds are sufficiently general?

@odersky
Copy link
Contributor

odersky commented Nov 14, 2019

I agree that this seems rare. In fact, in our experiments so far, any use of flow typing seems relatively rate (we'll have results to show soon).

Does that also hold for flow typing for variables? I would expect flow typing for vals to be rare but flow typing for vars to be quite common.

@odersky
Copy link
Contributor

odersky commented Nov 15, 2019

This PR is now superseded by #7546 and #7556, so it can be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.