Overconstrained GADT bounds in presence of bad bounds #11545

mario-bucev · 2021-02-26T13:46:43Z

Compiler version

3.0.0-RC1 (Scastie)

Minimized code

@main def test: Unit = {
  trait S[A]
  trait Inv[A]

  class P[X] extends S[Inv[X] & Inv[String]]

  def patmat[A, Y](s: S[Inv[A] & Y]): A = s match {
    case p: P[x] =>
      "Hello"
  }
  
  val got: Int = patmat[Int, Inv[String]](new P) // ClassCastException: String cannot be cast to Integer
}

Output

Click to expand

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
	at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:99)
	at main$package$.test(main.scala:11)
	at test.main(main.scala:1)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at sbt.Run.invokeMain(Run.scala:115)
	at sbt.Run.execute$1(Run.scala:79)
	at sbt.Run.$anonfun$runWithLoader$4(Run.scala:92)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at sbt.util.InterfaceUtil$$anon$1.get(InterfaceUtil.scala:10)
	at sbt.TrapExit$App.run(TrapExit.scala:257)
	at java.lang.Thread.run(Thread.java:748)

Expectation

The code should be rejected, but it is unclear what part is problematic:

The instantiation of new P due to introducing the bad bounds S[Inv[Int] & Inv[String]]
The class declaration of P with the extends S[Inv[X] & Inv[String]] clause, which may lead to bad bounds if X != String
Deducing that x = A = String

I think that the problem comes from 1 (and not necessarily from 3). I believe, however, that this is not sufficient, as it seems that bad bounds can still be crafted.

Translating the above into pDOT shows that we can allow 3. if we require a witness value for Inv[A] & Y. Then, the above would not cause a ClassCastException since providing a value for Inv[Int] & Inv[String] is not possible.

The text was updated successfully, but these errors were encountered:

LPTK · 2021-02-27T01:53:55Z

The culprit is clearly 3. In Scala/pDOT, you can't escape type members with members that themselves end up having bad bounds (as you say, bad bounds can still be crafted). So we have to live with bad bounds.

In the pattern match, we know Inv[x] & Inv[String] =:= Inv[A] & Y. It is utterly wrong to decompose that constraint into Inv[x] =:= Inv[A] and Inv[String] =:= Y, which is what Dotty seems to be doing. The latter implies the former, so it's okay for approximating type inference, but the former does not imply the latter.

In DOT-like terms, we have Inv[x | String .. x & String] =:= Inv[A] & Y.
So we do have Inv[x | String .. x & String] <: Inv[A] and thus A <: x | String and x & String <: A.
We also have Inv[A] & Y <: Inv[x | String .. x & String], from which we can't deduce anything (think that Y could be bottom).
None of these implies String <: A, which is needed to make the pattern match type check.

LPTK · 2021-03-01T03:23:58Z

@liufengyun the issue is not truly solved, as you can see in this slight variaton:

class test0 {
  type AA
  trait S[A]
  trait Inv[A]

  class P[X] extends S[Inv[X] & AA]

}

@main def test: Unit =
  new test0:
    type AA = Inv[String]
    
    def patmat[A, Y](s: S[Inv[A] & Y]): A = s match {
      case p: P[x] =>
        "Hello"
    }
    
    val got: Int = patmat[Int, AA](new P) // ClassCastException: String cannot be cast to Integer

Should it be reopened?

I'm aware that @odersky said in #11549 there were still holes to be fixed by the community, but this particular problem is never going to be fixed completely. The theory of DOT shows we can't avoid bad bounds, so we have to be sound in their presence, and GADT reasoning is currently not.

liufengyun · 2021-03-01T08:25:46Z

@LPTK In the example, it seems to me type inference (GADT reasoning) is doing the right thing:

P[x] <:< S[Inv[A] & Y]
=> S[Inv[x] & String] <: S[Inv[A] & Y]
=> Inv[x] & Inv[String] =:= Inv[A] & Y
=> Inv[A] <:< (Inv[x] & Inv[String])
=> Inv[A] <:< Inv[x] and Inv[A] <:< Inv[String]
=> x =:= A =:= String

If I understand correctly, handling bad bounds in type inference is more convoluted and has performance penalties.

Another possibility is to add the inferred GADT bounds that are valid to the method type parameters (if they are valid in the method scope). This needs to propagate F-bounds outward and do the join. I'm not sure how practical it is.

/cc: @abgruszecki

LPTK · 2021-03-01T09:11:56Z

@liufengyun

...
=> Inv[x] & Inv[String] =:= Inv[A] & Y
=> Inv[A] <:< (Inv[x] & Inv[String])
...

How do you justify that step? Similar to what I've said above, Inv[A] & Y <: Inv[x] & Inv[String] does not imply Inv[A] <: Inv[x] & Inv[String]. You can't just remove an intersected term on the left.

This is not about handling bad bounds in type inference, this is about the soundness of type checking GADTs. I wonder why you think the compiler is doing the right thing, despite the fact that executing the program leads to a ClassCastException 🙂

liufengyun · 2021-03-01T09:33:21Z

How do you justify that step? Similar to what I've said above, Inv[A] & Y <: Inv[x] & Inv[String] does not imply Inv[A] <: Inv[x] & Inv[String]. You can't just remove an intersected term on the left.

Indeed it does not hold generally. I read your previous comment but mistakenly thought there's a way out that only resorts to subtyping.

Nevertheless, given that Inv is invariant, the step seems to be the only possibility to make =:= hold, thus a valid reasoning step for this particular case.

LPTK · 2021-03-01T09:39:49Z

given that Inv is invariant, the step seems to be the only possibility to make =:= hold

That's not true. Again, referring to my message above, taking Y = Nothing is enough to make Inv[A] & Y <: Inv[x] & Inv[String] hold. If you want to justify a subtyping relationship, you have to make it according to the rules of the type system!

PS: To specifically answer your argument about =:= (and not just <:<) – there is no rule AFAIK to say that Inv[x] & Inv[String] =:= Inv[A] & Nothing does not hold. In fact, if x and String happen to be different, then Inv[x] & Inv[String] is indeed functionally equivalent to Nothing.

liufengyun · 2021-03-01T10:39:46Z

there is no rule AFAIK to say that Inv[x] & Inv[String] =:= Inv[A] & Nothing does not hold.

If there is no rule says that they hold, they don't hold ;-)

LPTK · 2021-03-01T11:10:16Z

You have it backwards. The burden of proof is on you to show that it cannot possibly hold, otherwise this GADT reasoning is unsound, as we can see from the ClassCastException this program raises.

abgruszecki · 2021-03-12T10:57:06Z

I'm reopening this. @LPTK correctly observed that we're still unsound. The correct solution seems to be to check whether we're in GADT constraint inference mode before (approximately) branching on &.

odersky · 2021-04-06T07:40:34Z

The theory of DOT shows we can't avoid bad bounds, so we have to be sound in their presence, and GADT reasoning is currently not.

That's not quite true. DOT says that good bounds are only guaranteed for members of values. It puts the burden of guaranteeing good bounds on when we construct values. So that's what goes wrong here. Type parameters of classes are treated like member types. Non-variant type parameters are treated like type aliases and we need to have guarantee good bounds for them.

Meanwhile #11995 shows that my fix caused important code out there to fail. So that does not seem to work either.

It looks like an important and urgent problem and it would be good if the people discussing here could invest themselves. How can we let real world code compile and still ensure soundness? If you have some time any contribution to this issue and #11995 would be much appreciated.

LPTK · 2021-04-06T11:45:41Z

That's not quite true. DOT says that good bounds are only guaranteed for members of values.

Yes, of course. I meant that we can't avoid bad bounds in general, more specifically in the members of types for which we don't have values around. My point stands that the problem is never going to be solved by filling holes which are only special cases of a fundamentally unavoidable property.

How can we let real world code compile and still ensure soundness?

By removing the ad-hoc "fixes" and confronting the reality that we can't infer String <: A from Inv[x] & Inv[String] =:= Inv[A] & Y, because that's just not sound. The constraint solver used for GADT reasoning needs to be fixed so it always remains conservative in the conclusions it draws from inferred constraints. Unlike @abgruszecki and the students he supervises, I am not (and have no plans to become) familiar with that part of the code at all, so the several people currently working on it should take it up themselves.

mario-bucev · 2021-04-07T10:01:17Z

Correct me if I'm wrong, but I think that this issue should not be closed by PR #12000 (GitHub seems to have linked it because of the sentence "need to check to fix #11545")

abgruszecki · 2021-04-07T10:02:35Z

This was closed accidentally.

Looking at the issue, I think what is happening is that we take a wrong step in TypeComparer. I'll have to investigate where exactly. I'd guess that we do something that is correct as far as type inference goes, but gives wrong results for inferring GADT constraints. Specifically, the GADT logic should run the following type comparison : Inv[A] & Y =:= Inv[X] & Inv[String], and we simply shouldn't infer from this comparison that A = X = String.

odersky · 2021-04-14T08:33:42Z

I think whatever we do, this should not be part of 3.0.0. We had two previous soundness patches that we had to fix or revert for RC3. Restricting the range of programs that are compiled needs to be well considered, and we need time for the community to experiment.

LPTK · 2021-04-14T10:27:46Z

I don't think this is a language design issue that the community needs to experiment with. Instead, this is a clear-cut constraint solving bug. Here is what I gather the issue is:

Some constraints can't be solved directly, and need to be approximated.
To approximate a constraint C0 in type inference, we need to find a constrain C1 that implies C0. It's fine if we pick something too strong, as that just means we will infer a type that is not as general.
To approximate a constraint C0 in GADT reasoning, we need to find a constrain C1 that is implied by C0. It's fine if we pick a constraint that is too weak, as that just means we will not leverage all the subtyping information present in the currently considered pattern match.

What Dotty currently does is to approximate a GADT constraint in the wrong direction, as if it was a type inference constraint.

Looking at @abgruszecki's PR, it is clear that the culprit is the use of baseType on the LHS of a constraint: baseType returns one possible base type of a given type, so that for instance Inv[A] & B <: C is approximated to Inv[A] <: C, because Inv[A] is considered a base type of Inv[A] & B.
In essence, the reasoning reduces the constraint (C0) Inv[A] & B <: C to just (C1) Inv[A] <: C, which is perfectly sound for type inference, but as we've seen is not sound for GADT reasoning: while (C1) implies (C0), the reverse is not true! For instance, if B is later instantiated to Nothing, it is clear that (C0)+(B := Nothing), i.e., Nothing <: C, does NOT imply (C1).

LPTK · 2021-04-14T10:39:57Z

As a meta-remark: I'm very suspicious about the use of the type comparer to deal with GADT constraints. Every place where constraints are approximated during type comparison is a potential soundness hole, unless it has specific logic to avoid the approximation or do it in the other direction in case we are in GADT mode. I'm pretty sure everything would be a lot simpler and, more importantly, less bug-prone, if the GADT constraint solving algorithm was completely separate. In particular, there are many complexities and heuristics needed for type inference that are unnecessary or plain unsound when used for GADT reasoning.

smarter · 2021-04-14T10:44:06Z

unless it has specific logic to avoid the approximation or do it in the other direction in case we are in GADT mode

We do have that logic, isSubType has three parameters: tp1: Type, tp2: Type, a: ApproxState, the ApproxState is set when we do an approximation and the GADT logic checks that. Of course it's hard to guarantee that this is always done correctly.

dwijnand · 2021-09-10T11:41:31Z

the ApproxState is set when we do an approximation and the GADT logic checks that

Also Mode.GadtConstraintInference and TypeComparer#useNecessaryEither, at least, I think.

dwijnand · 2021-09-10T17:04:06Z

There's also

  def testSubType(tp1: Type, tp2: Type): CompareResult =
    GADTused = false
    if !topLevelSubType(tp1, tp2) then CompareResult.Fail
    else if GADTused then CompareResult.OKwithGADTUsed
    else CompareResult.OK

which seems excitedly on point for what one might want in order to use the subtyping algorithm, biased towards GADT reasoning (testSubType(..) != CompareResult.Fail) but, at least for the usage I was trying to make of it, it gave the wrong answer. 😞

mario-bucev added the itype:bug label Feb 26, 2021

liufengyun linked a pull request Feb 28, 2021 that will close this issue

Recursively check nonvariant arguments of base types for realizability #11549

Merged

liufengyun closed this as completed in #11549 Feb 28, 2021

abgruszecki reopened this Mar 12, 2021

abgruszecki self-assigned this Mar 12, 2021

odersky mentioned this issue Apr 6, 2021

Revert "Recursively check nonvariant arguments of base types for realizability" #12000

Merged

abgruszecki closed this as completed in #12000 Apr 7, 2021

abgruszecki reopened this Apr 7, 2021

abgruszecki added this to the 3.0.0 milestone Apr 12, 2021

abgruszecki linked a pull request Apr 13, 2021 that will close this issue

Fix i11545 #12087

Closed

odersky removed this from the 3.0.0-RC3 milestone Apr 14, 2021

LPTK mentioned this issue Jul 15, 2021

Unsoundness in GADT pattern matching (falsely assumed injectivity) #13080

Open

OlivierBlanvillain added the area:gadt label Sep 23, 2021

LPTK mentioned this issue Sep 27, 2021

Adding full support of path-dependent GADT reasoning #13475

Closed

smarter added the unsoundness label Oct 4, 2021

abgruszecki added itype:soundness Soundness bug (it lets us compile code that crashes at runtime with a ClassCastException) and removed unsoundness labels Oct 4, 2021

LPTK mentioned this issue Apr 21, 2022

Avoid unsound GADT constraints derived from subtyping relations between singletons and refined types #14728

Merged

Linyxus mentioned this issue May 12, 2022

Try either when compare an intersection against an applied type in GADT mode #15175

Closed

Kordyjan modified the milestones: 3.0.1, 3.0.0 Aug 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Overconstrained GADT bounds in presence of bad bounds #11545

Overconstrained GADT bounds in presence of bad bounds #11545

mario-bucev commented Feb 26, 2021 •

edited

Loading

LPTK commented Feb 27, 2021

LPTK commented Mar 1, 2021

liufengyun commented Mar 1, 2021

LPTK commented Mar 1, 2021 •

edited

Loading

liufengyun commented Mar 1, 2021

LPTK commented Mar 1, 2021 •

edited

Loading

liufengyun commented Mar 1, 2021

LPTK commented Mar 1, 2021

abgruszecki commented Mar 12, 2021

odersky commented Apr 6, 2021 •

edited

Loading

LPTK commented Apr 6, 2021

mario-bucev commented Apr 7, 2021

abgruszecki commented Apr 7, 2021

odersky commented Apr 14, 2021

LPTK commented Apr 14, 2021

LPTK commented Apr 14, 2021

smarter commented Apr 14, 2021

dwijnand commented Sep 10, 2021

dwijnand commented Sep 10, 2021

Overconstrained GADT bounds in presence of bad bounds #11545

Overconstrained GADT bounds in presence of bad bounds #11545

Comments

mario-bucev commented Feb 26, 2021 • edited Loading

Compiler version

Minimized code

Output

Expectation

LPTK commented Feb 27, 2021

LPTK commented Mar 1, 2021

liufengyun commented Mar 1, 2021

LPTK commented Mar 1, 2021 • edited Loading

liufengyun commented Mar 1, 2021

LPTK commented Mar 1, 2021 • edited Loading

liufengyun commented Mar 1, 2021

LPTK commented Mar 1, 2021

abgruszecki commented Mar 12, 2021

odersky commented Apr 6, 2021 • edited Loading

LPTK commented Apr 6, 2021

mario-bucev commented Apr 7, 2021

abgruszecki commented Apr 7, 2021

odersky commented Apr 14, 2021

LPTK commented Apr 14, 2021

LPTK commented Apr 14, 2021

smarter commented Apr 14, 2021

dwijnand commented Sep 10, 2021

dwijnand commented Sep 10, 2021

mario-bucev commented Feb 26, 2021 •

edited

Loading

LPTK commented Mar 1, 2021 •

edited

Loading

LPTK commented Mar 1, 2021 •

edited

Loading

odersky commented Apr 6, 2021 •

edited

Loading