Incorrect overload selected for Array(Array(T)) vs Array(T) #8973

asterite · 2020-03-29T20:31:11Z

def foo(a : Array(T)) forall T
  puts "one"
end

def foo(a)
  puts "two"
end

z = [5].as(Array(Array(Int32)) | Array(Int32))
foo(z)

Prints "two" but should print "one". I'm still not exactly sure why this happens.

asterite · 2020-03-29T20:33:02Z

Without the dependency on Array:

class Gen(T)
end

def foo(a : Gen(T)) forall T
  puts "one"
end

def foo(a)
  puts "two"
end

z = Gen(Int32).new.as(Gen(Gen(Int32)) | Gen(Int32))

foo(z)

stevensonmt · 2020-03-29T21:28:51Z

Not sure if it's relevant, but the bug appears inconsistent.

def foo(a : Array(T)) forall T
  a.each_with_index do |a, i|
    p "first call for #{a} of class #{a.class}"
    foo(a)
  end
end

def foo(a : T) forall T
  p "second call for #{a} of class #{a.class}"
  p a
end  

foo([1, [2, [[3]], [4, [[5]]]]])

"first call for 1 of class Int32"
"second call for 1 of class Int32"
1
"first call for [2, [[3]], [4, [[5]]]] of class Array(Array(Array(Array(Int32)) | Int32) | Array(Array(Int32)) | Int32)"
"first call for 2 of class Int32"
"second call for 2 of class Int32"
2
"first call for [[3]] of class Array(Array(Int32))"
"second call for [[3]] of class Array(Array(Int32))"
[[3]]
"first call for [4, [[5]]] of class Array(Array(Array(Int32)) | Int32)"
"first call for 4 of class Int32"
"second call for 4 of class Int32"
4
"first call for [[5]] of class Array(Array(Int32))"
"first call for [5] of class Array(Int32)"
"first call for 5 of class Int32"
"second call for 5 of class Int32"
5

stevensonmt · 2020-03-30T01:19:56Z

I'm not enough of a mathematician to do this correctly, but bear with me.
So numbers belong to the set N.
[[3]] and [[5]] belong to the same set or type A for N ⊂ A.
[2, [[3]]] and [4, [[5]]] would also belong to the same set or type B for which (A | N) ⊂ B.
So the array [2, [[3]], [4, [[5]]] is a collection C which is B(N | B). When the compiler sees this pattern it determines that N | B is the unit type T. When calling the first method on C it will call the second method on any B that is not B(B). Because [[3]] is A and is not B, the second method is called. Because [4, [[5]]] is B, the first method is called. For this instance the unit type T is no longer N | B, but rather N | A. Because A is defined as containing N but not itself, N is the unit type T for the first method called on A.

If any of that makes sense and is factual, the compiler's actions are surprising but correct. The expected output would only be possible if the compiler waits to define the unit type T until all possible recursion is done.

asterite · 2020-03-30T01:45:43Z

It might be a correct guess but there's no recursion going on.

It works wrong because we have a bug in our code.

stevensonmt · 2020-03-30T01:55:47Z

Recursion might be the wrong term for the algorithm process, but it does seem to have something to do with the recursive union type.
B(A|B) triggers it when C(A|B) does not.

jgaskins · 2020-03-31T14:35:31Z

Doing some old-fashioned puts debugging in the compiler and discovered this when I ran it against @stevensonmt's example code:

Crystal::Matches(
 @cover=true,
 @matches=
  [#<Crystal::Match:0x10cf32330
    @arg_types=[Array(Array(Int32))],
    @context=
     #<Crystal::MatchContext:0x10cf52900
      @def_free_vars=["T"],
      @defining_type=main,
      @free_vars={"T" => Array(Int32)},
      @instantiated_type=main,
      @strict=false>,
    @def=
     def foo(a : Array(T)) forall T
  a.each_with_index do |a, i|
    p("first call for #{a} of class #{a.class}")
    foo(a)
  end
end,
    @named_arg_types=nil>,
   #<Crystal::Match:0x10cf32300
    @arg_types=[(Array(Array(Int32)) | Int32)],
    @context=
     #<Crystal::MatchContext:0x10cf52800
      @def_free_vars=["T"],
      @defining_type=main,
      @free_vars={"T" => (Array(Array(Int32)) | Int32)},
      @instantiated_type=main,
      @strict=false>,
    @def=
     def foo(a : T) forall T
  p("second call for #{a} of class #{a.class}")
  p(a)
end,
    @named_arg_types=nil>],
 @owner=main,
 @success=true)

Crystal::Type#lookup_matches_without_parents is correctly matching Array(Array(Int32)) (the [[3]]) to the first method but also lumping it in with Int32 in a second match. I haven't figured out how that match is used yet, but it seems to be choosing the second match somewhere.

faustinoaq · 2020-03-31T14:38:02Z

Interesting bug, if I try this: https://carc.in/#/r/8t59

def foo(a : Array(T)) forall T
  a.each_with_index do |a, i|
    p "first call for #{a} of class #{a.class}"
    foo(a)
  end
end
 
def foo(a : Array(Array(Int32)))
  a.each_with_index do |a, i|
    p "first call for #{a} of class #{a.class}"
    foo(a)
  end
end
 
def foo(a : T) forall T
  p "second call for #{a} of class #{a.class}"
  p a
end  
 
foo([1, [2, [[3]], [4, [[5]]]]])

Then I get this:

"first call for 1 of class Int32"
"second call for 1 of class Int32"
1
"first call for [2, [[3]], [4, [[5]]]] of class Array(Array(Array(Array(Int32)) | Int32) | Array(Array(Int32)) | Int32)"
"first call for 2 of class Int32"
"second call for 2 of class Int32"
2
"first call for [[3]] of class Array(Array(Int32))"
"first call for [3] of class Array(Int32)"
"first call for 3 of class Int32"
"second call for 3 of class Int32"
3
"first call for [4, [[5]]] of class Array(Array(Array(Int32)) | Int32)"
"first call for 4 of class Int32"
"second call for 4 of class Int32"
4
"first call for [[5]] of class Array(Array(Int32))"
"first call for [5] of class Array(Int32)"
"first call for 5 of class Int32"
"second call for 5 of class Int32"
5

jgaskins · 2020-03-31T14:50:43Z

Looks like the bug is somewhere in this file. Haven't found it yet. I'll look some more after work today.

asterite · 2020-03-31T15:25:22Z

The problem is that this is a bug where the problem is not the code. We need to define how we want it to work first.

Here's the code again:

class Gen(T)
end

def foo(a : Gen(T)) forall T
  puts "one"
end

def foo(a)
  puts "two"
end

z = Gen(Int32).new.as(Gen(Gen(Int32)) | Gen(Int32))

foo(z)

The type of z is a union of:

Gen(Gen(Int32))
Gen(Int32)

When the compiler tries to match that union type against Gen(T) it will do this:

Take the first type in the union: Gen(Gen(Int32)). It matches against Gen(T). T becomes Gen(Int32).
The the second type in the union: Gen(Int32). Try to match it against Gen(T). However, we said T is Gen(Int32). So it's matching against Gen(Gen(Int32)). No match.

The only match is the first one, Gen(Gen(Int32)), where T is Gen(Int32).

Then the next overload is considered, and the entire union matches.

That's why we are seeing that behavior.

How it should work? Should T become the union of those types? But if T is Gen(Int32) | Int32 it would mean we had Gen(Gen(Int32) | Int32) in the first place, which is not the case.

We basically need to dispatch the union type to two different instantiations of the same overload. But the code never does that in any place of the compiler.

So this is a huge change, if we want to implement it.

Or... other ideas of how this should match?

jgaskins · 2020-03-31T15:43:34Z

This makes me question whether I understand forall. My mental model is that it instantiates a new overload for every single type it was called with, including splitting unions. For example:

def foo(a : T) forall T
end

foo [1, "b"].first

In this example, it seems that there would be 2 overloads for foo — one for Int32 and one for String. Is that correct or is there just one for Int32 | String?

RX14 · 2020-03-31T15:59:55Z

@jgaskins there is just one, for the union type.

It works exactly the same way without forall. forall doesn't change which instantiations occur, it just increases the expressiveness of type restrictions.

asterite · 2020-03-31T16:09:33Z

Exactly. forall just lets you talk about that T and use it in code later on. But it doesn't change how things match.

That said, in my mind forall should also expand union types and provide one overload for each, but that's not the case. And maybe we shouldn't do it. Should we do it for each subclass of a type hierarchy too?

Another thing to consider:

class Gen(T)
end

def foo(a : Gen(T)) forall T
  p T
end

foo(Gen(Int32).new || Gen(Char).new)

Currently fails:

Error: no overload matches 'foo' with type (Gen(Char) | Gen(Int32))

Overloads are:
 - foo(a : Gen(T))
Couldn't find overloads for these types:
 - foo(a : Gen(Int32))

It's because Gen(Char) matches, T becomes Char, then Gen(Int32) doesn't match and there's no overload to cover that.

The entire situation is very confusing.

My advice is to avoid using forall if not needed. In the example in the forums the T wasn't needed at all.

stevensonmt · 2020-03-31T16:09:51Z

The type of z is a union of:

Gen(Gen(Int32))

Gen(Int32)

When the compiler tries to match that union type against Gen(T) it will do this:

Take the first type in the union: Gen(Gen(Int32)). It matches against Gen(T). T becomes Gen(Int32).

The the second type in the union: Gen(Int32). Try to match it against Gen(T). However, we said T is Gen(Int32). So it's matching against Gen(Gen(Int32)). No match.

The only match is the first one, Gen(Gen(Int32)), where T is Gen(Int32).

Then the next overload is considered, and the entire union matches.

That's why we are seeing that behavior.

How it should work? Should T become the union of those types? But if T is Gen(Int32) | Int32 it would mean we had Gen(Gen(Int32) | Int32) in the first place, which is not the case.

We basically need to dispatch the union type to two different instantiations of the same overload. But the code never does that in any place of the compiler.

So this is a huge change, if we want to implement it.

Or... other ideas of how this should match?

Thanks for the clear delineation of the problem. I never expected to uncover such a fundamental bug. How important is this corner case?

asterite · 2020-03-31T16:20:02Z

How important is this corner case?

There's lucky framework, amber, athena and thousand of shards out there. We have a compiler. There's the standard library with a lot of features.

It's a bug but it's not preventing people from creating wonderful things. I think this is a very low priority bug. It's pretty rare to have to match against Array(T) and wanting to get that T. It's a bit leaky. And if you really need to do that, you can do it by doing typeof(array.first) or something like that too. So there are workarounds.

jgaskins · 2020-03-31T18:10:38Z

It works exactly the same way without forall. forall doesn't change which instantiations occur, it just increases the expressiveness of type restrictions.

Exactly. forall just lets you talk about that T and use it in code later on. But it doesn't change how things match.

Ahh, okay, similar to the idea of &block vs yield (purposely ignoring the block inlining for yield for this analogy), where they both match the exact same way but the &block version gives you a reference to the block itself but there's only one instantiation of the &block overload?

HertzDevil · 2021-01-30T09:51:52Z

Another thing to consider:

class Gen(T)
end

def foo(a : Gen(T)) forall T
  p T
end

foo(Gen(Int32).new || Gen(Char).new)

Currently fails:

Error: no overload matches 'foo' with type (Gen(Char) | Gen(Int32))

Overloads are:
 - foo(a : Gen(T))
Couldn't find overloads for these types:
 - foo(a : Gen(Int32))

If we dispatch here, the same can be said about a plain free var restriction:

def foo(x : T) forall T
  p T
end

foo(1 || 'a') # => Int32
foo('a' || 1) # => Char

as dynamic dispatch would generate separate overloads for T = Int32 and T = Char. IMO this is undesirable as it leads to semantic consequences like the following:

def foo(x : T) forall T
  [x]
end

# without dynamic dispatch:
foo(1 || 'a').class   # => Array(Char | Int32)
foo('a' || 1).class   # => Array(Char | Int32)
typeof(foo(1 || 'a')) # => Array(Char | Int32)
typeof(foo('a' || 1)) # => Array(Char | Int32)

# with dynamic dispatch:
foo(1 || 'a').class   # => Array(Int32)
foo('a' || 1).class   # => Array(Char)
typeof(foo(1 || 'a')) # => (Array(Char) | Array(Int32))
typeof(foo('a' || 1)) # => (Array(Char) | Array(Int32))

So I am inclined towards not allowing dynamic dispatch here; when one writes Gen(T) forall T, the static type of the argument must never be a union. Instead there is the following way to perform dynamic dispatch on a static union type:

class Gen(T)
end

def foo_impl(a : Gen(T)) forall T
  p T
end

def foo(x : T) forall T
  {% begin %}
    case x
      {% for t in T.union_types %}
      in {{ t }} then foo_impl(x)
      {% end %}
    end
  {% end %}
end

foo(Gen(Int32).new || Gen(Char).new) # => Int32
foo(Gen(Char).new || Gen(Int32).new) # => Char

foo(1) # Error: no overload matches 'foo_impl' with type Int32

Perhaps we could extract this idiom into a pseudo-method. A regular method probably wouldn't work; if we replace foo_impl(x) with yield x, the static type of the block argument would still be a union of Gens.

asterite · 2021-01-30T10:14:05Z

Great analysis, thank you! And I agree with you.

j8r · 2021-01-30T13:17:31Z

This issue is probably the same as mine, which does not involve generics:

record Child, str : String = "bar"
record Parent, str : String = "foo", nested : Child = Child.new

def serialize(obj : T, de_unionize = false) forall T
  String.build do |io|
    each_ivar obj do |ivar|
      io << (de_unionize ? de_unionize(ivar) : serialize(ivar))
    end
  end
end

def serialize(str : String)
  str
end

def each_ivar(object : O, &) forall O
  {% for ivar in O.instance_vars %}
    yield object.@{{ivar}}
  {% end %}
end

def de_unionize(object : U) forall U
  {% for u in U.union_types %}
    return serialize object if object.is_a? {{u}}
  {% end %}
end

p serialize Parent.new                    # => "foo"
p serialize Parent.new, de_unionize: true # => "foobar"

As @HertzDevil suggested, I did have to implement a method to de-unionize.

HertzDevil · 2021-02-13T00:45:41Z

So if Gen isn't equivalent to Gen(T) forall T in type restrictions, which one should Gen(_) be equivalent to?

HertzDevil · 2021-03-10T14:44:01Z

There is also the following:

def foo(x : Array(Array(T))) forall T
  puts "1, #{T}"
end

def foo(x : Array(T)) forall T
  puts "2, #{T}"
end

def foo(x : T) forall T
  puts "3, #{T}"
end

foo([[1]] || [1]) # => 1, Int32
foo([1] || [[1]]) # => 3, (Array(Array(Int32)) | Array(Int32))

If we follow the no dispatches approach, both calls should print 3, because Array(Array(T)) forall T and Array(T) forall T never match the static union type. But if the first overload doesn't involve a free var:

def foo(x : Array(Array(Int32)))
  puts "1, #{Int32}"
end

def foo(x : Array(T)) forall T
  puts "2, #{T}"
end

foo([[1]] || [1]) # Error: no overload matches 'foo' with type (Array(Array(Int32)) | Array(Int32))
foo([1] || [[1]]) # Error: no overload matches 'foo' with type (Array(Array(Int32)) | Array(Int32))

then the calls should print 1 and 2 respectively; that is, we still want to perform union dispatch in the same way the following is supported:

def foo(x : Array(Array(Int32)) | Array(T)) forall T
  puts "#{x}, #{T}"
end

foo([1] || [[1]]) # [1], Int32
foo([[1]] || [1]) # [[1]], Int32

HertzDevil · 2021-09-01T05:07:05Z

We could perform dispatch over free variables only when the usual unification fails. More specifically, suppose we have:

def foo(x : Array(T), y : Array(T)) forall T
end

If a type like Array(Int32) | Array(String) is used then unification will fail. In this case, we expand each restriction to a fictitious Union of copies of the original restriction, with the free variables in each copy renamed:

def foo(
  x : Array(T_11) | Array(T_12) | Array(T_13) | ... ,
  y : Array(T_21) | Array(T_22) | Array(T_23) | ...
) forall T_ij...
end

These Unions are matched against the argument types and every T_*j produces a different overload to be dispatched, with the additional requirement that if the same free variable is used in multiple parameters, only the common matches that appear on all such parameters (i.e. T_1* & T_2* & T_3* & ...) count as successful substitutions:

# T_1* = {Int32}
# T_2* = {Int32}
# T = {Int32}
foo([1], [1]) # note that normal unification should also succeed here

# T_1* = {Int32, String}
# T_2* = {Int32, String}
# T = {Int32, String}
# missing overloads for `foo([1], [""])` and `foo([""], [1])`
foo([1] || [""], [1] || [""])

# T_1* = {Int32, Char}
# T_2* = {Int32, String}
# T = {Int32}
# missing overloads for `foo([1], [""])`, `foo(['a'], [1])`, and `foo(['a'], [""])`
foo([1] || ['a'], [1] || [""])

# T_1* = {Int32 | String}
# T_2* = {String}
# T = ø
# missing overloads for everything
foo([1 || ""], [""])

def bar(x : Array(T), y : Array(U)) forall T, U
end

# T_1* = {Int32, String}
# U_1* = {Int32, String}
# T × U = {(Int32, Int32), (Int32, String), (String, Int32), (String, String)}
bar([1] || [""], [1] || [""])

# T_1* = {Int32 | String}
# U_1* = {String}
# T × U = {((Int32 | String), String)}
bar([1 || ""], [""])

At this stage any bare free variable restrictions expand union types to their power sets: (a similar treatment could be done for Tuple(T) and NamedTuple(a: T) provided they don't appear within other generics)

def foo(x : T, y : Array(T)) forall T
end

# T_1* = {NoReturn, Int32}
# T_2* = {Int32 | String}
# T = ø
# missing overloads for everything
foo(1, [1 || ""])

# T_1* = {NoReturn, Int32, String, Int32 | String}
# T_2* = {Int32 | String}
# T = {Int32 | String}
foo(1 || "", [1 || ""])

# T_1* = {NoReturn, Int32, String, Int32 | String}
# T_2* = {Int32}
# T = {Int32}
# missing overloads for `foo("", [1])`
foo(1 || "", [1])

These unions and power sets are only needed for a formal definition; the compiler probably won't create them.

I would like to add that the following should unify normally for any pairs of argument types (it currently doesn't):

def foo(x : T, y : T) forall T
  T
end

foo(1, "") # => Int32 | String

The standard library, to my knowledge, doesn't do this outside the Atomic primitives, but these also have Pointer(T) restrictions so matching would remain unaffected.

straight-shoota · 2021-09-01T08:17:35Z

Hm, I'm wondering if this would actually be correct.

I would interpret def foo(x : T, y : T) forall T that both arguments are supposed to have identical types. And not that the compiler should construct a union type to fit different types.

On the other hand, you can also just cast both arguments to Int32 | String explicitly. Maybe having the compiler do that implicitly wouldn't be too bad. But it still seems unintuitive.

What's the point in restricting both arguments to the same type then, if any pair of types would be acceptable?

HertzDevil · 2022-01-07T15:02:44Z

Revisiting this issue and you're probably right that T should restrict exactly to the argument type. In that case we have:

def foo(x : T, y : Array(T)) forall T
end

# T_1* = {Int32}
# T_2* = {Int32 | String}
# T = ø
# missing overloads for everything
foo(1, [1 || ""])

# T_1* = {Int32 | String}
# T_2* = {Int32 | String}
# T = {Int32 | String}
foo(1 || "", [1 || ""])

# T_1* = {Int32 | String}
# T_2* = {Int32}
# T = ø
# missing overloads for everything
foo(1 || "", [1])

# T_1* = {Int32 | String}
# T_2* = {Int32, String}
# T = ø
# missing overloads for everything
foo(1 || "", [1] || [""])

asterite added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:semantic labels Mar 29, 2020

HankAnderson mentioned this issue Jun 18, 2020

Error when extracting free variable from a union of generic types #9498

Closed

jwoertink mentioned this issue Jun 18, 2020

Compiler: implement autocasting in a better way #9501

Merged

HertzDevil mentioned this issue Jan 21, 2021

Consider free variables and underscores when determining overload ordering #10042

Closed

This was referenced Feb 6, 2021

Oddity when matching same free var against multiple arguments #10370

Open

Union type restrictions with multiple free vars #10371

Open

HertzDevil mentioned this issue Mar 14, 2021

Allow unions of same-type containers for certain container methods #10510

Draft

HertzDevil mentioned this issue May 11, 2021

Covariance and contravariance #3803

Open

HertzDevil mentioned this issue Aug 20, 2021

Enforce hash key types at compile time #8893

Draft

HertzDevil added the topic:compiler:generics label Sep 1, 2021

HertzDevil mentioned this issue Sep 24, 2021

Error: undefined constant U #11244

Closed

HertzDevil mentioned this issue Mar 18, 2022

Upper-bounded free variables in def restrictions #11908

Open

HertzDevil mentioned this issue Dec 2, 2022

Casting to generic from its non-generic supertype via an ivar leads to infinite loop in the compiler #12815

Open

HertzDevil mentioned this issue Dec 9, 2022

Type restriction against default value #12829

Open

HertzDevil mentioned this issue Apr 26, 2023

Unions of same-type collections cannot be passed to methods with free variables #13398

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incorrect overload selected for Array(Array(T)) vs Array(T) #8973

Incorrect overload selected for Array(Array(T)) vs Array(T) #8973

asterite commented Mar 29, 2020

asterite commented Mar 29, 2020

stevensonmt commented Mar 29, 2020

stevensonmt commented Mar 30, 2020

asterite commented Mar 30, 2020

stevensonmt commented Mar 30, 2020

jgaskins commented Mar 31, 2020

faustinoaq commented Mar 31, 2020

jgaskins commented Mar 31, 2020

asterite commented Mar 31, 2020

jgaskins commented Mar 31, 2020

RX14 commented Mar 31, 2020 •

edited

Loading

asterite commented Mar 31, 2020

stevensonmt commented Mar 31, 2020 •

edited

Loading

asterite commented Mar 31, 2020 •

edited

Loading

jgaskins commented Mar 31, 2020

HertzDevil commented Jan 30, 2021

asterite commented Jan 30, 2021

j8r commented Jan 30, 2021

HertzDevil commented Feb 13, 2021

HertzDevil commented Mar 10, 2021

HertzDevil commented Sep 1, 2021 •

edited

Loading

straight-shoota commented Sep 1, 2021

HertzDevil commented Jan 7, 2022

Incorrect overload selected for Array(Array(T)) vs Array(T) #8973

Incorrect overload selected for Array(Array(T)) vs Array(T) #8973

Comments

asterite commented Mar 29, 2020

asterite commented Mar 29, 2020

stevensonmt commented Mar 29, 2020

stevensonmt commented Mar 30, 2020

asterite commented Mar 30, 2020

stevensonmt commented Mar 30, 2020

jgaskins commented Mar 31, 2020

faustinoaq commented Mar 31, 2020

jgaskins commented Mar 31, 2020

asterite commented Mar 31, 2020

jgaskins commented Mar 31, 2020

RX14 commented Mar 31, 2020 • edited Loading

asterite commented Mar 31, 2020

stevensonmt commented Mar 31, 2020 • edited Loading

asterite commented Mar 31, 2020 • edited Loading

jgaskins commented Mar 31, 2020

HertzDevil commented Jan 30, 2021

asterite commented Jan 30, 2021

j8r commented Jan 30, 2021

HertzDevil commented Feb 13, 2021

HertzDevil commented Mar 10, 2021

HertzDevil commented Sep 1, 2021 • edited Loading

straight-shoota commented Sep 1, 2021

HertzDevil commented Jan 7, 2022

RX14 commented Mar 31, 2020 •

edited

Loading

stevensonmt commented Mar 31, 2020 •

edited

Loading

asterite commented Mar 31, 2020 •

edited

Loading

HertzDevil commented Sep 1, 2021 •

edited

Loading