Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NamedArgument.name == NamedArgument.value in method_missing #10381

Closed
jgaskins opened this issue Feb 10, 2021 · 28 comments · Fixed by #10388
Closed

NamedArgument.name == NamedArgument.value in method_missing #10381

jgaskins opened this issue Feb 10, 2021 · 28 comments · Fixed by #10388

Comments

@jgaskins
Copy link
Contributor

jgaskins commented Feb 10, 2021

class Thing
  macro method_missing(call)
    "{{call.named_args}}"
  end
end

pp Thing.new.omg(foo: "bar")

The expected output here is "[foo: \"bar\"]", but instead it's "[foo: foo]". It seems like both name and value on Crystal::Macros::NamedArgument are returning the same value. You can see that in the following code:

class Thing
  macro method_missing(call)
    {% arg = call.named_args.first %}
    "{{arg.name}} - {{arg.value}}"
  end
end

Thing.new.omg(foo: "bar") # => "foo - foo"
@matthewmcgarvey
Copy link
Contributor

I've though of the value of the named argument as a variable reference to the real value, but thinking about it now, there's no difference between the name and the value in this case.

The only difference I've seen is that if the named arg is like: Thing.new.omg("foo": "bar") the name is "foo" but the value is foo without the quotations which is consequently invalid so the difference doesn't work

@jgaskins
Copy link
Contributor Author

With macros, you're typically getting an expression (rather than a value) rendered in-place into the macro's call site. You can see similar behavior with instance variable default values.

class Person
  @name : String = pp "Jamie"
end

3.times { Person.new }
# "Jamie"
# "Jamie"
# "Jamie"

This isn't happening inside of a method call that we wrote, it's just assigning @name = pp "Jamie" because the assignment assigns the result of an expression rather than the value. Macros work the same way.

This is why this seems like a bug. I'd expect the Call object in method_missing to have foo: "bar" as the NamedArgument. In the following code, it does:

macro foo(call)
  pp {{call.named_args.first.value}}
end

foo(not_even_a_real_method(foo: "bar")) # "bar"

The expression passed to the foo macro isn't an actual method, but it doesn't matter because we're passing it to a macro that isn't actually invoking the expression. It's just showing information about it.

@HertzDevil
Copy link
Contributor

When the compiler invokes the method_missing hook, it must generate a fake call expression that, roughly speaking, has the same shape as the call that actually triggered it:

class Foo
  macro method_missing(call)
    {% p call %}
  end
end

Foo.new.foo(1, 'a', x: "") { |f, g| }
Foo.new.foo(1, 'a', x: "") { |f, g| } # prints nothing because this shape has been handled
Foo.new.foo(2, 'a')
foo(_arg0, _arg1, x: x) do |_block_arg0, _block_arg1|
  yield _block_arg0, _block_arg1
end
foo(_arg0, y: y)

call is this fictitious expression, not the original call; you cannot access the arguments during compile-time. However, the fix is easy:

class Thing
  macro method_missing(call)
    {% arg = call.named_args.first %}
    "{{arg.name}} - #{ {{arg.value}} }"
  end
end

Thing.new.omg(foo: "bar"# => "foo - bar"

@asterite
Copy link
Member

@HertzDevil is right here. method_missing will generate just one instance of the method per call "shape" (signature). That's the arguments and their types. So to give another example:

class Thing
  macro method_missing(call)
    {{ puts call }}
  end
end

thing = Thing.new
thing.omg(foo: "bar")
thing.omg(foo: "baz")
thing.omg(foo: "qux")

This will puts call at compile-time just once, because every call has a similar shape: just a named argument named foo and it's always a String.

This is done to prevent creating one method per actual call. Alternatively the call could be inlined and no method would be generated, but that's not how it works right now, and it's a bit harder to implement.

That said, I've been wanting to remove method_missing from the language. I don't think it's implemented well, it has limitations (for example you can't capture blocks) and in general it leads to a poor API design (what can I do with a type that I don't know what methods it has? what are the docs? what are the accepted types?)

So in Lucky I would try to avoid method_missing as much as possible. If other team members agree, let's remove method_missing.

@straight-shoota
Copy link
Member

=> #4701

@Sija
Copy link
Contributor

Sija commented Feb 10, 2021

@asterite Removing sth that provides value just because you can't think of a way to fix the quirks sounds IMO like a wrong reason to do so.

@asterite
Copy link
Member

@Sija Okay. Then someone needs to come up with a way to implement method missing. If nobody comes up with that then it will be forever broken.

I usually like to remove broken things and reintroduce them once they work correctly. Because if it turns out that there's no way to fix it, then at least it's not broken in the first place (because the feature is not there).

@asterite
Copy link
Member

To put things in perspective: in the beginning, as we saw the language was actually working we experimented with how much Ruby we could get out of it. "Can we have something like method_missing?". So we came up with something. But we actually never needed that. And when coding in Ruby I personally don't use method_missing. So it's just there because Ruby has it and we wanted to see if we can have it too.

Over the time we found many bugs and issues with it and I tried to fix those. But it's never enough.

The point is, we have to maintain a feature that just happens to exist because of experimentation, not actual need.

@naqvis
Copy link
Contributor

naqvis commented Feb 10, 2021

Totally agree with Ary and put my vote to remove such features. IMHO such features are more abused then normal used for experimentation, people with strong scripting or dynamic languages background utilizes such methods/hacks to treat or expect language to behave in similar fashion they were used to. I understand Ruby has very strong influence on design of Crystal, but Crystal ain't Ruby.

@jgaskins
Copy link
Contributor Author

jgaskins commented Feb 10, 2021

It sounds like method_missing isn't the same kind of macro that other macros are. I know all the "hook" macros are a bit of a special case, but with this context this one seems even weirder. I assumed the type checker would detect that the object didn't have that method defined and then inject the method_missing macro body into it. For example:

class Thing
  macro method_missing(call)
    p self
  end
end

# My mental model of it was that this line:
Thing.new.foo("bar")

# ... would compile to something resembling this:
with Thing.new do
  p self
end

This way there would be no new methods created at all. It'd just be a macro like other macros (minus the part where it's executed in the context of the original receiver).

@jgaskins
Copy link
Contributor Author

Is there any reason it couldn't compile to that?

@asterite
Copy link
Member

@jgaskins Yes: instance variables don't work in a with ... do block.

@asterite
Copy link
Member

Also with ... do would have access to the outer block.

I think it could be implemented with something like that, it's not clear to me how. When the compiler tries to resolve a call the resolution is a method. If a method doesn't exist it hits method_missing, and method_missing will create a method that can then be called. It never works like "oh, if there's no call then we'll just replace this entire call with something else". That's the difficult thing of implementing this. Couple that with the fact that the method could have a block and then it's less clear how this can be implemented.

@jgaskins
Copy link
Contributor Author

instance variables don't work in a with ... do block.

Whoa, TIL. I've only maybe used with ... do twice, though.

Also with ... do would have access to the outer block.

I feel like that would be fair. Other Crystal macros do the same. For example, a template rendered by ECR.embed can use all the local variables defined in the lexical scope of the call site.

I think it could be implemented with something like that, it's not clear to me how. When the compiler tries to resolve a call the resolution is a method. If a method doesn't exist it hits method_missing, and method_missing will create a method that can then be called. It never works like "oh, if there's no call then we'll just replace this entire call with something else". That's the difficult thing of implementing this. Couple that with the fact that the method could have a block and then it's less clear how this can be implemented.

It's definitely a tricky concept with a lot of nuance. And I get that it's difficult to pin down, but when it comes to proxying other objects (for example, a client with a connection pool passing calls along to an individual connection), method_missing is the only concept I know of that makes that idea dead-simple to implement. Without it, you'd have to have a bunch of misdirection with mixins, class inheritance, or explicitly defining all the proxied methods.

In the file at the link above, I could probably have used the mixin architecture I already use to distinguish between serial / pipelined / transactional execution, but method_missing is actually more expressive there. And the API docs for that class still make it reasonably clear what's going on (though they are incomplete because I get distracted easily 😂).

@asterite
Copy link
Member

I think the majority of the use cases for method_missing are to forward calls to another object, or something. If that's the case, maybe instead of method_missing the language should provide a way to do that specific thing. For example in Go you can have methods be delegated to structs you include, or something like that.

@straight-shoota
Copy link
Member

the language should provide a way to do that specific thing

We should probably focus on that before the discussion about removing method_missing can move forward. An alternative for the common use case of delegation needs to be available first.

@Sija
Copy link
Contributor

Sija commented Feb 10, 2021

Sth along the lines of #9927 perhaps?

@Fryguy
Copy link
Contributor

Fryguy commented Feb 12, 2021

call is this fictitious expression, not the original call; you cannot access the arguments during compile-time. However, the fix is easy

@HertzDevil In the forum post, @matthewmcgarvey shared some code, that fails to compile when using this technique. However, the compilation error isn't what I expected, and I think that's part of what's leading to some confusion.

Reduced:

class Foo
  macro method_missing(call)
    "{{call.named_args.first.name}} = #{ {{call.named_args.first.value}} }"
  end
end
p Foo.new.foo(a: 1)    # => "a = 1"
p Foo.new.foo("a": 1)  # => "a = 1"
p Foo.new.foo("@a": 1) # Compilation Error:
# ...
#  > 1 | "@a = #{ @a }"
#                 ^-
# Error: can't infer the type of instance variable '@a' of Foo
#
# The type of a instance variable, if not declared explicitly with
# ...

The key part, I think, is that it's taking a NamedTuple that has a key of "@a", but when generating the fake call, doesn't that mean it would generate a named arg of @a: @a, which is invalid syntax? The compilation error presented isn't that it generated invalid method syntax, which is what I'd expect, instead giving a type error when it tries to resolve @a as an instance variable.

(Total aside, is it possible to expand a method_missing with crystal tool expand? I can't figure it out. Was trying to use that to see what was generated, but alas.)

EDIT: There's lots of fun things to play with - "@@a", "a:", "a-1"

@Fryguy
Copy link
Contributor

Fryguy commented Feb 12, 2021

A rough patch like this:

diff --git a/src/compiler/crystal/semantic/method_lookup.cr b/src/compiler/crystal/semantic/method_lookup.cr
index 60eb553a2..6fb7a3991 100644
--- a/src/compiler/crystal/semantic/method_lookup.cr
+++ b/src/compiler/crystal/semantic/method_lookup.cr
@@ -52,6 +52,10 @@ require "../types"

 module Crystal
   record NamedArgumentType, name : String, type : Type do
+    def initialize(@name : String, @type : Type)
+      raise "Invalid NamedArgumentType" unless @name =~ /\A[a-zA-Z0-9_]+\z/
+    end
+
     def self.from_args(named_args : Array(NamedArgument)?, with_literals = false)
       named_args.try &.map do |named_arg|
         new(named_arg.name, named_arg.value.type(with_literals: with_literals))

allows the "good" values to go through, while blowing up when the "bad", non-identifier, is passed.

I'm not sure how to code up a validation like this in an elegant way, however, such that it doesn't appear like a compiler error.

@matthewmcgarvey
Copy link
Contributor

Wouldn't that raise an error when a double splat is used on a normal method?

def foo(**named_args)
  pp named_args
end

foo("@a": 1)

@Fryguy
Copy link
Contributor

Fryguy commented Feb 12, 2021

Yes...like I said it's a "rough patch" 😄

@Fryguy
Copy link
Contributor

Fryguy commented Feb 12, 2021

Less invasive:

diff --git a/src/compiler/crystal/semantic/method_missing.cr b/src/compiler/crystal/semantic/method_missing.cr
index ceb95571a..e79b39faa 100644
--- a/src/compiler/crystal/semantic/method_missing.cr
+++ b/src/compiler/crystal/semantic/method_missing.cr
@@ -40,6 +40,8 @@ module Crystal
       if named_args = signature.named_args
         args_nodes_names << ""
         named_args.each do |named_arg|
+          raise "Invalid NamedArgument name: #{named_arg.name}" unless named_arg.name =~ /\A[a-zA-Z0-9_]+\z/
+
           named_args_nodes ||= [] of NamedArgument
           named_args_nodes << NamedArgument.new(named_arg.name, MacroId.new(named_arg.name))
           args_nodes_names << named_arg.name

EDIT: Alternatively, perhaps it's more elegant to put the validation in Arg

@HertzDevil
Copy link
Contributor

HertzDevil commented Feb 12, 2021

The fix here is also simple: use _named_arg#{i} as the variable names of the named parameters and arguments, instead of copying the key names. method_missing should continue to work as long as the key names remain the same. Code that uses NamedArgument#value will then work, but code that refers to the named arguments by key name will unconditionally break. Like these compiler specs:

it "works with named arguments, using names (#3654)" do
run(%(
class A
macro method_missing(call)
x &+ y
end
end
a = A.new
a.b(x: 1, y: 2)
)).to_i.should eq(3)
end

it "works with named arguments, named args in call (#3654)" do
run(%(
class A
macro method_missing(call)
{{call.named_args[0].name}} &+
{{call.named_args[1].name}}
end
end
a = A.new
a.b(x: 1, y: 2)
)).to_i.should eq(3)
end

Support for named arguments as a whole was originally implemented in #3654, but IMO we really shouldn't be allowing x in that first spec to refer to a named argument. There are other ways to break method_missing even with legal key names:

class Foo
  getter to = 1
  
  forward_missing_to to
end

# forward_missing_to expands to `to.upto(to: to)`
# method lookup finds Int#upto(to) because the parameter name matches
# the first and last `to` refer to the call's named argument, which is `3`
# a true delegation would have ultimately called `1.upto(3)` instead
Foo.new.upto(to: 3).to_a # => [3]
class Foo
  macro method_missing(call)
    self
  end
end

Foo.new.foo(self: 1)
# Module validation failed: Function return type does not match operand type of return inst!
#   ret %Foo* %56, !dbg !75
#  i32 (Exception)
#   from /crystal/src/int.cr:528:9 in 'finish'
#   from /crystal/src/compiler/crystal/codegen/codegen.cr:69:7 in 'codegen'
#   from /crystal/src/compiler/crystal/compiler.cr:172:16 in 'compile'
#   from /crystal/src/compiler/crystal/command.cr:210:5 in 'run_command'
#   from /crystal/src/compiler/crystal/command.cr:117:7 in 'run'
#   from /crystal/src/compiler/crystal.cr:11:1 in '__crystal_main'
#   from /crystal/src/crystal/main.cr:110:5 in 'main'
#   from src/env/__libc_start_main.c:94:2 in 'libc_start_main_stage2'

Instead, the real arguments must be referred to using {{ call.named_args[...].value }}. I think this is a good breaking change to make, though I don't know if it would make its way into 1.0.

@HertzDevil
Copy link
Contributor

HertzDevil commented Feb 12, 2021

To further elaborate, when we do

class A 
  macro method_missing(call) 
    x + y 
  end
end

A.new.b(x: 1, y: 2)

The compiler should behave as if the following def is emitted:

class A
  def b(*, x _named_arg0, y _named_arg1)
    x + y
  end
end

Then compilation fails because x and y are external, not internal parameter names. (see below) Instead, if we use NamedArgument#value:

class A 
  macro method_missing(call) 
    {{call.named_args[0].value}} + {{call.named_args[1].value}}
  end
end

It will be expanded to:

class A
  def b(*, x _named_arg0, y _named_arg1)
    _named_arg0 + _named_arg1
  end
end

@Fryguy
Copy link
Contributor

Fryguy commented Feb 12, 2021

A.new.b("@x": 1, y: 2)

would still produce invalid code, though, right? Something like

class A
  def b(*, @x _named_arg0, y _named_arg1)
    _named_arg0 + _named_arg1
  end
end

@HertzDevil
Copy link
Contributor

HertzDevil commented Feb 12, 2021

Quoted strings are allowed as external names:

def b(*, "@x" _named_arg0, "Y" _named_arg1)
  _named_arg0 + _named_arg1
end

b("@x": 1, Y: 2) # => 3

But it's not important here; method_missing doesn't have to deal with this because it operates over AST nodes directly instead of performing that macro interpolation in user code.

Besides, if one has to use method_missing or forward_missing_to then it is extremely unlikely that all possible missing calls share the same named parameter. So allowing that raw x and y inside method_missing is probably not a good idea to begin with anyway.

@Fryguy
Copy link
Contributor

Fryguy commented Feb 12, 2021

I had no idea param names could be enquoted like that...that's really cool.

@HertzDevil
Copy link
Contributor

Actually, if we have:

class A
  def b(*, x _named_arg0, y _named_arg1)
    x + y
  end

  macro method_missing(call) 
    x + y 
  end
end

then x and y are interpreted as method calls on A, so these are further expanded as:

class A
  def x
    x + y
  end

  def y
    x + y
  end
end

It gives a runtime stack overflow, but there really isn't anything to do about this since method_missing is simply doing its job. In Ruby the same thing happens:

class A
  def method_missing(m, *args, **kwargs, &block)
    # first call:       m = :b, args = [], kwargs = {:x => 1, :y => 2}
    # subsequent calls: m = :x, args = [], kwargs = {}
    x + y
  end
end

A.new.b(x: 1, y: 2) # SystemStackError (stack level too deep)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants