Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix scope related bug #53

Closed
eddycharly opened this issue Mar 3, 2023 · 39 comments · Fixed by #66
Closed

fix scope related bug #53

eddycharly opened this issue Mar 3, 2023 · 39 comments · Fixed by #66

Comments

@eddycharly
Copy link
Collaborator

This test should fail:

	{
		args: args{
			expression: `let({"b": @.a}, &a.b)`,
			data: map[string]interface{}{
				"a": 42.0,
			},
		},
		want: 42.0,
	}

Once &a has been looked up from the scope, we should not use the scope again to lookup .b.

@springcomp am I right thinking we have a bug here ?

@springcomp
Copy link
Contributor

springcomp commented Mar 3, 2023

@eddycharly that’s a good question and my interpretation of the spec (which I did not write) is that it should still be using the scope up for the entire duration of the let() evaluation.

So indeed:

  • a -> 42.0
  • b
    • Is evaluated based on the current context, which returns null
    • Therefore, it is looked up in the scope, and is found to be 42.0

So I do not think that is a bug.
But there is definitely room for interpretation, which is not satisfying.

@springcomp
Copy link
Contributor

The non specified behaviour is actually triggered by this part of the spec:

Parent scopes are created by nested let() calls.

Which implicitely can be interpreted as if a scope has a lifetime tighed to the entire duration of the let() function invocation.

@eddycharly
Copy link
Collaborator Author

My interpretation is that once the scope chain was used to lookup something, this something becomes the only thing available for further processing.

With let({"b": @.a}, &a.b):

  • a -> no lookup has occurred before (current context is { a: 42 } and scope chain is { b: { a: 42 } }), we lookup a, current context becomes 42 and scope chain becomes null
  • .b -> there's no .b in 42, result is null

From what I understand scope chain applies only until something is looked up from the chain.

How can we clarify this point of the spec ?

@eddycharly
Copy link
Collaborator Author

I'm definitely not comfortable with the current behaviour :(

@springcomp
Copy link
Contributor

From what I understand scope chain applies only until something is looked up from the chain.

For clarity, my understanding of your interpretation is that the scope chain should apply:

  • Only until something is looked up from the chain, and returns successful evaluation,
  • At the end of the current invocation for let() one level is discarded from the stack and the scope is available again.

Consider this compliance tests (which I did not write either):

Given:

{
  "nested": {
    "a": {
      "b": {
        "c": {
          "fourth": "fourth"
        },
        "third": "third"
      },
      "second": "second"
    },
    "first": "first"
  }
}

The following expression:

nested.let({level1: first}, &a.let({level2: second}, &b.let({level3: third}, &c.{first: level1, second: level2, third: level3, fourth: fourth})))

Returns:

{
  "first": "first",
  "second": "second",
  "third": "third",
  "fourth": "fourth"
}

In the third nested level, the multi-hash expression tries to evaluate the expression:

{ first: level1, second: level2, third: level3, fourth: level4 }.

  • level1 is not found in the current context, therefore is looked up in the scope chain.
  • level1 is found by walking up the scope chain and evaluates to first.

Is your interpretation so that at this point, the scope chain should be considered as not available for the rest of the evaluation?
Or is your interpretation more in line with the following nested level sub-expression:

{ first: level1.level2, second: level2, third: level3, fourth: level4 }

Returning:

{
  "first": null,
  "second": "second",
  "third": "third",
  "fourth": "fourth"
}

I think that’s a point worth clarifying indeed.

How can we clarify this point of the spec ?

We need to discuss the JEP with the wider community – although there is not much traction at the moment.
The tricky thing is that the spec has been designed by James who appear to no longer provide feedback.

@eddycharly
Copy link
Collaborator Author

Kindly ping, @jamesls would you mind clarifying the expected behaviour here ? 🙏 🤞

@springcomp
Copy link
Contributor

@mtdowling 🙏

@springcomp
Copy link
Contributor

springcomp commented Mar 3, 2023

@eddycharly for context:

jmespath/jmespath.site#6

This bit in particular addresses your concerns.

@eddycharly
Copy link
Collaborator Author

@springcomp thanks for the links !

Unfortunately I still find the current behaviour quite confusing.

The way we stack scopes is pretty clear but the way we use the scopes to resolve fields is not.
I see proposals in the discussion to use a dedicated token but it seems to bring other problems too.

I wonder if it would make sense to use a function to enable lookup in scopes instead 🤔

@eddycharly
Copy link
Collaborator Author

In my understanding, the result of evaluating the expression:

nested.let({level1: first}, &a.let({level2: second}, &b.let({level3: third}, &c.{first: level1, second: level2, third: level3, fourth: fourth})))

would be:

{
  "first": null,
  "second": null,
  "third": null,
  "fourth": "fourth"
}

because &c.{...} is looking up c which is found in the current context and overrides scopes previously defined.

@eddycharly
Copy link
Collaborator Author

eddycharly commented Mar 3, 2023

This could be rewritten to:
nested.let({level1: first}, &a.let({level2: second}, &b.let({level3: third}, &c.let({level4: fourth}, &{first: level1, second: level2, third: level3, fourth: level4}))))

To get the expected behaviour though.

@springcomp
Copy link
Contributor

springcomp commented Mar 3, 2023

While I agree the current behaviour with sub-expression is unclear, I think it’s reasonable and intuitive to expect a multi-select-hash to be thought of as evaluating disctint expressions, each one having access to the original scope and context. A bit like evaluating each argument to a function does not make the context suddenly disappear / be more restricted for subsequent function arguments.

Specifically, I would expect the full original context and scope chain to be available when evaluating:

  • Each property of a multi-select-hash
  • Each element of a multi-select-list
  • Each argument of a function
  • Each evaluating stream in a projection
  • etc.

@eddycharly
Copy link
Collaborator Author

We could also use a function for that:
nested.let({level1: first}, &a.let({level2: second}, &b.let({level3: third}, &c.ref(&{first: level1, second: level2, third: level3, fourth: fourth}))))

In the expression above the ref function is used to enable scope lookups.

@springcomp
Copy link
Contributor

springcomp commented Mar 4, 2023

We could also use a function for that […]

I would favor a design where the let() function works as a standalone feature and does not require additional function support. In fact, the more I think of this, the more I’m convinced we are not far off and only need minor adjustments.

For instance, I will try to illustrate what I think are entirely reasonable and actually quite intuitive expressions and their results – using the ✅ symbol – with contrast to what I think would be counter-intuitive – using the ❌ symbol – hereafter.

Please, note that all the intuitive (to me) examples below are currently working as I would expect them to.

multi-select-hash

  • search( let( { foo: 'foo', bar: 'bar' }, &{ foo: foo, bar: bar } ) , null ) -> { "foo": "foo", "bar": "bar" }
  • search( let( { foo: 'foo', bar: 'bar' }, &{ foo: foo, bar: bar } ) , null ) -> { "foo": "foo", "bar": null }

multi-select-list

  • search( let( { foo: 'foo', bar: 'bar' }, &[ foo, bar ] ) , null ) -> [ "foo", "bar" }
  • search( let( { foo: 'foo', bar: 'bar' }, &[ foo, bar ] ) , null ) -> [ "foo", null }

projections

  • let( { prefix: 'prefix' }, &[*].join('-', [prefix, @]) , [ "one", "two" ] ) -> [ "prefix-one", "prefix-two" ]
  • let( { prefix: 'prefix' }, &[*].join('-', [prefix, @]) , [ "one", "two" ] ) -> [ "prefix-one", null ]

Some more contrived examples:

search(
  let( { foo: 'foo', lo: 'length of', is: 'is' } , &[*].let( { str: @, len: to_string(length(@)) }, &join(' ', [lo, str, is, len]) ) ),
  [ "one", "two", "three" ]
) -> [ "length of one is 3", "length of two is 3", "length of three is 5" ] 

⬆️✅

search(
  let( { foo: 'foo', lo: 'length of', is: 'is' } , &[*].let( { str: @, len: to_string(length(@)) }, &join(' ', [lo, str, is, len]) ) ),
  [ "one", "two", "three" ]
) -> [ null, null, null ] 

⬆️❌

In fact, this expression should be completely equivalent to:

search(
  let( { foo: 'foo', lo: 'length of', is: 'is' } , &[*].join(' ', let( { str: @, len: to_string(length(@)) }, &[lo, str, is, len] ) ) ),
  [ "one", "two", "three" ]
) -> [ "length of one is 3", "length of two is 3", "length of three is 5" ] 

⬆️✅

Which I expect would still work with your interpretation.

Additionnaly, I think this would render the whole feature useless as you may want to create a nested scope from the outer scope. For instance, like so:

  • search( let( { foo: 'foo' }, &let( { foobar: join('', [foo, 'bar']) }, &[foo, foobar] ) ), null ) -> [ "foo", "foobar" ]

Whereas, I’m not sure in which way you would expect this expression to fail.
Does it fail because foo has already been looked-up successfully while constructing the inner scope?
Or does it fail because foo has been lookup-up successfully as part of creating the target multi-select-list?

  • search( let( { foo: 'foo' }, &let( { foobar: join('', [foo, 'bar']) }, &[foo, foobar] ) ), null ) -> [ null, null ]
  • search( let( { foo: 'foo' }, &let( { foobar: join('', [foo, 'bar']) }, &[foo, foobar] ) ), null ) -> [ "foo", null ]

@springcomp
Copy link
Contributor

springcomp commented Mar 4, 2023

I think the main – if not only – source of confusion comes from the arguably counter-intuitive behaviour when evaluating the RHS of a sub-expression. In that case, I would be happy to close that source of confusion. This would be quite easy (I think) to implement, as each new sub-expression would create a new fresh scope chain.

Something like:

    case parsing.ASTSubexpression:
      left, err := intr.Execute(node.Children[0], value)
      if err != nil {
        return nil, err
      }
      if left == nil {
        return nil, nil
      }
-    return intr.Execute(node.Children[1], left)
+    return intr.WithScope(nil).Execute(node.Children[1], left)

@eddycharly
Copy link
Collaborator Author

I'm definitely not sure what should be done.

Right now, I'm considering adding a configuration option to let the user decide if he wants to support let or not.

@springcomp
Copy link
Contributor

I'm definitely not sure what should be done.

The solution might indeed not be as simple as I thought in my post above.

Right now, I'm considering adding a configuration option to let the user decide if he wants to support let or not.

I would really like we work together towards improving / clarifying the spec. Library authors are indeed free to offer additional features but I think it would be best to prevent fragmentation. At the very worse, such a flag would need to be opt out rather than opt in which might not be satisfying to you.

I agree it would be great to close the sub expression confusion.

But let's focus on my examples above.
Irrespective of the implementation is there any green checkbox that you would argue against ?

@jamesls
Copy link

jamesls commented Mar 10, 2023

Kindly ping, @jamesls would you mind clarifying the expected behaviour here ? 🙏 🤞

A comment I made in the original proposal called out that it's not entirely clear what the expected behavior should be.

In seems like there's a lot of interest in this feature, so I updated the original tracking issue with a new proposal that's essentially "just use the let <binding> in <expr> syntax" that a lot of functional programming languages already use. Easy to explain and immediately understandable to anyone that's used a functional programming language before.

More details here, let me know what you think: jmespath/jmespath.site#6 (comment)

@eddycharly
Copy link
Collaborator Author

eddycharly commented Mar 10, 2023

Thanks @jamesls !

I asked a small question in jmespath/jmespath.site#6 directly.

It looks close to what I had in mind #53 (comment)

@springcomp
Copy link
Contributor

springcomp commented Mar 22, 2023

@eddycharly the more I study this, the more I’m convinced we are on the right track.

Indeed, I am convinced that we need chained scope objects, organized in stack to demonstrate the full potential for this feature. The following discussion demonstrates that we need stacked scope objects, unless I am missing something.

Additionally, I came to the realization that:

  • I’m not at all sold to introducing keywords into the language. I’m inclined to oppose it actually.
  • Discussing an alternative proposal could prove a lengthy process.
  • JMESPath Community brings its value with its ecosystem of implementation libraries.
  • JEP-11 brings more benefits in my opinion than it introduces confusion.
  • This repository is the only library currently not supporting JEP-11.

Therefore, may I kindly request that you please re-introduce JEP-11 into this repository?

At the same time, I will obsolete JEP-11 in favor of my preferred JEP-11a option which, for reference is:

  • Exactly the same as JEP-11, including having scope objects organized in a stack.
  • With the exception that looking up an identifier should be made explicit using the $<identifier> construct.

My goal is to write and publish JEP-11a as soon as I can to the best of my abilities.

@eddycharly
Copy link
Collaborator Author

Indeed, I am convinced that jmespath-community/jmespath.spec#161 (comment) scope objects, organized in stack to demonstrate the full potential for this feature

Strangely enough I'm convinced with the opposite. To me accessing something from a parent scope does not sound natural. It doesn't work this way in most programming languages, functions have a well defined signature and you never access variables declared in the caller implicitly.

Therefore, may I kindly request that you please re-introduce JEP-11 into this repository?

I will add it back with an option to turn it off. As an end user I don't want to ship our product with support for such a confusing syntax.

Exactly the same as JEP-11, including having scope objects organized in a stack.
With the exception that looking up an identifier should be made explicit using the $ construct.

It would be nice to get this discussed, if we want to go this way I think I prefer the introduction of a keyword based syntax just because it becomes possible to check bindings are valid at compile time.
Again, when we ship our product we want to be able to validate expressions as much as we can.

@springcomp
Copy link
Contributor

Strangely enough I'm convinced with the opposite. To me accessing something from a parent scope does not sound natural. It doesn't work this way in most programming languages, functions have a well defined signature and you never access variables declared in the caller implicitly.

In the sake of hearing all arguments, can you please try and implement the scenario referred to above with your proposal?

I will add [JEP-11] back with an option to turn it off. As an end user I don't want to ship our product with support for such a confusing syntax.

An option to turn this feature off would be a good middle ground, thank you for that.

It would be nice to get [alternatives] discussed, if we want to go this way I think I prefer the introduction of a keyword based syntax just because it becomes possible to check bindings are valid at compile time. Again, when we ship our product we want to be able to validate expressions as much as we can.

I’m very much in favor of discussing alternatives. I have actually started prototyping James’ proposal and I’m not particularly pleased with the result to be honest. If that is the consensus we land on, however, I will be more than happy to oblige and abandon JEP-11 for good.

@eddycharly
Copy link
Collaborator Author

[*].employee_id
   .let(
     {i: @}, &{
       employee_id: i,
       first_name: first_name,
       last_name: last_name,
       number_of_employees: $[?manager_id == i]|length(@)
     })
     |[?number_of_employees!=`0`]

How does this finds first_name and last_name ?

@eddycharly
Copy link
Collaborator Author

eddycharly commented Mar 22, 2023

[*].with_scope(@, &{
  employee_id: employee_id,
  first_name: first_name,
  last_name: last_name,
  number_of_employees: length($[?manager_id == use_scope(&employee_id)])
}) | [?number_of_employees != `0`]

I'm not sure we need lexical scope here though. Isn't the expression below correct ?

[*].{
  employee_id: employee_id,
  first_name: first_name,
  last_name: last_name,
  number_of_employees: length($[?manager_id == employee_id])
} | [?number_of_employees != `0`]

@springcomp
Copy link
Contributor

springcomp commented Mar 22, 2023

I'm not sure we need lexical scope here though. Isn't the expression below correct ?

[*].{
  employee_id: employee_id,
  first_name: first_name,
  last_name: last_name,
  number_of_employees: length($[?manager_id == employee_id])
}

Hum, not quite.

image

By the way, the expression I shared above works 100% if that’s your question 😁.

I have actually tested it using both the Python implementation – using jpterm as well as the .Net implementation.

The .Net implementation is also surfaced on the JMESPath website’s preview page and you can try it there to convince yourselves.

@eddycharly
Copy link
Collaborator Author

eddycharly commented Mar 22, 2023

Hum, not quite.

Why ? Ok, the filter expression needs the parent of course.

@springcomp
Copy link
Contributor

springcomp commented Mar 22, 2023

By the way, to be on equal footing when discussing with James, we cannot use the $ root-node reference as it does not exist from James’ viewpoint.

@eddycharly
Copy link
Collaborator Author

@springcomp with your solution:

[
  { "employee_id": 4529, "first_name": null, "last_name": null, "number_of_employees": 2 },
  { "employee_id": 4329, "first_name": null, "last_name": null, "number_of_employees": 3 },
  { "employee_id": 4125, "first_name": null, "last_name": null, "number_of_employees": 2 },
  { "employee_id": 4952, "first_name": null, "last_name": null, "number_of_employees": 2 }
]

first_name and last_name are not there.

@eddycharly
Copy link
Collaborator Author

By the way, to be on equal footing when discussing with James, we cannot use the $ root-node reference as it does not exist from James’ viewpoint.

Using $ seems unfair, I feel like it goes against the purpose of this feature, I can easily think about cases where using $ can be used as a workaround but will quickly make things very complicated and unmaintainable.

@springcomp
Copy link
Contributor

springcomp commented Mar 22, 2023

first_name and last_name are not there.

😯 I’m a bit embarrassed as I was focusing on the count.
Let me come up with a better expression :

[*].let( {i:@.employee_id}, &{
  employee_id: i,
  first_name: first_name,
  last_name: last_name,
  number_of_employees: $[?manager_id == i]|length(@)
}) 
|[?number_of_employees!=`0`]

Indeed $ can always be replaced with yet another layer of let({root: @}, &…) to wrap the original object in the outermost scope anyway. That is why I think it is critical we have chained scopes.

@eddycharly
Copy link
Collaborator Author

eddycharly commented Mar 22, 2023

@springcomp for completness here is an expression not using $:

// capture the whole list in scope
with_scope(@, &
  // compose a new scope containing the whole list (taken from parent scope) and the current element
  [*].with_scope({ list: use_scope(&@), current: @ }, &{
    employee_id: employee_id,
    first_name: first_name,
    last_name: last_name,
    number_of_employees: use_scope(&list) | [?manager_id == use_scope(&current.employee_id)] | length(@)
  })
) | [?number_of_employees != `0`]

@springcomp
Copy link
Contributor

springcomp commented Mar 22, 2023

@springcomp for completeness here is an expression not using $

Thank you very much. I now have a deeper understanding for your proposal.
Well, I must admit you are right. You do not need chained scopes in the language.

However, you are effectively creating the chain yourselves in your expression.

But +1 for being stubborn and proving me wrong. 🙊

@eddycharly
Copy link
Collaborator Author

eddycharly commented Mar 22, 2023

Well, I must admit you are right. You do not need chained scopes in the language.

We don't need chained scopes, but it doesn't mean we don't want them ;)
My proposal is very explicit, original JEP-11 is not explicit at all, and the @jamesls new proposal is more explicit than JEP-11 but less than my proposal.

Relaxing on being explicit is less verbose and in this sense I like the @jamesls new proposal because while being less explicit and potentially weak, it is still easily verifyable.

@eddycharly
Copy link
Collaborator Author

eddycharly commented Mar 22, 2023

To compare with @jamesls new proposal:

let $list = @ in
  [*].let $current = @ in
    {
      employee_id: employee_id,
      first_name: first_name,
      last_name: last_name,
      number_of_employees: $list | [?manager_id == $current.employee_id] | length(@)
    }

If I make a typo, with this design it's easy to detect it at compile time:

let $list = @ in
  [*].let $current = @ in
    {
      employee_id: employee_id,
      first_name: first_name,
      last_name: last_name,
      // $does_not_exist does not exist and we can easily find it at compile time
      number_of_employees: $does_not_exist | [?manager_id == $current.employee_id] | length(@)
    }

It can look like a detail but from an end user perspective this makes a real difference, I can now warn users of my product that the expression they wrote is incorrect.

@eddycharly
Copy link
Collaborator Author

eddycharly commented Mar 22, 2023

But what I like about isolated scope is that it's extremely simple to know what is available in the current scope, you only need to look at the enclosing with_scope function, no need to look at the parents, the entire scope inside a with_scope call is completely defined in a single place.

@eddycharly
Copy link
Collaborator Author

@springcomp are we generating the lexer/parser code using a tool ? or is this done by hand ?

@springcomp
Copy link
Contributor

springcomp commented Mar 23, 2023

Yes this is simple top-down parser hand-coded I’m afraid.
By the way @jamesls responded so I think we can hold re-introducing JEP-11.

Let’s proceed discussing his proposal.
Exciting times!

@springcomp
Copy link
Contributor

Here is how I would do it.

@eddycharly
Copy link
Collaborator Author

Thanks, will look into it tonight.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants