Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python list initializers & equal initializer syntax #544

Closed
wants to merge 81 commits into from

Conversation

oowekyala
Copy link
Collaborator

@oowekyala oowekyala commented Sep 27, 2021

Implement list literals for Python (see discussion #492, issue #507). Now:

  • using (1, 2, 3) as an initializer creates a tuple (previously created a list)
  • using ([1, 2, 3]) as an initializer creates a list

This makes LF syntax in line with python syntax itself and allows writing

  • initializers with an empty list ([])
  • initializers with a one element list ([1])

Previously these scenarios required fat braces, eg {=[]=}.

Using list literals in other targets than Python produces a validator error. When this is merged I'll follow up with the necessary changes for Rust to support this. TS could also support it.

The python code generator is simplified as there is no more special treatment for lists in parameters. Previously, initializing a parameter with (1, 2) created a tuple, while the same initializer for a state variable creates a list. I argue that this was hard to learn and not ergonomic, because

  • the difference is not obvious to a user reading the code
  • it breaks a useful refactoring flow, that abstracts a state variable initializer to a parameter. Namely, when you have the following:
reactor R {
   state var(...)
}

you can easily abstract it by turning it into

reactor R(p(...)) {
   state var(p)
}

and the code means the same, regardless of what "..." is. Previously, this was not true for python if ... is a comma separated list, because the refactoring effectively makes the state variable be initialized to an immutable tuple instead of a mutable list, which may cause unexpected errors at runtime.

I understand that this was supposed to honor the principle that parameters should be immutable, whereas state variables are not. However I think as shown above, this corner case has potent downsides, and doesn't buy us safety anyway: you could always initialize your parameter to a mutable thing, eg {=[1,2]=}, put mutable objects inside the tuple fields, or even set the parameter variable itself (I think), as Python doesn't support final bindings.



Last things to do

  • Address the naming problem of productions
  • Introduce syntax for array = { 1,2 } in C/C++
  • Mandate using = syntax in constructor calls

@Soroosh129
Copy link
Contributor

Soroosh129 commented Sep 27, 2021

Thanks for taking the time to simplify the Python generator. I like the simplified logic of the generator itself. However, I have the following concerns:

  • As you correctly pointed out, the () initializer was interpreted as an immutable tuple for parameters and a mutable list for state variables to prevent serious error conditions that can occur when a parameter is changed in the body of the reaction, or when a reaction attempts to change a state variable, but fails (with a syntax error). I agree that the corner case of assigning parameters to a state variable as an initial value can be problematic. If this corner case is a concern, we can use the list() constructor to convert a tuple parameter to a list when it is initially assigned to a state variable. However, I can see why this would be considered unergonomic to a Python programmer, which brings up my next point.

  • I think that state a([1,2,3]) is very unintuitive. With the updated syntax, defining a state s(1,2,3) will create an immutable tuple, which will produce a syntax error if an attempt is made to modify it. I would say the next natural thing to be tried by the user (assuming that they know Python) is state s[1,2,3], which will fail with a cryptic error. I don't think that it will ever occur to someone, unless they read the instructions carefully somewhere, that state variable array-types should almost always be defined as state a([1,2,3]), especially since the syntax for tuples is not state a((1,2,3)). I would say to be consistent, the syntax for defining array-type state variables should be one of the following:

    1. state a((1,2,3)) for tuple and state a([1,2,3]) for list
    2. state a(1,2,3) for tuple and state a[1,2,3] for list
    3. state a({=(1,2,3)=}) for tuple and state a(1,2,3) for lists (this one is not so much consistent but is less error-prone than the proposal IMO).

    My inclination would be more toward 2 or 3. With option 2, state a(1,2,3) should produce a warning. Similarly, using a[1,2,3] for a parameter should also produce a warning.

Finally, I suggest that the following two tasks must also be dealt with:

  • Update Writing-Reactors-in-Python to reflect the new syntax.
  • Make sure that examples, experimental examples, and benchmarks all use the new syntax.

@oowekyala
Copy link
Collaborator Author

Thanks for your review! I'll get busy with the tasks you outlined.

I don't find ([a,b,c]) unintuitive, as I understand it as "parens around an expression", where the expression is here the [a,b,c] list. In case of (a,b,c), the expression is an unparenthesized tuple (which is perfectly valid python). In case of (a), it's "just a". It's regular and easy to explain to a user.

The state a[1,2,3] syntax (proposition ii) would cause parsing ambiguity in languages that use types... Eg in state a:int[2], do the brackets denote the array type or the initializer?

I would be fine with your proposition (i), but IMO a more satisfactory candidate would be to stop using parentheses to denote assignment, which to me is the unintuitive thing here. If we replace them with an = sign, we get the following pythong:

  • state a = [a, b] for a list
  • state a = (a, b) for a tuple
  • state a = 1 for an integer
  • state a = {= ... =} for something weird
  • ...

The general grammar is just state a = expr for any expr. Then, what exactly an "expr" is can be restricted to expression forms that are close to the target language. Eg as per #492, it's not very useful to support list or tuple syntax in C. The equal sign syntax also plays well with typed languages, as indeed it's the syntax of many typed languages like MLs, Kotlin, Rust, (and also Python's type annotations). So I think it's the most intuitive syntax around. Lastly this syntax already exists in LF, as it's the one of constructor parameter assignments. The syntax state x: t(a, b) can mean "constructor call on type t" for C++ and other languages that have constructors.

If that is not an option, then I'd prefer your proposal (i) or my proposal.

@cmnrd
Copy link
Collaborator

cmnrd commented Sep 28, 2021

I agree with @oowekyala that our current syntax can be described as state name: type(expr), or state name: type(expr, expr, expr, ...) for C++ constructors. In this regard, I find it intuitive (even as a Python programmer), to write state foo([1, 2, 3]). The parenthesis simply enclose the expression that is to be assigned. I think that this provides the best clarity. Using state foo[3] would be very confusing, as we use the [N] syntax to denote fixed size lists.

@oowekyala might be right about our (expr) syntax being unintuitive (at least for some languages). I would be fine with extending the grammar so that we can either write state name(expr) or state name = expr. Which option we use could be target specific. This would probably help to reduce the syntax clutter in some of our targets. However, I am not sure how the assignment syntax would play along with our type annotations. state name: type = expr looks a bit weird to me.

@Soroosh129
Copy link
Contributor

I also agree with @oowekyala that our current assignment syntax could use a bit of a boost. I think the end result of having state a = [a, b] for a list and state a = (a, b) for a tuple will be intuitive and easy to understand. One downside is that people might automatically assume that the right-hand side can contain any arbitrary target language expression without knowing about {=...=}. We might be able to mitigate this by a clear syntax error that points to using {=...=} if possible, I'm not sure.

state name: type = expr looks a bit weird to me

How so? I think Swift, Scala, and Rust seem to be using this var name: type = expr type of assignment.

After reading your comments and trying out a few examples and patterns in Python, I am convinced that () and ([]) are intuitive enough, although just to be safe, can we allow both () and (()) (both will be equivalent)?

@cmnrd
Copy link
Collaborator

cmnrd commented Sep 28, 2021

How so? I think Swift, Scala, and Rust seem to be using this var name: type = expr type of assignment.

I guess it's just me then :). This syntax would definitely be more natural for Rust and Python programmers

Copy link
Collaborator

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Please be sure to update the wiki pages for the affected target languages so that this new syntax is documented.

@Soroosh129
Copy link
Contributor

@oowekyala Is this ready to merge?

Copy link
Collaborator

@cmnrd cmnrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good!

Should we open an issue for the assignment operator discussion? I think it's important to keep track of it.

@Soroosh129
Copy link
Contributor

Soroosh129 commented Oct 4, 2021

Please don't forget to update the wiki and the example and the benchmark before merging in.

@oowekyala
Copy link
Collaborator Author

I've just updated the wiki. I reorganised the language specification page a bit (added an appendix about types and one about expressions, to clean up the section about parameter declarations, which had the burden of explaining everything since it's the first section on the page). Feel free to change things.

I just realised i forgot to update the python page though...

@oowekyala
Copy link
Collaborator Author

Ok I think I'm done with the wiki, and AFAICS no more code needs to be updated

I'll merge this shortly and open the issue @cmnrd referred to

@Soroosh129
Copy link
Contributor

Ok I think I'm done with the wiki, and AFAICS no more code needs to be updated

I'll merge this shortly and open the issue @cmnrd referred to

Thank you for updating the wiki. There is a sentence in the Python entry that is left unfinished:

The initial value is given by the expression enclosed in parentheses. Many different

@oowekyala
Copy link
Collaborator Author

thanks for spotting this, it's fixed now!

@codecov
Copy link

codecov bot commented Oct 4, 2021

Codecov Report

Merging #544 (f08bd72) into master (9087aa8) will increase coverage by 0.07%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #544      +/-   ##
============================================
+ Coverage     66.43%   66.50%   +0.07%     
- Complexity     3466     3487      +21     
============================================
  Files           132      133       +1     
  Lines         22294    22325      +31     
  Branches       2876     2873       -3     
============================================
+ Hits          14810    14847      +37     
+ Misses         6345     6332      -13     
- Partials       1139     1146       +7     
Impacted Files Coverage Δ
org.lflang/src/org/lflang/Target.java 100.00% <100.00%> (ø)
...g.lflang/src-gen/org/lflang/lf/impl/ValueImpl.java 57.14% <0.00%> (-0.81%) ⬇️
org.lflang/src-gen/org/lflang/lf/LfPackage.java 100.00% <0.00%> (ø)
...rg.lflang/src-gen/org/lflang/lf/util/LfSwitch.java 0.00% <0.00%> (ø)
...g/src-gen/org/lflang/lf/util/LfAdapterFactory.java 0.00% <0.00%> (ø)
...ng/src-gen/org/lflang/lf/impl/ListLiteralImpl.java 26.92% <0.00%> (ø)
...lang/src-gen/org/lflang/lf/impl/LfPackageImpl.java 99.29% <0.00%> (+0.01%) ⬆️
...end-gen/org/lflang/validation/LFValidatorImpl.java 50.94% <0.00%> (+0.24%) ⬆️
org.lflang/xtend-gen/org/lflang/ASTUtils.java 76.17% <0.00%> (+0.37%) ⬆️
...lang/src-gen/org/lflang/lf/impl/LfFactoryImpl.java 67.52% <0.00%> (+0.51%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9087aa8...f08bd72. Read the comment docs.

Copy link
Member

@lhstrh lhstrh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused about parameter assignments like foo = ([1, 2]).

@lhstrh
Copy link
Member

lhstrh commented Oct 4, 2021

I agree with @oowekyala that our current syntax can be described as state name: type(expr), or state name: type(expr, expr, expr, ...) for C++ constructors. In this regard, I find it intuitive (even as a Python programmer), to write state foo([1, 2, 3]). The parenthesis simply enclose the expression that is to be assigned. I think that this provides the best clarity. Using state foo[3] would be very confusing, as we use the [N] syntax to denote fixed size lists.

I agree with this, but this reasoning should not extend to parameter assignments which use a = instead of parentheses. In other words, I would like to make sure that new Foo(bar=([1, 2])) means that bar is assigned a tuple of size 1 that has a list of numbers in it. If this is not true (and we're effectively ignoring the outer parentheses, saying that the assignment is just a list of numbers) then I think we have a problem.

@oowekyala
Copy link
Collaborator Author

oowekyala commented Oct 4, 2021

I agree with this, but this reasoning should not extend to parameter assignments which use a = instead of parentheses. In other words, I would like to make sure that new Foo(bar=([1, 2])) means that bar is assigned a tuple of size 1 that has a list of numbers in it. If this is not true (and we're effectively ignoring the outer parentheses, saying that the assignment is just a list of numbers) then I think we have a problem.

This was always a problem in the parameter syntax. You could always add useless parentheses.

I think it's counter intuitive that (expr) would not be the same thing as expr. Parentheses don't change the meaning of an expression in any language I know, they just change operator precedence. In Python and Rust, to create a tuple of size 1, you need a trailing comma: (1,). This means in both languages (e) = e. and for types (T) = T.

So I'm fine with having duplicate syntax in parameter assignment, especially if we're phasing out using parentheses for assignment anyway. Also, parenthesized expressions could become a useful expression form if at some point we introduce syntax for infix operators.

But, we should possibly make sure that (1,) creates a tuple properly in Python and not an integer, which I think is currently not the case.

@lhstrh
Copy link
Member

lhstrh commented Oct 4, 2021

After some discussion with @Soroosh129, I'm convinced that this (and the current implementation of lists in LF) needs more discussion. I think it's better to hold off from merging this into master; I propose to table this until after the tutorial.

@oowekyala
Copy link
Collaborator Author

Things that are diverging from the spec (#986)

  • ctor calls allow the three assignment syntaxes and not just =
  • unclear if c++ codegen for ctor calls matches the spec
  • Python dictionaries are unsupported, though braced init lists are.

@lhstrh
Copy link
Member

lhstrh commented Sep 17, 2022

Now that you're back in action, @oowekyala, should we try to revive this effort? I feel like there were a lot of good contributions in this PR. @cmnrd addressed some of them, I believe, but there's still work in this branch that hasn't made it to master...

@cmnrd
Copy link
Collaborator

cmnrd commented Sep 19, 2022

Right. I already briefly spoke with @oowekyala about this. I also still have it on my TODO List to create an issue detailing several action items on the way forward. I'll try to take care of this soon.

@lhstrh lhstrh modified the milestones: 0.4.0, 0.5.0 Oct 11, 2022
@lhstrh
Copy link
Member

lhstrh commented Oct 11, 2022

Note that I silently moved this to release 0.5.0. Do you think, @oowekyala and @cmnrd, that you would have the bandwidth to address these lingering issues this fall? I realize this PR is very stale at this point, but the ideas that were implemented here I think are important and should find their way into master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants