Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust grammar to distinguish paranthesis and brace based initialization in C++ #399

Merged
merged 17 commits into from
Jul 13, 2021

Conversation

cmnrd
Copy link
Collaborator

@cmnrd cmnrd commented Jul 8, 2021

Essentially, this MR extends the grammar to allow {...} for initializing state variables and parameters as an alternative to the current (...) syntax. The validator ensures that this feature is only usable in the C++ target. See below for a more detailed explanation.

There is a semantic difference in C++ between brace based initialization {...} and parenthesis based initialization (...). Generally speaking, the curly braces should be preferred whenever possible. I don't want to go into the details of this, but if you are interested, here is a discussion on stackoverflow. Now in my opinion C++ has messed up quite a bit, because it is not at all clear when it is possible to use the curly braces for initialization. The current policy in the LF C++ code generator is to always use the curly braces for initialization. This works well for simple examples, but causes problems in more complex scenarios, as curly braces can be ambiguous in C++ code...

Let me make an example to illustrate the problem. Consider this state variable:

state foo: std::vector<int>(4,2);

This variable will be initialized in the constructor of the generated reactor. In C++ we have two options for this initialization: foo(4,2) or foo{4,2}. Both will do very different things! foo(4,2) calls the vector constructor 3) which "3) Constructs the container with count copies of elements with value value". So the resulting vector will be initialized to [2, 2, 2, 2]. However, if we use the curly braces, a different constructor will be called as the braces are interpreted as an initializer list: "10) Constructs the container with the contents of the initializer list init.". Thus, the resulting vector will be set to [4, 2]. This is equivalent to calling foo({4, 2}).

Since the current policy in the LF C++ code generator is to always use curly braces, it is impossible to call a std::vector constructor other than the initializer list one. This is also a problem for many other classes. Now we could switch policies and always generate parenthesis based initializers. However, the curly braces have significant advantages over parenthesis, and it would be nice if we are able to use them where it is appropriate. Therefore, this MR introduces the {...} syntax and gives the LF programmer full control over the initialization method to be used in the generated code.

@cmnrd cmnrd requested review from Soroosh129, edwardalee and lhstrh July 8, 2021 09:59
Copy link
Collaborator

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I wonder if we should also allow initializers to be arbitrary identifiers? In C, we sometimes have an awkwardness like this:

    state primes:int*({= NULL =});

or

    state last_invoked:tag_t({= NEVER_TAG =});

@Soroosh129
Copy link
Contributor

Soroosh129 commented Jul 9, 2021

This is a very interesting point.

In Python, { }, ( ) and [ ] all have different meanings (sets/dictionary, tuples, and lists respectively). But of course, something like [4] (a list with only one element of 4) can be somewhat ambiguous (e.g., input inp[4], note the absence of a type which is forbidden in the Python target).

Also, tuples are immutable but lists are mutable, so, parameter lists (initialized in LF using ()) are currently initialized as Python tuples (( )) and state variable lists are initialized as Python lists ([ ]). Opening it up to the programmer can be confusing, perhaps. Maybe the Python target should also only support { } and ( ) and interpret the latter as a list or a tuple depending on the situation.

@cmnrd
Copy link
Collaborator Author

cmnrd commented Jul 9, 2021

I don't think that there is the same problem in Python. In Python, initialization is always done with =, right? So state foo(4) would be translated to foo = 4. For more complex things, we could simply write state foo({= {"foo": 3, "bar": 42} =}) or state foo({= [1, 2, 3, 4] =}). This should still be handled correctly by the code generator. The ambiguity in the C++ example above does not stem from the fact that there are multiple ways to represent containers, but that there are multiple ways to initialize variables and it is not clear when to use which initialization mechanism. By writing state foo(1, 2, 3, 4) or state foo{1, 2, 3, 4} we can let the programmer decide which initialization mechanism to use. This, however, has no influence on the datatype.

Generally, I think it would be beneficial to support a more complex syntax for initialization values, so that we can write state foo:int*(NULL) or state foo([1, 2, 3, 4]), but it should probably be part of another merge request.

@Soroosh129
Copy link
Contributor

Soroosh129 commented Jul 10, 2021

In Python, initialization is always done with =, right?

Right. My point was: what should we do about state foo(4,5,6) in the Python target? I have made a conscious choice of interpreting this as a Python list ([4, 5, 6]), which can be confusing to a Python programmer.

It should probably be part of another merge request.

At least for the Python target, I agree that this should be addressed in another PR.

@cmnrd
Copy link
Collaborator Author

cmnrd commented Jul 12, 2021

Perhaps state foo(4,5,6) should be translated to foo = 4, 5, 6 in Python which would build a tuple. At least this is what I would expect to get when writing this LF code. Then the programmer can explicitly build lists using state foo({=[1,2,3]=}) or tuples of lists state foo({=[]=}, {=[1]=}, {=[1,2=}) (maybe we can extend the grammar, so that [] can be used without escaping).

I wouldn't overload the meaning of our initialization operator () to distinguish different target types. We already have a type annotation syntax if that is needed.

@cmnrd cmnrd mentioned this pull request Jul 12, 2021
24 tasks
@edwardalee
Copy link
Collaborator

While I agree it should be a separate pull request, I would like to see our grammar be much more permissive about the initialization to eliminate the escaping with {= ... =} as much as possible.

@cmnrd
Copy link
Collaborator Author

cmnrd commented Jul 13, 2021

I documented this new feature in the C++ article in our wiki.

@cmnrd cmnrd merged commit 6681258 into master Jul 13, 2021
@cmnrd cmnrd deleted the cpp-types branch August 20, 2021 05:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants