Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add structured annotations to P4 grammar. Issue #789 #796

Merged
merged 10 commits into from
Apr 6, 2020
Merged

Add structured annotations to P4 grammar. Issue #789 #796

merged 10 commits into from
Apr 6, 2020

Conversation

chrispsommers
Copy link
Contributor

No description provided.

Copy link
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.

@stefanheule
Copy link

I have a few questions:

  • It appears that string literals are allowed, but integer expressions as long as they evaluate to a constant at compile time. Why the discrepancy?
  • Are integers infinite-precision? Or is there a bit width?
  • Would it make sense to allow boolean constants?
  • It seems that multiple annotations with the same name accumulate their contents, which allows the mixing of expressions and key-value pairs. However, you can't mix them in the same annotation. That seems odd, why is that?
  • What happens if a key-value list has the same key several times?

Probably some of these questions have multiple reasonable answers, but they should probably all be answered before integrating this into the spec.

@jafingerhut
Copy link
Collaborator

@stefanheule In response to your question: "It appears that string literals are allowed, but integer expressions as long as they evaluate to a constant at compile time. Why the discrepancy?"

Is your question "Why aren't expressions whose results is a string allowed, as well as expressions whose result is an integer?" If so, I believe one answer would be: "P4 has no expressions whose result is a string, except string literals". i.e. no string concatenation operations, substrings, etc. So one way to think of it is that as far as P4_16 is defined now, it does allow arbitrary expressions whose result is a string, but those currently always look like string literals.

@stefanheule
Copy link

Makes sense, I didn't think about that. I would prefer a phrasing that treats all constants the same, even if right now there is no functional difference. But obviously this is a very minor point.

@jafingerhut
Copy link
Collaborator

It does seem reasonable to say that there could be arbitrary expressions that at compile time can be evaluated, and result in one of the acceptable types, e.g. if boolean is added to the list, then the result of the expression could be any value of type string, int, or boolean.

@chrispsommers
Copy link
Contributor Author

@stefanheule @jafingerhut Thanks for the comments, I agree any allowable types should require the same requirement for compile-time evaluation.

@antoninbas
Copy link
Member

@stefanheule regarding integer precision, I'll copy here what I posted to the p4-api mailing list

Regarding one topic we discussed at the meeting: support for infinite-precision integers as values in the body of structured annotations. I think one thing we could do is specify in the P4Runtime spec (and not the P4 spec) that P4Info only supports signed 64-bit integers. When generating P4Info, if we encounter a value that doesn't it within an int64, we can print an error message inviting the user to switch to a string.

@chrispsommers
Copy link
Contributor Author

@stefanheule Thanks for the insightful comments and questions.

It appears that string literals are allowed, but integer expressions as long as they evaluate to a constant at compile time. Why the discrepancy?

Agreed, all types should be evaluate to compile-time constants and I should update the wording (but bear in mind P4 treats strings very simply at this point in time).

Are integers infinite-precision? Or is there a bit width?

As @antoninbas said, the language should allow infinite-precision but P4Runtime or any other consumer may impose restrictions e.g. 64-bit.

Would it make sense to allow boolean constants?

That seems to be the popular consensus, I'll revise accordingly

It seems that multiple annotations with the same name accumulate their contents, which allows the mixing of expressions and key-value pairs. However, you can't mix them in the same annotation. That seems odd, why is that?

I was trying to keep it simple for the language and grammar. I'll leave this as-is and see how it goes in the LDWG with the other changes/clarifications. I'm not sure many use-cases need free mixing of the kv's and expressions, but let's see what others think. One of the suggestions in the API WG was to keep it simple and sufficient for known use-cases.

What happens if a key-value list has the same key several times?

Should be an error, I'll clarify that.

@mihaibudiu
Copy link
Contributor

We proposed removing from the spec the request for annotations to be merged.

@liujed
Copy link
Member

liujed commented Dec 16, 2019

During today's discussion, there was a question of whether it's useful to disambiguate empty lists, empty kv-lists, and empty unstructured annotations for annotation consumers.

@stefanheule
Copy link

Another thing that was suggested was to allow P4 lists as a value in addition to int, bool and string.

- No merging of multiple annotations
- No mixing of kvPair and expressionLists
- Allow empty expressionLists; empty body is disambiguated to empty list
- kvPair value may be expressionList, including empty list
- Cite illegal cases
- Provide more exammples.
- kvPairs have a value type of expression instead of expressionList; note expression itself can contain lists inside {}.
-  No valueless keys
- Clarified descriptions
- Updated examples, removed section headings from them
@stefanheule
Copy link

Right now the spec says

All expressions (and nested expressions) within a stucturedAnnotationBody
must evaluate at compile-time to string literals, infinite-precision integers or
booleans.

It is not clear what kind of nested expressions are allowed, and it is also not clear if nested expressions can themselves be nested.

Is {1,2,{1,2,{1,2}}} okay? Is {a=1,b=2} a valid "compile-time constant" expression?

Another question I had was if there is any kind of type checking? Can list elements be of different types?

@chrispsommers
Copy link
Contributor Author

@stefanheule Please see answers to your questions below. Do you still feel the spec requires clarification?

It is not clear what kind of nested expressions are allowed, and it is also not clear if nested expressions can themselves be nested.

This is actually specified quite precisely in the grammar portion of the spec. An expression can be of the form {expression, expression..} i.e. a nested expression.

Is {1,2,{1,2,{1,2}}} okay?

Yes! (I added an example to p4lang/p4runtime#265 which prompted this discussion.)

Is {a=1,b=2} a valid "compile-time constant" expression?

Not according to the grammar for an expression. This looks more like an assignment statement.

Another question I had was if there is any kind of type checking? Can list elements be of different types?

According to the grammar, each element in an expression list can be a different type.

Bear in mind P4Runtime will impose its own constraints on Structured Annotations and it is a policy to not specify those constraints in the P4 language spec itself. So you could write something in "legal p4-16" which would throw an error in the P4Info generator (generating P4Info is a p4c command-line option so the program can still be "compiled"). For example, an expression list containing nested kvPairs would be OK in P4-16 but not for P4Info.

@stefanheule
Copy link

This is actually specified quite precisely in the grammar portion of the spec. An expression can be of the form {expression, expression..} i.e. a nested expression.

The spec does not say what is and is not a compile-time constant, this needs to explicitly defined in my opinion. Compile-time constant probably makes sense for base types like int, but it less clear for nested types. Can we just explicitly list and say what is allowed?

To give a few examples of why this is necessary:

  • I asked if {a=1,b=2} is a valid compile-time constant (it's a kvPair expression), and you said no, but it's not clear to me why not. It is an expression (in the expression grammar: '{' kvList '}'), and it only has constants.
  • What about a[3] if a at compile time is known to be a fixed array?
  • What about {1,2}[1]? I think this is legal P4, at least according to the grammar.

Bear in mind P4Runtime will impose its own constraints on Structured Annotations and it is a policy to not specify those constraints in the P4 language spec itself. So you could write something in "legal p4-16" which would throw an error in the P4Info generator (generating P4Info is a p4c command-line option so the program can still be "compiled"). For example, an expression list containing nested kvPairs would be OK in P4-16 but not for P4Info.

I don't think this is a great idea. If we want to restrict what is and isn't valid, lets be clear about that and put it in the spec. There can be exceptions, of course, like requiring ints to fit in a certain bit-width, but we should not have a completely different set of expressions be valid in P4 and P4Info. What's the advantage of allowing nested kvPairs in the spec but not P4Info? Isn't P4Info the main use-case for this?

Finally, a last question: Are we going to be able to implement this spec easily in the compiler? Does the compiler have code to constant-fold nested expressions?

@jnfoster
Copy link
Collaborator

@stefanheule

Here is the relevant section of the specification. Happy to see it improved if you see things that are missing or confusing.

@stefanheule
Copy link

Ah, thank you Nate, I missed that. That answers most of my concerns then, but maybe it needs to be improved a bit. Is {a=1,b=2} a compile-time constant according to that definition? It seems no, but it also seems like const a = {a=1,b=2} makes a a compile-time constant? It would be good to have clarity on that.

And maybe Chris can comment on why there is a discrepancy in the allowed expressions in P4 vs P4Info if there still is any (other than max bit width).

@chrispsommers
Copy link
Contributor Author

@stefanheule Regarding your comment:

I don't think this is a great idea. If we want to restrict what is and isn't valid, lets be clear about that and put it in the spec. There can be exceptions, of course, like requiring ints to fit in a certain bit-width, but we should not have a completely different set of expressions be valid in P4 and P4Info. What's the advantage of allowing nested kvPairs in the spec but not P4Info? Isn't P4Info the main use-case for this?

Your objection to having the language spec allow more complexity of expressions than P4Runtime is understandable. In fact, at one time I'd added content to p4-16 spec which was more specifically-tailored to P4Info on an earlier version of structured annotations: a special case of kv-pairs inside (). I had the same inclination to state it all in the language spec. But the language experts on the WG wanted the spec to remain more "pure" from P4-Runtime and it was removed (later this became moot because we removed all structure from regular annotations). So I'm following the "party line" here as I interpret it. Feel free to make your case otherwise, it's an open forum.

And maybe Chris can comment on why there is a discrepancy in the allowed expressions in P4 vs P4Info if there still is any (other than max bit width).

It's a convenience that structured annotations are capitalizing on existing grammar for expressions, and thereby also benefits from the implementation which parses, checks and reduces expressions. This doesn't mean we have to support every permutation for P4Info.

There was a sentiment expressed in some earlier P4-API WG meetings to start out with simple support of structured annotations for known use-cases. There was concern that making it too complex would only delay implementation and multiply the work enormously (imagine all the test-cases we'd have to create). Since nobody made a case for arbitrarily-nested kv-lists of kv-lists of expression-lists (a contrived example), those cases aren't supported. We just stuck with something simple and immediately useful.

@stefanheule
Copy link

To me that sentiment extends to the P4 spec as well, though. Otherwise, if you take the P4Info doesn't have to support everything that P4 supports to the extreme, then there is no need for a new kind of annotations at all, and you can just use the new unconstrained annotations, but have P4Info only support what further parses into whatever format we want to support.

If we want to disallow nested kv-lists (which seems like a reasonable idea), then we should say so in the P4 spec. If we want to allow it, we should allow it in P4Info, too.

Copy link

@stefanheule stefanheule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than my small suggestions looks great, thank you.

@chrispsommers
Copy link
Contributor Author

chrispsommers commented Mar 11, 2020

@liujed Regarding #796 (comment) and the following text in the PR:

To avoid ambiguity, an empty structuredAnnotationBody is by definition
considered to contain an empty expressionList.

The P4-API WG is proposing we remove this statement from the spec. The current PR p4lang/p4c#2227 doesn't really do this (empy anno body results in empty kvLists and expressionLists in the IR) and P4Info would be burdened with generating an empty expressionList to remain consistent with spec clause above.

Any objections/concerns from a language/grammar point of view?

p4-16/spec/P4-16-spec.mdk Outdated Show resolved Hide resolved
p4-16/spec/P4-16-spec.mdk Outdated Show resolved Hide resolved
p4-16/spec/P4-16-spec.mdk Outdated Show resolved Hide resolved
@liujed-intel
Copy link

@liujed Regarding #796 (comment) and the following text in the PR:

To avoid ambiguity, an empty structuredAnnotationBody is by definition
considered to contain an empty expressionList.

The P4-API WG is proposing we remove this statement from the spec. The current PR p4lang/p4c#2227 doesn't really do this (empy anno body results in empty kvLists and expressionLists in the IR) and P4Info would be burdened with generating an empty expressionList to remain consistent with spec clause above.

Any objections/concerns from a language/grammar point of view?

This seems fine to me. I can't immediately think of any cases where it's worth distinguishing between the two kinds of empty lists.

@liujed-intel
Copy link

liujed-intel commented Mar 11, 2020

What happens if we have something like:

@my_anno("foo")
@my_anno["bar"]
table t { ... }

This appears permitted by the spec, since it's not explicitly forbidden. Seems odd to permit this, while forbidding multiple structured annotations with the same name.

@chrispsommers
Copy link
Contributor Author

chrispsommers commented Mar 12, 2020

@liujed That's an example of a duplicate error.

@liujed [UPDATE 3/13/20] My Bad! My eyes must have been blurry. I saw square brackets on both annotations. I'll follow up with separate comment below.

@mihaibudiu
Copy link
Contributor

The spec and the implementation do not make this clear.

@chrispsommers
Copy link
Contributor Author

chrispsommers commented Mar 12, 2020

@mbudiu-vmw I'm not sure what isn't clear; here is the spec text:

It is illegal to duplicate a key within the kvList of a structured
annotation. It is illegal to use the same annotation name on a given
annotationToken for structured annotations.

p4-16/spec/P4-16-spec.mdk Outdated Show resolved Hide resolved
@chrispsommers
Copy link
Contributor Author

chrispsommers commented Mar 13, 2020

Regarding #796 (comment) I made a mistake reading it.

@my_anno("foo")
@my_anno["bar"]
table t { ... }

This is legal because the two annotations are distinct types of annotations. @liujed wrote:

This appears permitted by the spec, since it's not explicitly forbidden. Seems odd to permit this, while forbidding multiple structured annotations with the same name.

There was a lot of discussion in the LDWG about what to do if we allowed duplicate uses of a structured annotation - do you merge the newer (parsed contents thereof) with the older, throw an error, or replace? We decided to disallow duplication as the simplest solution.

@liujed-intel
Copy link

liujed-intel commented Mar 13, 2020

I remember that discussion. But I don't think it touched on whether it made sense to mix structured and unstructured instances of an annotation. Naïvely, I'd expect an annotation to be structured or unstructured; having both seems to be a programmer error. Absent any strong use case for mixing the two kinds of annotations, I'd be inclined to do the conservative thing and forbid it.

@chrispsommers
Copy link
Contributor Author

@liujed-intel thanks for the comments. @mbudiu-vmw Can you comment on #796 (comment)? I'm OK with disallowing using the same name for unstructured and structured annotations. How does the implementation deal with this, or does it?

@mihaibudiu
Copy link
Contributor

I implemented both versions at some point. Currently we only reject structured duplicates.

this is more restrictive than previous but deemed better. Empty structured
annotations are not disambiguated to be empty expressionbLists, it seemed
unnecessary. Clarified some other language around structured annotations.
…d another example

to demonstrate illegal reuse of name in both structured and unstructured annotation.
@chrispsommers
Copy link
Contributor Author

I pushed commits to hopefully resolve the above conversations, thanks for all the detailed reviews and feedback. @liujed please review, @mbudiu-vmw this will require a change to p4c to disallow reusing an annotation name between structured and unstructured. Sorry for the reversal, it seems for the best.

;

kvPair
: name '=' expressionList
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably be just name = expression
Because expression can be a list expression, which is { expressionList }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this should definitely be expression. We don't actually allow a list in this position.

@@ -7078,17 +7084,18 @@ annotations
annotation
: '@' name
| '@' name '(' annotationBody ')'
| '@' name '[' structuredAnnotationBody ']'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you can use the grammar from the compiler implementation ; it essentially inlines the structuredAnnotationBody in this rule.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this suggestion is taken, be sure to patch up those places where the text refers to structuredAnnotationBody.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage of inlining here? I think I like that we have a name for structured annotation bodies.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec grammar is almost identical to the bison grammar, because then we know it works. If we deviate it may actually not be implementable. Bison is very peculiar.

Copy link

@liujed-intel liujed-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to have lost the text that specified that a program element can't have multiple structured annotations with the same name.

Comment on lines 7069 to 7071
Structured annotations and unstructured annotations must not use the same
`name`. Thus, a given `name` can only be applied to one type of annotation or
the other.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like it might forbid the following, but it's not clear whether that is the intent. We should clarify and ensure that the implementation matches.

@my_anno(1) table T { ... }
@my_anno[2] table U { ... }

@chrispsommers
Copy link
Contributor Author

chrispsommers commented Mar 26, 2020

@liujed-intel re: #796 (comment) thanks for pointing this out, it is an ambiguity. The spec should read:
Structured annotations and unstructured annotations on any one element must not use the same
name. Thus, a given name can only be applied to one type of annotation or
the other for any one element. An annotation used one one element does not affect the annotation on another because they have different scope.

This is legal:

@my_anno(1) table T { ... }
@my_anno[2] table U { ... }

This is illegal:

@my_anno(1)
@my_anno[2] table U { ... }

Does this make sense?

@chrispsommers
Copy link
Contributor Author

chrispsommers commented Mar 26, 2020

@liujed-intel re:

We seem to have lost the text that specified that a program element can't have multiple structured annotations with the same name.

I propose to add the following paragraph:

Multiple unstructured annotations using the same name can appear on a given
element; they are cumulative. Each one will be bound to that element. In
contrast, only one structured annotation using a given name may appear on an
element; multiple uses of the same name will produce an error.

@chrispsommers
Copy link
Contributor Author

@mbudiu-vmw @stefanheule re: #796 (comment):

The latest PR contains the following and has for awhile, unless I'm mistaken:

kvPair
    : name '=' expression
    ;

@chrispsommers
Copy link
Contributor Author

@mbudiu-vmw re: #796 (comment) I'm not confident interpreting this suggestion; this grammar has been in my PR for a while. Perhaps we can merge my PR and someone else can propose restating the grammar.

@liujed-intel
Copy link

Re: #796 (comment)

Yes, makes total sense, and this was also my thinking. But now that the question is being raised, I wonder whether it makes sense to do the stricter thing instead. At least, that would seem to be more consistent with the restriction that you can't mix structured and unstructured for the same name on a single program element. If you consider all annotations with the same name as expressions of the same concept, then I'd expect consistency with structured vs. unstructured in their usage. Thoughts?

@chrispsommers
Copy link
Contributor Author

chrispsommers commented Mar 26, 2020

@liujed-intel re: #796 (comment) I know what you mean, but I'd err on the side of letting the author write confusing code. It's no different than letting them apply the same unstructured annotation but imply a different meaning from place to place. Scoping ensures correct compilation but we can't always prevent people from writing bad code.

Copy link

@liujed-intel liujed-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine. I'd let the question here simmer for a bit, in case there are other opinions out there.

I also share Mihai's concern about keeping the spec and implementation grammars in sync. I think it may be easier to just change the implementation grammar to match, rather than re-plumb the text in the spec.

@mihaibudiu mihaibudiu merged commit bce115c into p4lang:master Apr 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants