-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding serializable enums (or enums with backing types and specified representations). #615
Changes from 4 commits
33ee2b4
34c7fc1
8b0a64c
83b1efa
4e90588
e5be6fa
8b9f2af
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,7 +5,7 @@ Author: The P4 Language Consortium | |
Heading depth: 5 | ||
Pdf Latex: xelatex | ||
Document Class: [11pt]article | ||
Package: [top=1in, bottom=1.25in, left=1in, right=1in]{geometry} | ||
Package: [top=1in, bottom=1.25in, left=1in, right=1in]geometry | ||
Package: fancyhdr | ||
|
||
Tex Header: | ||
|
@@ -2026,12 +2026,22 @@ An enumeration type is defined using the following syntax: | |
~ Begin P4Grammar | ||
enumDeclaration | ||
: optAnnotations ENUM name '{' identifierList '}' | ||
| optAnnotations ENUM typeRef name '{' specifiedIdentifierList '}' | ||
; | ||
|
||
identifierList | ||
: name | ||
| identifierList ',' name | ||
; | ||
|
||
specifiedIdentifierList | ||
: specifiedIdentifier | ||
| specifiedIdentifierList ', ' specifiedIdentifier | ||
; | ||
|
||
specifiedIdentifier | ||
: name '=' initializer | ||
; | ||
~ End P4Grammar | ||
|
||
For example, the declaration | ||
|
@@ -2043,10 +2053,55 @@ enum Suits { Clubs, Diamonds, Hearths, Spades } | |
introduces a new enumeration type, which contains four | ||
constants---e.g., `Suits.Clubs`. An `enum` declaration | ||
introduces a new identifier in the current scope for naming the | ||
created type. The underlying representation of such values is not | ||
created type. The underlying representation of the `Suits` enum is not | ||
specified, so their "size" in bits is not specified (it is | ||
target-specific). | ||
|
||
It is also possible to specify an `enum` with an underlying representation. | ||
This requires the programmer provide both the fixed-width integer type and an associated | ||
fixed-width integer value for each symbolic entry in the enumeration. For example, the | ||
declaration | ||
|
||
~ Begin P4Example | ||
enum bit<16> EtherType { | ||
VLAN = 0x8100, | ||
QINQ = 0x9100, | ||
MPLS = 0x8847, | ||
IPV4 = 0x0800, | ||
IPV6 = 0x86dd | ||
// ... | ||
} | ||
~ End P4Example | ||
|
||
introduces a new enumeration type, which contains five constants---e.g., | ||
`EtherType.IPV4`. This `enum` declaration specifies the fixed-width integer representation | ||
for each entry in the `enum` and provides an underlying type: `bit<16>`. | ||
This type of `enum` declaration can be thought of as declaring a new `bit<16>` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we allow signed enum values as well? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've used unsigned in all of the examples, but I don't think there is any particular reason to restrict it to this. Do you have an opinion? (I'm happy to add a case for it.) |
||
type, where variables or fields of this type are expected to be unsigned 16-bit | ||
integer values, and the mapping of symbolic to numeric values defined by the | ||
`enum` are effectively constants defined as a part of this type. In this way, | ||
an `enum` with an underlying type can be thought of as being a type derived | ||
from the underlying type carrying equality, assignment, and casts to/from the | ||
underlying type. | ||
|
||
Compiler implementations are expected to raise an error if the fixed-width integer representation | ||
for an enumeration entry falls outside the representation range of the underlying | ||
type. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems like the issue of multiple entries having the same representation should be discussed somewhere in here. For example, whether this is allowed:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point. Given the contents of the rest of the proposal, it seems that this should not be allowed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 to disallow this |
||
|
||
For example, the declaration | ||
|
||
~ Begin P4Example | ||
enum bit<8> FailingExample { | ||
first = 1, | ||
second = 2, | ||
third = 3, | ||
unrepresentable = 300 | ||
} | ||
~ End P4Example | ||
|
||
would result in an error because the fixed-width integer value associated with | ||
`FailingExample.unrepresentable` cannot be represented as a `bit<8>` value. | ||
|
||
Annotations, represented by the non-terminal `optAnnotations`, are | ||
described in Section [#sec-annotations]. | ||
|
||
|
@@ -2278,14 +2333,17 @@ infinite-precision integer, without a width specified. | |
| `error` | error | error | allowed | | ||
| `match_kind` | error | error | error | | ||
| `bool` | error | error | allowed | | ||
| `enum` | error | error | allowed | | ||
| `enum` | allowed[^enum_header] | error | allowed | | ||
| `header` | error | allowed | allowed | | ||
| header stack | error | error | allowed | | ||
| `header_union` | error | error | allowed | | ||
| `struct` | error | error | allowed | | ||
| `tuple` | error | error | allowed | | ||
|----------------|-----------|---------|----------| | ||
|
||
[^enum_header]: an `enum` type used as a field in a `header` must specify a | ||
underlying type and representation for `enum` elements. | ||
|
||
Rationale: `int` does not have precise storage requirements, | ||
unlike `bit<>` or `int<>` types. `match_kind` | ||
values are not useful to store in a variable, as they | ||
|
@@ -2717,13 +2775,163 @@ X.v1 // reference to v1 | |
v1 // error - v1 is not in the top-level namespace | ||
~ End P4Example | ||
|
||
Similar to errors, `enum` expressions only support equality (`==`) | ||
and inequality (`!=`) comparisons. Expressions whose type is an `enum` | ||
Similar to errors, `enum` expressions without a specified underlying type only support equality (`==`) | ||
and inequality (`!=`) comparisons. Expressions whose type is an `enum` without a specified underlying type | ||
cannot be cast to or from any other type. | ||
|
||
Note that if an `enum` value appears in the control-plane API, the | ||
An `enum` may also specify an underlying type, such as the following: | ||
|
||
~ Begin P4Example | ||
enum bit<8> E { | ||
e1 = 0, | ||
e2 = 1, | ||
e3 = 2 | ||
} | ||
~ End P4Example | ||
|
||
Note that each symbolic value in the enum must map to a unique fixed-width integer | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did we decide to keep the "unique" requirement? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can case names be used while defining the enum? This is particularly useful if aliases are allowed: enum bit<32> Authors {
Clemens = 0,
Twain = Clemens // is Clemens defined here? is Authors.Clemens?
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not clear, a declaration is available only after it has been "completed". Not clear whether the Clemens declaration ends with the comma or with the closing brace. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd argue that since it's useful, it should be allowed. But there should be some language to say one way or another. |
||
representation, for example the `enum` in the following example would result in | ||
error: | ||
|
||
~ Begin P4Example | ||
enum bit<8> Bad { | ||
b1 = 0, | ||
b2 = 1, // Error: b2 and b3 both map to 1 | ||
b3 = 1, // one must be changed. | ||
b4 = 2 | ||
} | ||
~ End P4Example | ||
|
||
An `enum` with an underlying type also supports explicit casts to and from the | ||
underlying type. For instance, the following code: | ||
|
||
~ Begin P4Example | ||
bit<8> x; | ||
E a = E.e2; | ||
E b; | ||
|
||
x = (bit<8>) a; // sets x to 1 | ||
b = (E) x; // sets b to E.e2 | ||
~ End P4Example | ||
|
||
casts `a`, which was initialized to `E.e2` to a `bit<8>`, using the specified | ||
fixed-width integer representation for `E.e2`, `1`. The variable `b` is then set to the | ||
symbolic value `E.e2`, which corresponds to the fixed-width integer value `1`. | ||
|
||
Note that while it is always safe to cast from an `enum` to its fixed-width integer type, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe "safe" is not the right word here; with our current definition both casts are actually "safe", but the second one does not create a "named" value. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
it may not be the case that every fixed-width integer value specified by the underlying type | ||
has a related symbol in the `enum`. For instance, the following code: | ||
|
||
~ Begin P4Example | ||
bit<8> x = 5; | ||
E e = (E) x; // sets e to an unnamed value | ||
~ End P4Example | ||
|
||
sets `e` to an unnamed value, since there is no symbol corresponding to the | ||
fixed-width integer value `5`. While the value of `e` is unspecified, it must not | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would say here that the value of e is unnamed. It certainly is not unspecified, in fact it is 5. The phrase about "must not correspond" is no longer necessary. |
||
correspond to any symbol defined within enum `E`. This is because casting from | ||
a fixed-width integer type to an `enum` type is required to obey the following semantics: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I know you have the later condition that (bit)(E)x == x for all possible I mention this simply because uninitialized values in P4_16 are, I believe, "more unknown" than the value of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What would you refer to this value as, if not unknown? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The phrase starting with "This is because" is probably unnecessary now. |
||
|
||
Given an enum type $E$ with underlying type `bit<W>` and the set of mappings | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These are no longer requirements, they are consequences of the definition. They will always hold. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I removed all of the language around this. |
||
of $(s, n)$ where $S$ is the set of symbols ($s$) and $N$ is the set of fixed-width integer | ||
representations ($n$), the following properties must hold: | ||
|
||
1. if an integer $x \notin N$ then $\forall e \in S$, `(E)x != e` | ||
1. $\forall x \in [0, 2^W]$, `(bit<W>)(E)x == x` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don' t know why you want this property to hold. Who cares about it? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean to say the property holds by definition of an enum with underlying type and doesn't need to be spelled out explicitly? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @mbudiu-vmw If we are going to the trouble of specifying an underlying type that is More directly related to your question, if implementations guaranteed the property, it certainly makes explaining exactly how an enum with an underlying type works -- it is really a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If that is the model, then we should say so explicitly in the spec. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was requested in the issue #611 related to this and I basically pulled the text from there (with a lite-rewording & turning some text in math). Where do we stand on this? Personally, I don't feel strongly about it one way or the other. |
||
1. $\forall e \in S$, `(E)(bit<W>)e == e` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So if I roughly translate this into English, if an enum type with a backing type Is the comment in the last 'else' branch below correct, given this proposal?
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that's the idea. Although a more realistic scenario would probably look like this:
I think it's important to have the guarantee that if the etherType extracted from the packet is not one of All of the guarantees listed here are achieved by a "trivial" implementation where an enum with an underlying type is completely taken care of at compile time and doesn't exist at runtime. I think that's what makes the enum useful: it makes it more convenient to write programs and generate runtime APIs, and it doesn't have any packet-processing runtime cost. That being said I'm fine with introducing extern methods to safely cast enums or to check if a value is valid, as @jamescoole-cisco suggested, as long as they are optional when it comes to actually using enums in a P4 program. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should one (or both) of these examples be added as explanation to the math? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's not obvious that the first example is unique to enums with backing types. If uninitialized enum variables are intended to be "undefined" only within valid entries of the enum, it might be good to state that explicitly. However, if that's the case, (compiler) implementations will already need to initialize non-backed enum variable storage in most cases, so why not require the same for enums with backing types? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jamescoole-cisco I actually wrote a change to the P4_16 language spec, committed after v1.0.0 was published, that says something about exactly this case, but had forgotten I wrote that part. See #611 (comment) for details. Unless that text is changed further in the spec, P4_16 compilers are already free to make uninitialized enum and error values unequal to any of their named members. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given my previous comment, I am not against having example code like these in the spec, but if so, they should be clear that the examples apply for enum's with or without underlying types. I think having no code examples would also be fine, but adding a sentence that mentions that all values with type enum can be not-equal to any of their named members, with a cross-reference to the section "## Operations on headers { #sec-ops-on-hdrs }" (Madoko syntax copied from the .mdk file) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree it would be good to copy or move the discussion of how undefined both types of enums can be into section, since it applies in situations besides header fields. But IDK if the example is necessary to get the point across. |
||
|
||
For example, in the following code, the `else` clause of the `if/else if/else` | ||
block can be reached even though the matches on `x` are complete with respect | ||
to the symbols defined in `MyPartialEnum_t`: | ||
|
||
~ Begin P4Example | ||
enum bit<2> MyPartialEnum_t { | ||
VALUE_A = 2w0, | ||
VALUE_B = 2w1, | ||
VALUE_C = 2w2 | ||
} | ||
|
||
bit<2> y = < some value >; | ||
MyPartialEnum_t x = (MyPartialEnum_t)y; | ||
|
||
if (x == MyPartialEnum_t.VALUE_A) { | ||
// some code here | ||
} else if (x == MyPartialEnum_t.VALUE_B) { | ||
// some code here | ||
} else if (x == MyPartialEnum_t.VALUE_C) { | ||
// some code here | ||
} else { | ||
// A P4 compiler MUST ASSUME that this branch can be executed | ||
// some code here | ||
} | ||
~ End P4Example | ||
|
||
Additionally, if an enumeration is used as a field of a header, we would expect | ||
the `transition select` to match `default` when the parsed integer value does | ||
not match one of the symbolic values of `EtherType` in the following example: | ||
|
||
~ Begin P4Example | ||
enum bit<16> EtherType { | ||
VLAN = 0x8100, | ||
IPV4 = 0x0800, | ||
IPV6 = 0x86dd | ||
} | ||
|
||
header ethernet { | ||
... | ||
EtherType etherType; | ||
... | ||
} | ||
|
||
parser my_parser(...) { | ||
state parse_ethernet { | ||
packet.extract(hdr.ethernet); | ||
transition select(hdr.ethernet.etherType) { | ||
EtherType.VLAN : parse_vlan; | ||
EtherType.IPV4 : parse_ipv4; | ||
EtherType.IPV6: parse_ipv6; | ||
default: reject; | ||
} | ||
} | ||
~ End P4Example | ||
|
||
Any `enum` that contains an unspecifed value, whether as the result of a cast | ||
to an `enum` with an underlying type, parse into the field of an `enum` with an | ||
underlying type, or simply the declaration of any `enum` without a specified | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this explanation can now be also simplified by reducing it to the case of bitstrings. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure what you mean by this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These should be changed to "unnamed" as well. This whole paragraph is really unnecessary.
|
||
initial value might not be equal to any of the values defined for that | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually I think that "might" is too weak: it "will not" be equal for sure. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the original intent of the sentence is that a variable of an enum type that contains at least one unnamed value might not be equal to any of the named values. We could strengthen the sentence to "When a variable of an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just changed "might" to "will", I think that is accurate. |
||
type. Such an unspecified value should still lead to predictable | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. unspecified -> unnamed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
behavior in cases where any legal value would match, e.g. it should | ||
match in any of these situations: | ||
|
||
- If used in a `select` expression, it should match `default` or `_` | ||
in a key set expression. | ||
- If used as a key with `match_kind` `ternary` in a table, it should | ||
match a table entry where the field has all bit positions "don't | ||
care". | ||
- If used as a key with `match_kind` `lpm` in a table, it should match | ||
a table entry where the field has a prefix length of 0. | ||
|
||
Note that if an `enum` value lacking an underlying type appears in the control-plane API, the | ||
compiler must select a suitable serialization data type and | ||
representation. | ||
For `enum` values with an underlying type and representations, the compiler should | ||
use the specified underlying type as the serialization data type and | ||
representation. | ||
|
||
In addition to these operations a P4 implementation may choose to implement | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't know if we need to put this into the spec; it can be in the architecture definition file of the targets which do implement such functions. Also, "is_valid" should be probably called "is_named". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I went ahead and removed these, I agree having them as part of an architecture description is probably right. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not sure why I cannot separately reply to the question about matching against raw constants, but I would say this:
|
||
externs for explicitly checking that a fixed-width integer value corresponds to | ||
a symbolic representation within an `enum` or provide a checked conversion | ||
extern. | ||
|
||
~ Begin P4Example | ||
extern bool is_valid<Enum, Int>(Int x); | ||
extern Enum convert<Enum, Int>(Int x, Enum default); | ||
~ End P4Example | ||
|
||
The `is_valid` extern returns `true` for integer values that correspond to | ||
symbols in the supplied `Enum` type parameter. The `convert` function returns | ||
either the symbolic value associated with `x` in `Enum` when one exists, or the | ||
provided `default`. | ||
|
||
## Expressions on Booleans { #sec-bool-exprs } | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make this even more specific:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You noted this here, but then noted later that the
typeRef
here worked, do you have a preference?