First draft of proposal to allow emit and extract on arbitrary structs #736

Closed
115 changes: 85 additions & 30 deletions p4-16/spec/P4-16-spec.mdk
@@ -323,7 +323,7 @@ Assuming a fixed cost for table lookup operations and interactions
with extern objects, all P4 programs (i.e., parsers and controls)
execute a constant number of operations for each byte of an input
packet received and analyzed. Although parsers may contain loops,
-provided some header is extracted on each cycle, the packet itself
+provided some data is extracted on each cycle, the packet itself
provides a bound on the total execution of the parser. In other words,
under these assumptions, the computational complexity of a P4 program
is linear in the total size of all headers, and never depends on the
@@ -2564,7 +2564,7 @@ on a packet:

~ Begin P4Example
extern packet_out {
-    void emit<T>(in T hdr);
+    void emit<T>(in T x);
}
control d(packet_out b, in Hdr h) {
    apply {
@@ -4877,7 +4877,7 @@ a `parser` instantiation.

~ Begin P4Example
extern packet_in {
-    void extract<T>(out T headerLvalue);
+    void extract<T>(out T Lvalue);
    void extract<T>(out T variableSizeHeader, in bit<32> varFieldSizeBits);
    T lookahead<T>();
    bit<32> length();  // This method may be unavailable in some architectures
@@ -4888,13 +4888,15 @@ extern packet_in {
To extract data from a packet represented by an argument `b` with
type `packet_in`, a parser invokes the `extract` methods of `b`.
There are two variants of the `extract` method: a one-argument
-variant for extracting fixed-size headers, and a two-argument variant
+variant for extracting structs or fixed-size headers, and a two-argument variant
for extracting variable-sized headers. Because these operations can
cause runtime verification failures (see below), these methods can
only be executed within parsers.

-When extracting data into a bit-string or integer, the first packet
-bit is extracted to the most significant bit of the integer.
+When extracting data into a bit-string or integer field of a header,
Contributor

The struct itself won't tell you whether the extract has failed - it has no valid bit.

Collaborator Author

For the use cases in mind, I don't think that is an issue.

+the first packet bit is extracted to the most significant bit of the
+bit-string or integer. When extracting data into a struct, the order
+of bits is completely target-dependent.
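
The bit-order rule for header fields can be illustrated with a small Python model. This is only a sketch: the helper `extract_bits` and the sample bytes are made up for illustration and are not part of the spec or of any P4 target.

```python
def extract_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Read `width` bits starting at `bit_offset`.

    The first packet bit read becomes the most significant bit of the result.
    """
    value = 0
    for i in range(width):
        position = bit_offset + i
        byte = data[position // 8]
        bit = (byte >> (7 - position % 8)) & 1  # bit 0 of the packet is the MSB of byte 0
        value = (value << 1) | bit
    return value


packet = bytes([0xAB, 0xCD])          # bits: 1010 1011 1100 1101
first = extract_bits(packet, 0, 4)    # a hypothetical 4-bit header field
second = extract_bits(packet, 4, 12)  # a hypothetical 12-bit header field
```

No comparable model can be written for a struct, since the proposed text leaves the struct bit order entirely to the target.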

Some targets may perform cut-through packet processing, i.e., they may
start processing a packet before its length is known (i.e., before all
@@ -4923,19 +4925,19 @@ packet_in {

### Fixed width extraction { #sec-packet-extract-one }

-The single-argument `extract` method handles fixed-width headers,
+The single-argument `extract` method handles structs or fixed-width headers,
and is declared in P4 as follows:

~ Begin P4Example
-void extract<T>(out T headerLeftValue);
+void extract<T>(out T leftValue);
~ End P4Example

-The expression `headerLeftValue` must evaluate to a l-value (see
-Section [#sec-lvalues]) of type `header` with a fixed width. If
-this method executes successfully, on completion the `headerLvalue`
-is filled with data from the packet and its validity bit is set to `true`. This
+The expression `leftValue` must evaluate to an l-value (see
+Section [#sec-lvalues]) of type `header` with a fixed width, or of type `struct`. If
Contributor

A struct can contain any number of nested headers, each of which may have a varbit field.
This is becoming tricky.

Contributor

I guess my comment was wrong - it says here it has to have fixed width.

Collaborator Author

It says elsewhere explicitly that the bit format need not be fixed width.

I can make that more explicit here if you thought the English 'fixed width' modifier extends to cover the struct. By inserting a comma, I was vainly hoping to separate the two explicitly.

+this method executes successfully, on completion the `leftValue`
+is filled with data from the packet, and if it is of type `header` then its validity bit is set to `true`. This
method may fail in various ways---e.g., if there are not
-enough bits left in the packet to fill the specified header.
+enough bits left in the packet to fill the specified `leftValue`.

For example, the following program fragment extracts an Ethernet header:

@@ -4949,7 +4951,7 @@ parser P(packet_in b, out Result r) {
~ End P4Example

In terms of the `ParserModel`, the semantics of the
-single-argument `extract` is given in terms of the following
+single-argument `extract` on a header type is given in terms of the following
pseudo-code method, using data from the `packet` class defined
above. We use the special `valid$` identifier to indicate the
hidden valid bit of a header, `isNext$` to indicate that the
@@ -4971,6 +4973,40 @@ void packet_in.extract<T>(out T headerLValue) {
}
~ End P4Pseudo
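
As a rough illustration of the header case, the following Python model follows the prose above: check that enough bits remain, fill the fields in declaration order, set the validity bit, and advance the cursor. The class `PacketIn`, the `layout` argument, and the `ETHERNET` field list are hypothetical names chosen for this sketch, and header stacks are not modeled.

```python
class PacketTooShort(Exception):
    """Stands in for error.PacketTooShort in this model."""


class PacketIn:
    def __init__(self, data: bytes):
        self.data = data
        self.length_in_bits = 8 * len(data)
        self.next_bit_index = 0

    def _bits(self, offset: int, width: int) -> int:
        # Treat the whole packet as one big-endian integer and slice it.
        whole = int.from_bytes(self.data, "big")
        shift = self.length_in_bits - offset - width
        return (whole >> shift) & ((1 << width) - 1)

    def extract(self, layout):
        """layout: list of (field_name, width_in_bits) for a fixed-width header."""
        bits_to_extract = sum(width for _, width in layout)
        last_bit_needed = self.next_bit_index + bits_to_extract
        if self.length_in_bits < last_bit_needed:
            raise PacketTooShort()
        header = {}
        offset = self.next_bit_index
        for name, width in layout:
            header[name] = self._bits(offset, width)
            offset += width
        header["valid"] = True  # models the hidden validity bit
        self.next_bit_index = last_bit_needed
        return header


ETHERNET = [("dst", 48), ("src", 48), ("ether_type", 16)]
pkt = PacketIn(bytes(range(14)))  # 14 bytes: exactly one Ethernet header
eth = pkt.extract(ETHERNET)
```

A second `pkt.extract(ETHERNET)` on this 14-byte packet raises `PacketTooShort` and leaves the cursor where it was.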

+The semantics of the single-argument `extract` method on a struct type
+is given below. This use of `extract` is only required to produce
+predictable results if it is performed at the same offset from the
+beginning of the packet that the same target device earlier performed
+an `emit` method on a struct with the same type name. In this case,
+the resulting value of `structLValue` will be equal to the original
+struct value that was emitted, according to the `==` operator. If such
+an `extract` operation is done in any other situation, the resulting
+value of `structLValue` is unspecified.
+
+The length in bits of the data consumed by such an `extract` operation
Contributor

I don't like this.
Packets are expected to be exchanged between machines. What's the point of emitting a packet if you don't know how it will look? If we want extract/emit for internal packets only we should use a different extern.

Collaborator Author

Do you mean a different extern object that is neither packet_in nor packet_out?

Or do you mean a different method defined on those objects, with different names that are neither emit nor extract? @hanw Do you care what the method names are here? e.g. what if it were pkt.serialize(my_struct) instead of pkt.emit(my_struct), and pkt.deserialize(my_struct) instead of pkt.extract(my_struct)?

Contributor

Extending packet_in and packet_out with serialize() and deserialize() methods sounds like a good idea to me.

Contributor

No, it should be a different extern.
You can call it internal_buffer.
It can have an API which is the union of packet_in and packet_out.

Contributor

While re-reading the P4-16 spec last week, a question came to mind: why do we need the packet_in and packet_out externs at all? Why don't we define emit() and extract() as functions that operate on headers, as P4-14 did?

For instance, it seems I could rewrite a P4-16 program that uses the packet_in extern to use a simple extract function instead.

parser p (packet_in pkt, header hdr) {
  state start {
     pkt.extract(hdr.ethernet);
  }
}

// can be rewritten as
parser p (header hdr) {
   state start {
      extract(hdr.ethernet);
   }
}

What does passing packet_in as a direction-less extern to the parser buy us?

Collaborator Author

Having a separate extern may conflict with @hanw's goals, but I will let him speak to that. Then again, it may simplify things somewhat, since then all deserialized structs can go to "the same place" or "next to each other" in an implementation, yet "separate from" or "away from" the packet contents created via emit(my_header) calls.

Collaborator Author

Han, to maybe clarify my previous comment, if there were a separate extern object on which serialize/deserialize calls were made, vs. on the packet_in/out objects, then there is no need to require an implementation to deal with deserialized struct data "between two emitted headers".

Contributor

Explicit is better than implicit.
The language is simpler: extract is not a keyword, it's a method.
Also, you can possibly handle multiple packet_in objects if your architecture exposes them.

Contributor

Regarding the multiple packet_in objects in the same parser idea, if packet_in is used to represent a stream of bytes that belong to the same packet, then multiple packet_in objects means multiple streams of bytes that belong to different packets. Essentially, it represents a parser block that is connected to multiple ports.

How do we implement the parse state machine, if there is only one start state, and how do we implement the arbitration logic between different ports?

Contributor

For packet_out there is no issue.
For packet_in the architecture should provide information about the packet state, and it should specify the rules for invoking a parser (or some other block that has packet_in arguments). You could for example use the length of a packet to figure out whether it's present or not.

+is not only target-dependent, but even for the same target and the
+same struct name, the number of bits could vary in length across
+different calls to `emit` and `extract`. For example, if the struct
+contained a field with type `header_union`, an implementation may find
+it advantageous to use a variable-length encoding. Also, a target may
+choose to implement `emit` on a struct by first generating some
+variable-length sequence of padding bits, so that later struct fields
+start on a multiple of 64 bits from the beginning of the packet, for
+target-specific efficiency reasons. The P4 programmer must not rely
+on anything about the bit level encoding of a struct other than what
+is specified above.
+
+~ Begin P4Pseudo
+void packet_in.extract<T>(out T structLValue) {
+   bitsToExtract = sizeofInBits(structLValue);  // target-specific size
+   lastBitNeeded = this.nextBitIndex + bitsToExtract;
+   ParserModel.verify(this.lengthInBits >= lastBitNeeded, error.PacketTooShort);
+   // The format of data extracted into a struct is target-specific
+   structLValue = this.data.extractBits(this.nextBitIndex, bitsToExtract);
+   this.nextBitIndex += bitsToExtract;
+}
+~ End P4Pseudo
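
To make the round-trip guarantee concrete, here is a Python sketch of one imaginary target. Everything about the encoding is an assumption invented for illustration: the `FIELDS` layout, the 64-bit alignment padding, and the reversed field order. The only property taken from the proposed text is that `extract` at the same offset recovers a value equal to the one that was emitted.

```python
FIELDS = [("port", 9), ("drop", 1), ("queue", 5)]  # hypothetical struct: (name, width in bits)


def target_emit_struct(packet_bits: str, value: dict) -> str:
    """Append `value` to the packet in this made-up target's private format."""
    pad = (-len(packet_bits)) % 64  # pad so the struct body starts 64-bit aligned
    body = "".join(
        format(value[name], f"0{width}b")
        for name, width in reversed(FIELDS)  # deliberately not declaration order
    )
    return packet_bits + "0" * pad + body


def target_extract_struct(packet_bits: str, offset: int):
    """Inverse of target_emit_struct; returns (value, offset_after_struct)."""
    pos = offset + (-offset) % 64  # skip the same padding emit inserted
    value = {}
    for name, width in reversed(FIELDS):
        value[name] = int(packet_bits[pos:pos + width], 2)
        pos += width
    return value, pos


original = {"port": 300, "drop": 1, "queue": 17}
prefix = "1" * 20  # bits already in the packet before the struct
packet = target_emit_struct(prefix, original)
recovered, end = target_extract_struct(packet, len(prefix))
consumed = end - len(prefix)  # depends on the offset, so it varies between calls
```

In this model the struct occupies 59 bits when emitted at offset 20 but only 15 bits at offset 0, which is the kind of variation the paragraph above permits.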

### Variable width extraction { #sec-packet-extract-two }

The two-argument `extract` handles variable-width headers, and is declared in P4 as follows:
@@ -5070,7 +5106,7 @@ as follows,
b.lookahead<T>()
~ End P4Example

-where `T` must be a type with fixed width. In case of success the
+where `T` must be a `header` type with fixed width, or a `struct` type. In case of success the
result of the evaluation of `lookahead` returns a value of type `T`.

In terms of the `ParserModel`, the semantics of `lookahead` is
@@ -5081,6 +5117,7 @@ T packet_in.lookahead<T>() {
   bitsToExtract = sizeof(T);
   lastBitNeeded = this.nextBitIndex + bitsToExtract;
   ParserModel.verify(this.lengthInBits >= lastBitNeeded, error.PacketTooShort);
+   // The format of data looked ahead when returning a struct is target-specific
   T tmp = this.data.extractBits(this.nextBitIndex, bitsToExtract);
   return tmp;
}
@@ -5291,7 +5328,7 @@ P4 Runtime specification.
# Control blocks { #sec-control }

P4 parsers are responsible for extracting bits from a packet into
-headers. These headers (and other metadata) can be manipulated and transformed within `control`
+headers and/or structs. These (and other metadata) can be manipulated and transformed within `control`
blocks. The body of a control block
resembles a traditional imperative program. Within the body of a control block,
match-action units can be invoked to perform data
@@ -6134,8 +6171,25 @@ header, header stack, `struct`, or header union to the output packet.
the packet if it is valid and otherwise behaves like a no-op.
- When applied to a header stack, `emit` recursively invokes itself to
each element of the stack.
-- When applied to a `struct` or header union, `emit` recursively
-  invokes itself to each field.
+- When applied to a header union, `emit` recursively invokes itself to
+  each field.
+- When applied to a struct, `emit` appends the data of the entire
+  struct to the packet in a target-specific format. There is no
+  requirement that fields be emitted in the order they appear in the
+  struct definition. The target is allowed to add padding. The struct
+  may contain member fields of any types allowed in a struct. See
+  Section [#sec-type-nesting] for a complete list. If the struct
+  contains headers, the format in which those nested headers are
+  emitted need not conform to the rules above when performing an
+  `emit` operation directly on a header.
+
+The only requirements on the data format output by emitting a struct
+are that if the same target device later does an `extract` operation
+on the resulting packet, starting at the same offset within the packet
+at which the `emit` was done, on a variable with the same struct type
+name on which the `emit` was done, the resulting value of that
+variable after the `extract` operation is equal to the original
+emitted struct value, according to the `==` operator.

It is illegal to invoke `emit` on an expression whose type is a
base type, `enum`, or `error`.
@@ -6152,30 +6206,32 @@ packet_out {
      this.lengthInBits = 0;
   }
   /// Append data to the packet. Type T must be a header, header
-   /// stack, header union, or struct formed recursively from those types
+   /// stack, header union, or struct
   void emit<T>(T data) {
      if (isHeader(T))
         if (data.valid$) {
-            this.data.append(data);
+            this.data.append(data);  // in target-independent format
            this.lengthInBits += data.lengthInBits;
         }
      else if (isHeaderStack(T))
         for (e : data)
            emit(e);
-      else if (isHeaderUnion(T) || isStruct(T))
+      else if (isHeaderUnion(T))
         for (f : data.fields$)
            emit(data.f)
+      else if (isStruct(T)) {
+         this.data.append(data);  // in target-specific format
+         this.lengthInBits += data.lengthInBits;
+      }
      // Other cases for T are illegal
   }
~ End P4Pseudo
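
The dispatch in the proposed `emit` can be mirrored in a short Python sketch. The classes `Header`, `HeaderUnion`, and `Struct`, and the `encode_struct` callback that stands in for the target's private struct format, are all assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Header:
    bits: str           # wire encoding of the header's fields
    valid: bool = True  # models the hidden valid$ bit


@dataclass
class HeaderUnion:
    members: list       # Header values; at most one is expected to be valid


@dataclass
class Struct:
    fields: dict


def emit(out: list, data, encode_struct) -> None:
    """Append `data` to `out` following the proposed per-type rules."""
    if isinstance(data, Header):
        if data.valid:                   # invalid headers are skipped
            out.append(data.bits)
    elif isinstance(data, list):         # header stack: emit each element
        for element in data:
            emit(out, element, encode_struct)
    elif isinstance(data, HeaderUnion):  # header union: emit each member
        for member in data.members:
            emit(out, member, encode_struct)
    elif isinstance(data, Struct):       # struct: one opaque, target-specific blob
        out.append(encode_struct(data))
    else:
        raise TypeError("emit is illegal on base types, enum, and error")


def encode(s: Struct) -> str:
    return "<target-specific>"           # placeholder for a real target encoding


out = []
emit(out, [Header("1111"), Header("0000", valid=False)], encode)
emit(out, HeaderUnion([Header("0001", valid=False), Header("0010")]), encode)
emit(out, Struct({"port": 3}), encode)
```

Note that under the proposal a header nested inside a struct is no longer emitted by the header rule; it becomes part of the opaque blob produced for the struct.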

Here we use the special `valid$` identifier to indicate the hidden
valid bit of headers and `fields$` to indicate the list of fields
-for a struct or header union. We also use standard `for` notation to
+for a header union. We also use standard `for` notation to
iterate through the elements of a stack `(e : data)` and list of
-fields for header unions and structs `(f : data.fields$)`. The
-iteration order for a struct is the order those fields appear in the
-type declaration.
+fields for header unions `(f : data.fields$)`.

# Architecture description { #sec-arch-desc }

@@ -7109,11 +7165,11 @@ error {
    ParserTimeout  /// Parser execution time limit exceeded.
}
extern packet_in {
-    /// Read a header from the packet into a fixed-sized header @hdr
+    /// Read from the packet into a fixed-sized header, or struct, @x,
    /// and advance the cursor.
    /// May trigger error PacketTooShort or StackOutOfBounds.
-    /// @T must be a fixed-size header type
-    void extract<T>(out T hdr);
+    /// @T must be a fixed-size header type or struct type
+    void extract<T>(out T x);
    /// Read bits from the packet into a variable-sized header @variableSizeHeader
    /// and advance the cursor.
    /// @T must be a header containing exactly 1 varbit field.
@@ -7133,8 +7189,7 @@ extern packet_in {
extern packet_out {
    /// Write @data into the output packet, skipping invalid headers
    /// and advancing the cursor
-    /// @T can be a header type, a header stack, a header_union, or a struct
-    /// containing fields with such types.
+    /// @T can be a header type, a header stack, a header_union, or a struct.
    void emit<T>(in T data);
}
action NoAction() {}