-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First draft of proposal to allow emit and extract on arbitrary structs #736
First draft of proposal to allow emit and extract on arbitrary structs #736
Conversation
I fully expect this proposed PR to be controversial, and if any version of it is ultimately accepted, will probably differ in notable ways from this initial draft. @hanw In reviewing the P4_16 language spec, I see something it says that you probably had fresh in your mind, that I had forgotten, in Section 15.1: "When applied to a struct or header union, emit recursively invokes itself to each field." If we make changes like we have been discussing, I think that "struct or" should be removed from that line of the spec, and the pseudocode for Sorry if my forgetting those parts of the spec were leading to confusion in our conversation. |
My initial draft is explicitly not backwards compatible with earlier P4_16 language specs in the following way: Earlier specs allowed you to define a struct that contained only member fields that were of type header, header stack, and/or header_union, and you could perform an emit operation, and the resulting bits appended to the packet would be in a format that was the same as if one did an emit on each of those fields, in the order the struct members appeared in the source code definition of the struct. This draft PR changes that, and explicitly allows a target device to emit a target-specific sequence of bits in this case. The up side is that it also allows structs to be emitted that contain arbitrary member field types. We could change this draft to "grandfather in" the former behavior of emitting a struct for the special case that was allowed in earlier versions, too. It seems that given how controversial this draft is already, it is worth considering the option of not preserving that behavior. |
inside of a struct definition.
The text needs to discuss how to support other methods in the |
I agree that What do you have in mind for changing This proposal is completely consistent with the existing proposal for sizeInBits() and sizeInBytes(), because that proposal is only defined for headers, not structs. :-) That proposal could be extended to say that sizeInBits() and sizeInBytes() are also defined on structs with a target-dependent return value, but do you see a need for such an operation?
|
If I am not sure if I need |
I agree that Is there anything in the P4_16 language spec that makes you confident you know what a target should return when you call the I can imagine defining |
Here is another reason not to define Suppose we did, and somebody thought about trying to There is nothing in this proposal that says that the length in bits of a For example, suppose someone wrote a P4_16 program that emitted this sequence of things:
Suppose you wanted to write the target-dependent format of s1_t so that aligned some of its internal fields on 32-bit boundaries. How would you do that with a constant length representation, if you did not know in advance which of h1 and h2 were valid? Suppose all 4 combinations of h1, h2 valid/invalid were possible. I could imagine perhaps implementing emit so that it first spit out 0, 8, 16, or 24 bits of garbage bits so that it was on a 32-bit aligned boundary, and then copied a target-dependent format for s1_t. There is no way with such an implementation of emit to get |
Note that the current proposed changes to the spec do allow emit and extract on a struct to do variable-length encodings of a struct. Why? Because all they say is that if you do extract on a struct, starting at an offset of the packet from which an emit was earlier done, it will work. It explicitly says that any other attempt to extract a struct will leave the struct variable with unspecified contents, i.e. garbage. A target like the one I describe above could implement extract on a struct by looking at the current offset, advancing to the next multiple of 32 bits, then copy out the struct contents in target-specific format, i.e. behave exactly as emit on that struct did, starting from the same offset from the beginning of the packet. |
Committed a 2-line change enabling packet_in.lookahead to return a struct as well as a fixed-length header. |
I took a stab at adding some text explicitly allowing targets to use variable-length encodings of structs for emit and extract. Feel free to comment if you think that is unreasonable, but for the example I mentioned in an earlier comment, I have a hard time believing that in such situations that all implementations would want to restrict themselves to constant-length bit representations, even for structs where it "looks like" all of the fields have constant length, because of alignment reasons (one of the reasons given for not wanting to use a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what problem this is solving.
for extracting variable-sized headers. Because these operations can | ||
cause runtime verification failures (see below), these methods can | ||
only be executed within parsers. | ||
|
||
When extracting data into a bit-string or integer, the first packet | ||
bit is extracted to the most significant bit of the integer. | ||
When extracting data into a bit-string or integer field of a header, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The struct itself won't tell you whether the extract has failed - it has no valid bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the use cases in mind, I don't think that is an issue.
this method executes successfully, on completion the `headerLvalue` | ||
is filled with data from the packet and its validity bit is set to `true`. This | ||
The expression `leftValue` must evaluate to a l-value (see | ||
Section [#sec-lvalues]) of type `header` with a fixed width, or of type `struct`. If |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A struct can contain any number of nested headers, each of which has may have a varbit.
This is becoming tricky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my comment was wrong - it says here it has to have fixed width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It says elsewhere explicitly that the bit format need not be fixed width.
I can make that more explicit here if you thought the English 'fixed with' modifier extends to cover the struct. By the insertion of a comma, I was vainly hoping to explicitly separate that.
an `extract` operation is done in any other situation, the resulting | ||
value of `structLValue` is unspecified. | ||
|
||
The length in bits of the data consumed by such an `extract` operation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this.
Packets are expected to be exchanged between machines. What's the point of emitting a packet if you don't know how it will look? If we want extract/emit for internal packets only we should use a different extern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean a different extern object that is neither packet_in nor packet_out?
Or do you mean a different method defined on those objects, with different names that are neither emit nor extract? @hanw Do you care what the method names are here? e.g. what if it were pkt.serialize(my_struct) instead of pkt.emit(my_struct), and pkt.deserialize(my_struct) instead of pkt.extract(my_struct)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extending packet_in and packet_out with serialize() and deserialize() methods sounds like a good idea to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it should be a different extern.
You can call it internal_buffer.
It can have an API which is the union of packet_in and packet_out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I was re-reading the P4-16 spec again last week, a question came to mind: why do we need packet_in and packet_out extern at all? Why don't we define emit() and extract() to be function that operates on header, as P4-14 did?
For instance, it seems I could rewrite a P4-16 program with packet_in extern with a simple extract method.
parser p (packet_in pkt, header hdr) {
state start {
pkt.extract(hdr.ethernet);
}
}
// can be rewritten as
parser p (header hdr) {
state start {
extract(hdr.ethernet);
}
}
What does pass packet_in as a direction-less extern to parser buy us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a separate extern may conflict with @hanw's goals, but I will let him speak to that. Then again, it may simplify things somewhat, since then all deserialized structs can go to "the same place" or "next to each other" in an implementation, yet "separate from" or "away from" the packet contents created via emit(my_header) calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Han, to maybe clarify my previous comment, if there were a separate extern object on which serialize/deserialize calls were made, vs. on the packet_in/out objects, then there is no need to require an implementation to deal with deserialized struct data "between two emitted headers".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explicit is better than implicit.
The language is simpler, extern is not a keyword, it's a method.
Also, you can possibly handle multiple packet_in objects if your architecture exposes them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the multiple packet_in objects in the same parser idea, if packet_in is used to represent a stream of bytes that belong to the same packet, then multiple packet_in objects means multiple streams of bytes that belong to different packets. Essentially, it represents a parser block that is connected to multiple ports.
How do we implement the parse state machine, if there is only one start state, and how do we implement the arbitration logic between different ports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For packet_out there is no issue.
For packet_in the architecture should provide information about the packet state, and it should specify the rules for invoking a parser (or some other block that has packet_in arguments). You could for example use the length of a packet to figure out whether it's present or not.
The motivation for writing this up started when @hanw was reviewing the proposal for sizeInBits/Bytes, and asking whether we could have the implementation of it delayed to the compiler back end, because he was considering creating a variant of a The use case, as far as I understand it, is only to pass data from one pipeline to another within a single device. For example, from ingress to egress pipeline. But it could also be used for recirculated, resubmitted, and cloned packets, potentially. In these situations, a target-specific bit format is perfectly acceptable. If someone sends out such structures on a physical port to another device, they deserve the trouble they have asked for. We could say so even more explicitly in the spec, if you think that the presence of the gun pointed in the direction of your foot and your finger hovering near the trigger is not yet explicit enough. :-) The implementation might be complex, yes, if there are fields of arbitrary types, nested arbitrarily, etc. The good news is, the only thing the spec needs to say about it is, effectively: emit followed by extract of the same type name of struct leaves you with an equal value to the original. |
There was a little bit of discussion of this issue at the 2019-Apr-01 LDWG meeting. There was no urgency to push this forward at this time (and perhaps not ever). Han mentioned that an alternate idea that has been used to good effect is to create a new annotation on a header definition that indicates that the target is allowed to rearrange the contents of the fields in a target-specific fashion for efficiency. |
In the interest of tidying up the set of active issues on the P4 specification repository, I'm marking this as "stalled" and closing it. Of course, we can always re-open it in the future if there is interest in resurrecting it. |
No description provided.