First draft of proposal to allow emit and extract on arbitrary structs #736

jafingerhut · 2019-02-22T01:50:19Z

No description provided.

jafingerhut · 2019-02-22T01:52:09Z

I fully expect this proposed PR to be controversial, and if any version of it is ultimately accepted, will probably differ in notable ways from this initial draft.

@hanw In reviewing the P4_16 language spec, I see something it says that you probably had fresh in your mind, that I had forgotten, in Section 15.1:

"When applied to a struct or header union, emit recursively invokes itself to each field."

If we make changes like we have been discussing, I think that "struct or" should be removed from that line of the spec, and the pseudocode for emit behavior for the case of isStruct(T) would need to be changed, perhaps to something like what is in this draft PR.

Sorry if my forgetting those parts of the spec were leading to confusion in our conversation.

jafingerhut · 2019-02-22T01:56:08Z

My initial draft is explicitly not backwards compatible with earlier P4_16 language specs in the following way:

Earlier specs allowed you to define a struct that contained only member fields that were of type header, header stack, and/or header_union, and you could perform an emit operation, and the resulting bits appended to the packet would be in a format that was the same as if one did an emit on each of those fields, in the order the struct members appeared in the source code definition of the struct.

This draft PR changes that, and explicitly allows a target device to emit a target-specific sequence of bits in this case. The up side is that it also allows structs to be emitted that contain arbitrary member field types.

We could change this draft to "grandfather in" the former behavior of emitting a struct for the special case that was allowed in earlier versions, too. It seems that given how controversial this draft is already, it is worth considering the option of not preserving that behavior.

inside of a struct definition.

hanw · 2019-02-25T23:04:03Z

The text needs to discuss how to support other methods in the packet_in extern on struct: advance, lookahead, length. In addition, the text should be consistent with the proposal for sizeInBits and sizeInBytes, and maybe clarify if using sizeInBit() with advance() for emitting struct make sense.

jafingerhut · 2019-02-25T23:17:57Z

advance need not change at all. It advances a specified number of bits.

I agree that lookahead could be extended to struct types in a pretty straightforward way consistent with the rest of this proposal. I can add that. It would of course do lookahead on the struct in whatever target-dependent format the struct is, just as extract does.

What do you have in mind for changing length? I can't think of any changes required there.

This proposal is completely consistent with the existing proposal for sizeInBits() and sizeInBytes(), because that proposal is only defined for headers, not structs. :-) That proposal could be extended to say that sizeInBits() and sizeInBytes() are also defined on structs with a target-dependent return value, but do you see a need for such an operation?

advance is one of the least useful methods in the packet_in object. Do you actually need it to skip over a struct with a target-specific size?

hanw · 2019-02-25T23:27:39Z

If length() represents the total number of bytes of a packet, then it could be target dependent in egress parser, if user emits a struct in ingress deparser for an architecture like PSA.

I am not sure if I need advance to skip over a struct with target-specific size. but if we define it, somebody is going to try it out. I could imagine doing pkt.advance(struct_inst.sizeInBytes()), which should work with today's definition for advance, but then we need to define struct_inst.sizeInBytes().

jafingerhut · 2019-02-25T23:38:25Z

I agree that length could be target-dependent if the ingress deparser emits a struct into a packet.

Is there anything in the P4_16 language spec that makes you confident you know what a target should return when you call the length method? I see almost nothing the spec says about it, except that targets need not support it. If we actually added text that said what length ought to return for existing cases, without being able to emit structs, I could imagine warning people about this new case, but what could they possibly get from the spec that they can rely on it returning a particular value today?

I can imagine defining sizeInBits() and sizeInBytes() for structs, but again, my question to you is, if nobody needs the capability, why not leave it out and avoid the headaches of defining it?

jafingerhut · 2019-02-26T00:03:37Z

Here is another reason not to define sizeInBits() for a struct, at least not yet until we think about it more, and perhaps not ever.

Suppose we did, and somebody thought about trying to advance over a target-dependent format of a struct that was emitted earlier.

There is nothing in this proposal that says that the length in bits of a struct is always the same number, on the same target, such that advance skipping over it could work correctly.

For example, suppose someone wrote a P4_16 program that emitted this sequence of things:

a header h1 with exactly 8 bits
a header h2 with exactly 8 bits
a struct with type name s1_t that required at least 100 bits of storage, regardless of target
a header h3 with exactly 40 bits

Suppose you wanted to write the target-dependent format of s1_t so that aligned some of its internal fields on 32-bit boundaries.

How would you do that with a constant length representation, if you did not know in advance which of h1 and h2 were valid? Suppose all 4 combinations of h1, h2 valid/invalid were possible.

I could imagine perhaps implementing emit so that it first spit out 0, 8, 16, or 24 bits of garbage bits so that it was on a 32-bit aligned boundary, and then copied a target-dependent format for s1_t.

There is no way with such an implementation of emit to get sizeInBits() and do an advance that skipped over s1_t's bits, and no more and no less, without also knowing the target-specific behavior of emit involving the alignment. I don't want to put anything like that in the P4_16 spec if I can avoid it. You can put it in your target's documentation if you like.

jafingerhut · 2019-02-26T00:06:50Z

Note that the current proposed changes to the spec do allow emit and extract on a struct to do variable-length encodings of a struct. Why? Because all they say is that if you do extract on a struct, starting at an offset of the packet from which an emit was earlier done, it will work. It explicitly says that any other attempt to extract a struct will leave the struct variable with unspecified contents, i.e. garbage. A target like the one I describe above could implement extract on a struct by looking at the current offset, advancing to the next multiple of 32 bits, then copy out the struct contents in target-specific format, i.e. behave exactly as emit on that struct did, starting from the same offset from the beginning of the packet.

jafingerhut · 2019-02-26T00:22:50Z

Committed a 2-line change enabling packet_in.lookahead to return a struct as well as a fixed-length header.

jafingerhut · 2019-02-27T20:43:33Z

I took a stab at adding some text explicitly allowing targets to use variable-length encodings of structs for emit and extract. Feel free to comment if you think that is unreasonable, but for the example I mentioned in an earlier comment, I have a hard time believing that in such situations that all implementations would want to restrict themselves to constant-length bit representations, even for structs where it "looks like" all of the fields have constant length, because of alignment reasons (one of the reasons given for not wanting to use a header type, that started this line of thinking).

mihaibudiu

I don't understand what problem this is solving.

mihaibudiu · 2019-03-01T18:29:39Z

p4-16/spec/P4-16-spec.mdk

 for extracting variable-sized headers. Because these operations can
 cause runtime verification failures (see below), these methods can
 only be executed within parsers.

-When extracting data into a bit-string or integer, the first packet
-bit is extracted to the most significant bit of the integer.
+When extracting data into a bit-string or integer field of a header,


The struct itself won't tell you whether the extract has failed - it has no valid bit.

For the use cases in mind, I don't think that is an issue.

mihaibudiu · 2019-03-01T18:30:16Z

p4-16/spec/P4-16-spec.mdk

-this method executes successfully, on completion the `headerLvalue`
-is filled with data from the packet and its validity bit is set to `true`. This
+The expression `leftValue` must evaluate to a l-value (see
+Section [#sec-lvalues]) of type `header` with a fixed width, or of type `struct`. If


A struct can contain any number of nested headers, each of which has may have a varbit.
This is becoming tricky.

I guess my comment was wrong - it says here it has to have fixed width.

It says elsewhere explicitly that the bit format need not be fixed width.

I can make that more explicit here if you thought the English 'fixed with' modifier extends to cover the struct. By the insertion of a comma, I was vainly hoping to explicitly separate that.

mihaibudiu · 2019-03-01T18:31:20Z

p4-16/spec/P4-16-spec.mdk

+an `extract` operation is done in any other situation, the resulting
+value of `structLValue` is unspecified.
+
+The length in bits of the data consumed by such an `extract` operation


I don't like this.
Packets are expected to be exchanged between machines. What's the point of emitting a packet if you don't know how it will look? If we want extract/emit for internal packets only we should use a different extern.

Do you mean a different extern object that is neither packet_in nor packet_out?

Or do you mean a different method defined on those objects, with different names that are neither emit nor extract? @hanw Do you care what the method names are here? e.g. what if it were pkt.serialize(my_struct) instead of pkt.emit(my_struct), and pkt.deserialize(my_struct) instead of pkt.extract(my_struct)?

Extending packet_in and packet_out with serialize() and deserialize() methods sounds like a good idea to me.

No, it should be a different extern.
You can call it internal_buffer.
It can have an API which is the union of packet_in and packet_out.

While I was re-reading the P4-16 spec again last week, a question came to mind: why do we need packet_in and packet_out extern at all? Why don't we define emit() and extract() to be function that operates on header, as P4-14 did?

For instance, it seems I could rewrite a P4-16 program with packet_in extern with a simple extract method.

parser p (packet_in pkt, header hdr) { state start { pkt.extract(hdr.ethernet); } } // can be rewritten as parser p (header hdr) { state start { extract(hdr.ethernet); } }

What does pass packet_in as a direction-less extern to parser buy us?

Having a separate extern may conflict with @hanw's goals, but I will let him speak to that. Then again, it may simplify things somewhat, since then all deserialized structs can go to "the same place" or "next to each other" in an implementation, yet "separate from" or "away from" the packet contents created via emit(my_header) calls.

Han, to maybe clarify my previous comment, if there were a separate extern object on which serialize/deserialize calls were made, vs. on the packet_in/out objects, then there is no need to require an implementation to deal with deserialized struct data "between two emitted headers".

Explicit is better than implicit.
The language is simpler, extern is not a keyword, it's a method.
Also, you can possibly handle multiple packet_in objects if your architecture exposes them.

Regarding the multiple packet_in objects in the same parser idea, if packet_in is used to represent a stream of bytes that belong to the same packet, then multiple packet_in objects means multiple streams of bytes that belong to different packets. Essentially, it represents a parser block that is connected to multiple ports.

How do we implement the parse state machine, if there is only one start state, and how do we implement the arbitration logic between different ports?

For packet_out there is no issue.
For packet_in the architecture should provide information about the packet state, and it should specify the rules for invoking a parser (or some other block that has packet_in arguments). You could for example use the length of a packet to figure out whether it's present or not.

jafingerhut · 2019-03-01T18:50:22Z

The motivation for writing this up started when @hanw was reviewing the proposal for sizeInBits/Bytes, and asking whether we could have the implementation of it delayed to the compiler back end, because he was considering creating a variant of a header type that had a target-specific padding/alignment between fields, and perhaps even different order for fields than appeared in the header definition.

The use case, as far as I understand it, is only to pass data from one pipeline to another within a single device. For example, from ingress to egress pipeline. But it could also be used for recirculated, resubmitted, and cloned packets, potentially.

In these situations, a target-specific bit format is perfectly acceptable.

If someone sends out such structures on a physical port to another device, they deserve the trouble they have asked for. We could say so even more explicitly in the spec, if you think that the presence of the gun pointed in the direction of your foot and your finger hovering near the trigger is not yet explicit enough. :-)

The implementation might be complex, yes, if there are fields of arbitrary types, nested arbitrarily, etc. The good news is, the only thing the spec needs to say about it is, effectively: emit followed by extract of the same type name of struct leaves you with an equal value to the original.

jafingerhut · 2019-04-02T11:22:00Z

There was a little bit of discussion of this issue at the 2019-Apr-01 LDWG meeting. There was no urgency to push this forward at this time (and perhaps not ever). Han mentioned that an alternate idea that has been used to good effect is to create a new annotation on a header definition that indicates that the target is allowed to rearrange the contents of the fields in a target-specific fashion for efficiency.

jnfoster · 2023-11-11T13:49:36Z

In the interest of tidying up the set of active issues on the P4 specification repository, I'm marking this as "stalled" and closing it. Of course, we can always re-open it in the future if there is interest in resurrecting it.

First draft of proposal to allow emit and extract on arbitrary structs

e1d2184

jafingerhut requested a review from hanw February 22, 2019 01:50

Cross-reference the section that explicitly lists all types allowed

c290d64

inside of a struct definition.

Enable packet_in.lookahead method to return a struct

841bdf8

hanw mentioned this pull request Feb 26, 2019

implementation for serializable struct in the frontend p4lang/p4c#1722

Closed

mihaibudiu approved these changes Feb 26, 2019

View reviewed changes

Add text explicitly allowing variable-length struct representations

febf3c6

mihaibudiu reviewed Mar 1, 2019

View reviewed changes

mihaibudiu added the future enhancement label Apr 8, 2019

jnfoster closed this Nov 11, 2023

jnfoster added the stalled label Nov 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

First draft of proposal to allow emit and extract on arbitrary structs #736

First draft of proposal to allow emit and extract on arbitrary structs #736

jafingerhut commented Feb 22, 2019

jafingerhut commented Feb 22, 2019

jafingerhut commented Feb 22, 2019 •

edited

Loading

hanw commented Feb 25, 2019 •

edited

Loading

jafingerhut commented Feb 25, 2019

hanw commented Feb 25, 2019

jafingerhut commented Feb 25, 2019

jafingerhut commented Feb 26, 2019

jafingerhut commented Feb 26, 2019

jafingerhut commented Feb 26, 2019

jafingerhut commented Feb 27, 2019

mihaibudiu left a comment

mihaibudiu Mar 1, 2019

jafingerhut Mar 1, 2019

mihaibudiu Mar 1, 2019

mihaibudiu Mar 1, 2019

jafingerhut Mar 1, 2019

mihaibudiu Mar 1, 2019

jafingerhut Mar 1, 2019

hanw Mar 1, 2019

mihaibudiu Mar 1, 2019

hanw Mar 1, 2019

jafingerhut Mar 1, 2019

jafingerhut Mar 1, 2019

mihaibudiu Mar 1, 2019

hanw Mar 1, 2019

mihaibudiu Mar 2, 2019

jafingerhut commented Mar 1, 2019

jafingerhut commented Apr 2, 2019

jnfoster commented Nov 11, 2023

First draft of proposal to allow emit and extract on arbitrary structs #736

First draft of proposal to allow emit and extract on arbitrary structs #736

Conversation

jafingerhut commented Feb 22, 2019

jafingerhut commented Feb 22, 2019

jafingerhut commented Feb 22, 2019 • edited Loading

hanw commented Feb 25, 2019 • edited Loading

jafingerhut commented Feb 25, 2019

hanw commented Feb 25, 2019

jafingerhut commented Feb 25, 2019

jafingerhut commented Feb 26, 2019

jafingerhut commented Feb 26, 2019

jafingerhut commented Feb 26, 2019

jafingerhut commented Feb 27, 2019

mihaibudiu left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jafingerhut commented Mar 1, 2019

jafingerhut commented Apr 2, 2019

jnfoster commented Nov 11, 2023

jafingerhut commented Feb 22, 2019 •

edited

Loading

hanw commented Feb 25, 2019 •

edited

Loading