Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle p4-14 intrinsic_metadata #1704

Merged
merged 11 commits into from
Apr 25, 2019
Merged

Conversation

mihaibudiu
Copy link
Contributor

@mihaibudiu mihaibudiu commented Jan 30, 2019

This is an attempt to fix #1694. This is done by adding all intrinsic_metadata p4-14 fields as fields to standard_metadata in v1model, and rewriting any accesses to intrinsic_metadata as accesses to standard_metadata.

I had to add to v1model.p4 a bunch of intrinsic_metadata fields that appear in switch.p4 that appear nowhere else. I have also deleted a benchmark (wide_action2.p4) which used a bizarre intrinsic_metadata field.

I have also tried to add automatically all unknown fields to the standard_metadata, but unfortunately this means that the p4-16 program generated from a p4-14 program does not compile anymore, since it will use v1model's standard_metadata, which does not include the original intrinsic_metadata anymore.

I wonder whether the @alias annotations are necessary; if they aren't maybe we can later remove them altogether. Hopefully, if this is right, it will help us close a few other outstanding issues.

/// Error produced by parsing
error parser_error;

// The following fields are not part of the P4-14 standard, but they seem
// to be supported by BMv2 and required by switch.p4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have checked the latest behavioral-model source code, and except for intrinsic_metadata.priority, none of the fields below are mentioned anywhere in its implementation.

All of the fields above aliased to intrinsic_metadata and queueing_metadata fields are present in the behavioral-model simple_switch code.

I suspect that the fields below (the ones besides intrinsic_metadata.priority) are simply some names picked by whoever was writing switch.p4, and were never added to BMv2. They might be part of some other P4_14 implementation somewhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the fields below are added to v1model.p4, I would suggest making the comment something more like this (and move intrinsic_metadata.priority above this comment).

// The following fields are not part of the P4-14 standard, nor are they recognized or treated specially
// by BMv2. They are here solely for the purpose of compiling some historical versions of switch.p4
// in the p4c test suite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a commit which does this.

@jafingerhut
Copy link
Contributor

Regarding this comment: "I wonder whether the @alias annotations are necessary; if they aren't maybe we can later remove them altogether"

At least as BMv2 simple_switch is written right now, it looks for fields in a header/struct named 'intrinsic_metadata' (the ones that right now have such an @alias annotation in v1model.p4). If the aliases are removed, the only way programs relying on these field would continue working is if BMv2 simple_switch was changed to either: (a) only look for these fields in a header/struct named 'standard_metadata', or (b) look for these fields in both 'intrinsic_metadata' and 'standard_metadata' (that would potentially make for a smoother transition for p4c/simple_switch versions after the change was made)

@mihaibudiu
Copy link
Contributor Author

We can have aliases in JSON without having them in the p4 source for v1model. The compiler has to know enough about the architecture - i.e., what the intrinsic_metadata fields are.

@jafingerhut
Copy link
Contributor

So does this change effectively combine standard_metadata, intrinsic_metadata, and queueing_metadata fields all together into standard_metadata, thus requiring that the names of fields in these three structs to be disjoint in a P4_14 program?

Does it only do this for P4_14 source programs?

Does it only do this when auto-translating P4_14 to P4_16 source code, or even when compiling P4_14 source code to any other target?

@mihaibudiu
Copy link
Contributor Author

This is only done when translating p4-14 programs; the intrinsic_metadata fields are folded into the standard_metadata. P4-16 programs need no special treatment - they should use the standard_metadata fields already.

@mihaibudiu
Copy link
Contributor Author

We don't do anything yet for queueing_metadata, I had no example programs using such a structure.
The fields are already distinct because they must be the fields in v1model.p4.

@mihaibudiu
Copy link
Contributor Author

p4-14 allows you to declare which intrinsic_metadata fields you want to use in your program. Apparently they have to be in an instance named "intrinsic_metadata", which is then magically populated by the runtime on input. Hopefully you will declare the same types as in the standard_metadata, we don't check that you do. We could, but some programs actually use different types... they fortunately work; perhaps the compiler inserts the requisite casts.

@mihaibudiu
Copy link
Contributor Author

We didn't strictly need to change standard_metadata in v1model, but our tests attempt to compile the P4-16 version of each P4-14 program, and that doesn't work if we don't.

Copy link
Contributor

@hanw hanw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to verify this has no impact on our end, please do not merge yet.

jafingerhut added a commit to jafingerhut/p4c that referenced this pull request Feb 5, 2019
It is nearly identical to the test program p414-special-ops.p4
proposed to be added in the PR p4lang#1704, except that program explicitly
lists individual fields to be preserved during resubmit
etc. operations, whereas this one gives the name of an entire metadata
'header' to be preserved.

For p414-special-ops-2-bmv2.p4, p4c as of this writing introduces
temporary metadata variables and preserves those instead of the ones
named in the source file, and since the STF packet tests cases rely
for their correct operation on the metadata fields being preserved,
they fail.
jafingerhut added a commit to jafingerhut/p4c that referenced this pull request Feb 5, 2019
It is nearly identical to the test program p414-special-ops.p4
proposed to be added in the PR p4lang#1704, except this program explicitly
modifies metadata fields after a resubmit, recirculate, or clone
operation is performed, and the STF tests will only pass if the
behavior of the packet processing is that the metadata fields are
preserved at the _end_ of ingress/egress processing, _not_ the values
that the metadata fields have at the time of the
recirculate/resubmit/clone primitive operation call.

This is consistent with the P4_14 language specification, at least as
of the clarifications made in v1.0.5 of the specification.
mihaibudiu pushed a commit that referenced this pull request Feb 5, 2019
* Add XFAIL P4_14 test program that fails because of issue #1669

It is nearly identical to the test program p414-special-ops.p4
proposed to be added in the PR #1704, except that program explicitly
lists individual fields to be preserved during resubmit
etc. operations, whereas this one gives the name of an entire metadata
'header' to be preserved.

For p414-special-ops-2-bmv2.p4, p4c as of this writing introduces
temporary metadata variables and preserves those instead of the ones
named in the source file, and since the STF packet tests cases rely
for their correct operation on the metadata fields being preserved,
they fail.

* Add expected compiler output files for new test program
mihaibudiu pushed a commit that referenced this pull request Feb 5, 2019
* Add XFAIL P4_14 test program that fails because of issue #1694

It is nearly identical to the test program p414-special-ops.p4
proposed to be added in the PR #1704, except this program explicitly
modifies metadata fields after a resubmit, recirculate, or clone
operation is performed, and the STF tests will only pass if the
behavior of the packet processing is that the metadata fields are
preserved at the _end_ of ingress/egress processing, _not_ the values
that the metadata fields have at the time of the
recirculate/resubmit/clone primitive operation call.

This is consistent with the P4_14 language specification, at least as
of the clarifications made in v1.0.5 of the specification.

* Add expected compiler output
@jafingerhut
Copy link
Contributor

jafingerhut commented Feb 20, 2019

We need to verify this has no impact on our end, please do not merge yet.

@hanw Any indication yet whether these changes will work for you? Or if they do not, some alternate suggestion on how to handle this issue for P4_14 source programs for p4c?

@mihaibudiu
Copy link
Contributor Author

Can someone please review this PR?

@jafingerhut
Copy link
Contributor

Sorry to be annoying, but trying a monthly ping :-)

I know that perhaps supporting p4c as a P4_14 v1model compiler is not high priority, but without this change, or something similarly effective, a whole bunch of P4_14 features are simply broken, using p4c and simple_switch, e.g.:

  • multicast
  • resubmit
  • recirculate
  • generating digests
  • timestamps

In general, any features that require access to the P4_14 structs "intrinsic_metadata" or "queueing_metadata" described here: https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md

@mihaibudiu
Copy link
Contributor Author

We actually have 2 separate PRs that fix these issues.
I thought a bit about it, and I think we should just merge them.
The intrinsic metadata one is a bit less controversial; the only change I don't like about it is adding some standard_metadata fields to v1model.
The recirculate fix we should merge even if you don't like it that much because it would allow people to write programs using recirculate. If you want to have an alternate implementation (e.g., using externs), we can add that on top of the existing one later as well. We can support doing this in multiple ways.

@mihaibudiu
Copy link
Contributor Author

@hanw you are blocking this PR.

@hanw
Copy link
Contributor

hanw commented Apr 8, 2019

A potential issue with adding intrinsic metadata to standard_metadata is that some intrinsic metadata only makes sense in egress or ingress pipeline. However, adding them to standard metadata will allow user to use them in both pipelines, and results in programs that cannot be compiled.

@mihaibudiu
Copy link
Contributor Author

We already have this problem.

@jafingerhut
Copy link
Contributor

I believe that right now, the open source p4c and bmv2 will let you access any of the standard_metadata, intrinsic_metadata, or queuing_metadata fields described on this simple_switch documentation page, from both ingress and egress pipelines, with no error or warning messages: https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md

Would it be better for users if they got a compilation error if they try to access egress_port during the ingress pipeline code? Yes, I think it would. But for P4_14 and v1model code, since all of these fields are lumped together in a few big structs that are accessible in both ingress and egress pipelines, I do not see any way to generate such error/warning messages using existing front-end machinery to help you.

Perhaps such error/warning messages could be added in the bmv2 v1model back end code of the compiler, or at run-time in bmv2, but it seems like each field would need its own custom checks for whether it was only meant to be accessed in ingress, only in egress, or either, and whether it was supposed to be read-only, write-only, or some mix of both.

@mihaibudiu
Copy link
Contributor Author

In P4-16 (and P4-14 for that matter) it is actually legal to access these fields in any pipeline, and the compiler should make sure that they do travel between pipelines. What is not apparent is that there is a cost of propagating these values between pipelines. It may be that our implementation does not generate proper code for bmv2, though, regarding these fields.

@jafingerhut
Copy link
Contributor

There are zero good reasons in the v1model architecture to access the egress_port field in the ingress control. I would go so far as to say that a program that does so was written by a person that does not properly understand the v1model architecture, and is almost certainly a bug in their program.

The same goes for many other possible accesses to fields in v1model's standard_metadata struct.

I am not saying the p4c compiler must give errors and warnings in all of these cases, but if it did, more people would learn those details of the v1model architecture more quickly.

/// queue depth at the packet dequeue time.
@alias("queueing_metadata.deq_qdepth")
bit<19> deq_qdepth;

// intrinsic metadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend not adding standard metadata fields to v1model just because they are used exclusively in switch.p4_14:

  • makes things more complex
  • leads to some non-sensical situations (what's the difference between ucast_egress_port and egress_spec / egress_port)
  • switch.p4 is not maintained anymore, it is in a frozen state and people compiling it are very likely to be compiling it in the context of p4lang/p4factory with the legacy P4_14 compiler
  • I know it was mentioned at the meeting that we use some older copy of switch.p4 for compiler regressions, but at this stage I would imagine we have good enough coverage without it

It is probably a good opportunity to remove v1model standard metadata fields which are used internally by bmv2 simple_switch and should not be programmed directly by the user: clone_spec, lf_field_list, resubmit_flag, recirculate_flag. That would also resolve this issue: #1845. The bmv2 backend can synthesize these fields when generating the JSON.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if we could surgically edit switch.p4 to remove the use of these fields. Then we can keep the remaining program, and hopefully it still offers some value. But I haven't looked to see how they are used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, but it does not look particularly easy; these fields are used in a bunch of places.
The compiler is already quite slow, but we should have at least one large program to make sure we don't make it unacceptably slow. For example, on a large program a quadratic-time algorithm will show up clearly.

Copy link
Member

@antoninbas antoninbas Apr 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would fabric.p4 (https://github.com/opennetworkinglab/onos/blob/master/pipelines/fabric/src/main/resources/fabric.p4) with all features enabled (INT & SPGW) be an acceptable replacement? It is not as big, but at least it is actively maintained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have fabric even if we keep switch.p4.
If we can make it run some STF tests then it may be a good enough replacement.
I won't object to deleting switch.p4.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just gone through these new fields to see which might be easy to eliminate in the p4c test programs, if people are interested in trying to do that.

I will start by listing the easy ones to get rid of. Later below I will describe the ones that are only removable with changes to the test programs that I don't know the impact of. It kind of depends on whether anyone actually uses the 2 versions of switch.p4 in the p4lang/p4c repository for anything other than testing p4c.

  • ucast_egress_port

only occurs in in this one test program: p4_14_samples/sai_p4.p4 Easily replaced with standard_metadata.egress_spec

  • ingress_global_tstamp

Can easily be replaced with intrinsic_metadata.ingress_global_timestamp, which is the current name used in v1model.p4. Same bit width, too.

  • enq_tstamp

Easily replaceable with queueing_metadata.enq_timestamp, which v1model already has.

  • enq_congest_stat and deq_congest_stat

switch.p4 programs do not mention this field other than to define it. Thus easily removed.

  • mcast_hash

Defined in about a dozen test programs, P4_14 and P4_16, and both switch.p4 versions. It is only assigned a value in one or two places in switch.p4, and is an output of the ingress control, intended to be used by the packet buffer / traffic manager. simple_switch ignores it, so deleting it and commenting out (or perhaps #ifdef-ing out) those few assignments to it would be a very small change to switch.p4.

  • deflection_flag and deflect_on_drop

Only used in the two versions of switch.p4. There is nothing similar to these in v1model to replace them with. They could be #ifdef'd out, as suggested for mcast_hash, since they seem to be used in only a few lines of code.

I could make a PR with the changes suggested above, in case people think they would be useful to look at and consider merging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is very impressive! If indeed these are the only uses then we could potentially keep switch.p4 as well. I looked at the hash, and it seemed to have a bunch of uses, so I got discouraged.

@mihaibudiu mihaibudiu force-pushed the issue1694 branch 2 times, most recently from f081ad7 to e4e823b Compare April 22, 2019 17:59
@mihaibudiu mihaibudiu requested a review from jafingerhut April 22, 2019 20:45
bit<1> deflect_on_drop;
/// time snapshot taken when the packet is enqueued (in nsec).
@alias("intrinsic_metadata.enq_tstamp")
bit<32> enq_tstamp;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was said during LDWG meeting, but my main comment is that I hope that the rest of these changes still pass all tests if you remove deflect_on_drop and enq_tstamp from this struct definition. I have tried to remove all uses of those fields from all p4c test programs, so I have some hope that this is true. I am happy to analyze whatever fails, if anything fails, and get back to you quickly if I see a way to further edit the test programs to eliminate the use of these fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have run p4c tests locally on my dev machine with this PR's changes, plus removing deflect_on_rop and enq_tstamp. The only failing tests are due to line number changes in 2 error messages involving v1model.p4 line numbers (expected, and easily fixed in this PR), and about a dozen p4test programs that have different mid-end P4_16 source code because of these 2 removed fields, again expected and easily fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will go ahead and approve this, and if my suggested changes aren't made before merging this PR, I can create another one afterwards that makes them.

/// multicast group id (key for the mcast replication table)
@alias("intrinsic_metadata.mcast_grp")
bit<16> mcast_grp;
/// flag distinguishing original packets from resubmitted packets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "flag distinguishing original packets from resubmitted packets" is not correct. The field 'instance_type' is used for that purpose in v1model.

resubmit_flag is one of 4 fields that Antonin has suggested, and I like the idea, that they should be removed from standard_metadata in the near future, and instead made internal implementation details inside of BMv2.

One alternative is to remove the comment. If you would prefer a more accurate one, I would replace it with something like this: "/// resubmit metadata field list id, or 0 if no resubmit operation has been performed"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are at all curious to learn more about some of the BMv2 v1model implementation details, you can read them here: https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some tests (that you wrote) which use resubmit_flag.
So for now I won't delete it; I will update the comment.
We can hopefully remove the field in a subsequent commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds fine to me. We are not yet ready to delete the field -- it is still needed for BMv2 v1model to work correctly right now. There are changes I have in process on a BMv2 PR that will make it unnecessary in the future, and we can remove it then.

Copy link
Contributor

@jafingerhut jafingerhut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LMGTM (Looks Mostly Good To Me). As mentioned in another comment, I am happy to fix up the few minor nits that remain in a later PR.

@@ -41,12 +41,15 @@ metadata ingress_intrinsic_metadata_t intrinsic_metadata;
#define _ingress_global_tstamp_ intrinsic_metadata.ingress_global_timestamp
#define modify_field_from_rng(_d, _w) modify_field_rng_uniform(_d, 0, (1<<(_w))-1)

<<<<<<< 5b10de115548717dd47418bece7b7ca011f0df5f:testdata/p4_14_samples/switch_20160512/includes/intrinsic.p4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am guessing this and the similar merge conflict indicator lines should cause compilation of switch.p4 to fail?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I certainly missed when we added this syntax to the language spec :-)

@mihaibudiu
Copy link
Contributor Author

I will merge this PR. @hanw has indicated in January that he will check for problems, but he hasn't reported any, so I am assuming that this is safe. This anyway fixes a longstanding bug in the p4-14 to p4-16 translation.

@mihaibudiu mihaibudiu merged commit ae6fb22 into p4lang:master Apr 25, 2019
@mihaibudiu mihaibudiu deleted the issue1694 branch April 25, 2019 23:46
peteli3 pushed a commit to peteli3/p4c that referenced this pull request Jun 6, 2019
* Handle p4-14 intrinsic_metadata
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

p4c mangles P4_14 intrinsic_metadata header in BMv2 JSON file, preventing resubmit from working
5 participants