Max's async Snabb changes + public packet_t #997

kbara · 2016-08-23T10:40:05Z

From #950 (async changes, in accordance with Luke's vision in #656) and #996 (making packet_t public, in accordance with documentation and to decrease redundant ctype pollution).

lukego · 2016-08-24T11:18:39Z

@kbara Very interesting benchmarks results for this branch on next-tributary. Mixed but broadly positive. I have defined a new next-tributary-murren jobset to benchmark more configurations on generic hardware too.

Can be that the effect of increasing the struct link size is related to the effect of increasing the virtio vring size with a patched QEMU in #976 (comment).

Let me just dig a bit to make sure the performance impact is an overall net positive. I'd love to understand what causes the differences. Plug: Snabb Studio prototype demo that may become a game changer here :).

kbara · 2016-08-24T11:38:04Z

Wow, it looks like tooling is getting really exciting. :-)

lukego · 2016-08-25T08:40:47Z

Gee it is fun browsing the benchmark datasets that Hydra produces automagically :).

Here is a full matrix benchmark report that includes kbara-next. This is testing many different configurations (NFV features for iperf, Virtio-net features for l2fwd) on generic hardware (Hetzner "murren" servers w/o hardware NIC).

Observations:

Overall summary looks good. Seems like performance is becoming better and more consistent: the kbara-next (red) line has a new hump that seems to have been formed by some scores increasing (moving right).
Split by benchmark shows different effects for iperf (more zero scores i.e. test failures) and for l2fwd (more high scores at ~6.5 Mpps).
iperf by configuration seems to show a speedup on both the base and l2tpv3 cases (good) but a 100% failure rate on the filter case (don't worry, innocent explanation below).
l2fwd by configuration shows definite speedups on packet forwarding with the base and noind configurations (less clear on nomrg). Good stuff and closely related to the QEMU vring patch that I have been exploring (in fact I will include kbara-next in those tests to see this side-by-side and in combination with the patched QEMU too).

So overall this looks great. The failures with iperf when packet filtering is enabled would in principle be a blocker but I know this is because a specific fix (commit 2a98aa5) has not landed in kbara-next yet:

$ git merge-base --is-ancestor 2a98aa5 snabbco/kbara-next ; echo $?
1

So I am confident that merging this code into next will improve performance overall. Great stuff 👍

Couple more things I want to look at before pressing merge:

How come the basic1 benchmark score went down? any idea @eugeneia? This is not a deal-breaker since it is a very synthetic benchmark but could be a sign of a real problem so I don't want to take a regression there too lightly.
Gather my thoughts on the packet_t API change.

Note: In a perfect world my next-tributary jobset would not be testing the tributary branches directly but rather test next with each tributary branch merged. Then the results would tell me what the consequences are of merging the branch with next and e.g. we would not see the iperf "regression" because the fix is already present on next and would be included in the test. However, Hydra doesn't work that way at the moment, and it's not the end of the world because we will at least find problems once they land on next i.e. before release.

kbara · 2016-08-25T09:55:37Z

Given that the changes get merged onto next, it sounds like the easy solution to that problem is to simply make sure that kbara-next (and presuambly max-next) frequently merge next into themselves?

eugeneia · 2016-08-25T11:49:11Z

@lukego I think basic1 score went down because it is processing less packets per breath due to being limited to engine.pull_npackets (see https://github.com/snabbco/snabb/pull/997/files#diff-9f48c75e1c64e2dff09340bf4a6d9815L22). Before it would have filled up the whole link on each breath.

eugeneia · 2016-08-25T12:09:06Z

Might actually make sense to revert the change I referred to above and see how that affects basic1. A case could be made for the original behavior, since Source, being a load generator, should keep its output links saturated? On the other hand, limiting to engine.pull_npackets emulates the behavior of a NIC app and might yield more realistic results. We can now also tweak engine.pull_npackets and see how that affects the various benchmarks.

lukego · 2016-08-26T11:36:00Z

@eugeneia Thanks for that explanation. I think it is okay for basic1 benchmark scores to decrease if the reason is that it is processing fewer packets per breath. That benchmark mostly exists to catch regressions on basic operations like packet allocation, link transmit/receive, and so on. Could even be positive that it is processing smaller batches because then it will also be more sensitive to the performance of the engine itself e.g. switching between apps, polling timers, etc.

lukego · 2016-08-26T14:52:04Z

Merged, thanks! And sorry about the wait.

I am actually not sold on putting packet_t into the API and wonder if we should revert that:

Redundancy: We can already refer to the packet struct by the name struct packet.
Consistency: The name struct packet works in Lua, C, and assembly.
Necessity: Does this really solve an "excessive Ctypes" problem? I don't think that referencing an existing struct by name creates a new ctype.

Here is a quick experiment suggesting to me that ffi.typeof() will return the same Ctype for multiple calls if you are referencing an existing definition (rather than e.g. defining an anonymous struct):

$ sudo ./snabb snsh -i
Snabb> ffi = require("ffi")
Snabb> a = ffi.typeof("struct packet")
Snabb> b = ffi.typeof("struct packet")
Snabb> =a
ctype<struct packet>
Snabb> =b
ctype<struct packet>
Snabb> =(a==b)
true

So I wonder if there is really a problematic case that will be solved with packet_t?

eugeneia · 2016-08-26T15:19:25Z

Good analysis. My mistake thinking ffi.typeof would create a new ctype every time. I guess reverting cbed511 and updating src/README to refer to struct packet instead of packet.packet_t is the way forward then?

lukego · 2016-08-26T15:41:29Z

I can do that over on next.

lukego · 2016-08-26T15:43:32Z

btw it is very good that you have your eyes peeled for "ctype diversity" problems :). This is relatively easy to fix once you know what to look for but boy did @alexandergall do a lot of work to discover this from first principles over in #612. Finding such issues in submitted PRs seems like a really valuable bit of review.

kbara · 2016-08-27T15:44:15Z

Oops, should've verified that it actually caused a ctype diversity problem. I'm happy to merge a reversion.

Grabbing revert of the packet_t type since we decided to go with 'struct packet' directly instead. See discussion at #997 (comment)

lukego · 2016-08-29T15:24:51Z

Given that the changes get merged onto next, it sounds like the easy solution to that problem is to simply make sure that kbara-next (and presuambly max-next) frequently merge next into themselves?

@kbara Interesting thought. This seems like it would work well from my perspective. I suppose we would have to try it to understand all the implications. I would be happy to pull from next-tributary branches that have merged from next.

(Separately I still think it makes sense for new features and fixes to be based on master rather than next when possible, but that is the matter of creating new changes rather then merging and integrating them.)

kbara · 2016-08-30T10:27:24Z

Sounds reasonable. I figure it has handy side effects, like Hydra performance evaluations being based on the combination of patches we'll actually merge; usually it won't matter, sometimes it will.

Rename ExprAlarm to CallbackAlarm

eugeneia and others added 10 commits June 15, 2016 14:34

remove link.full, increase link buffer size, add engine.pull_npackets.

54e7b90

Replace usage of link.full/nwritable with engine.pull_npackets.

a1f60a2

Fix-up Lwaftrgen:pull and basic_apps.Tee.

55b1572

core.link.size must be a power of two.

c3c3413

Revert "remove link.full", document link.nwritable, link.nreadable.

f52861f

Merge branch 'master' into remove-backpressure-local

e1eb42f

Merge branch 'master' into remove-backpressure

c283589

core.packet: make packet_t public accoding to documentation.

0214574

Merge PR #950 (Max: Async Snabb) into kbara-next

10223f4

Merge PR #996 (Max: public packet_t) into kbara-next

cbed511

kbara changed the title ~~Max's async Snabb changes~~ Max's async Snabb changes + public packet_t Aug 23, 2016

lukego self-assigned this Aug 25, 2016

lukego merged commit cbed511 into next Aug 26, 2016

lukego added a commit that referenced this pull request Aug 26, 2016

Merge #997 branch 'snabbco/kbara-next' into next

17cf9c6

lukego mentioned this pull request Aug 29, 2016

core.packet: make packet_t public accoding to documentation #996

Merged

lukego added a commit that referenced this pull request Aug 29, 2016

Merge #996 branch 'eugeneia/public-packet_t' into next

dccd79f

Grabbing revert of the packet_t type since we decided to go with 'struct packet' directly instead. See discussion at #997 (comment)

wingo pushed a commit that referenced this pull request Nov 28, 2017

Merge pull request #997 from dpino/rename-expralarm

4bdfe8c

Rename ExprAlarm to CallbackAlarm

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Max's async Snabb changes + public packet_t #997

Max's async Snabb changes + public packet_t #997

kbara commented Aug 23, 2016 •

edited

Loading

lukego commented Aug 24, 2016

kbara commented Aug 24, 2016

lukego commented Aug 25, 2016

kbara commented Aug 25, 2016

eugeneia commented Aug 25, 2016 •

edited

Loading

eugeneia commented Aug 25, 2016

lukego commented Aug 26, 2016

lukego commented Aug 26, 2016

eugeneia commented Aug 26, 2016

lukego commented Aug 26, 2016

lukego commented Aug 26, 2016

kbara commented Aug 27, 2016

lukego commented Aug 29, 2016

kbara commented Aug 30, 2016

Max's async Snabb changes + public packet_t #997

Max's async Snabb changes + public packet_t #997

Conversation

kbara commented Aug 23, 2016 • edited Loading

lukego commented Aug 24, 2016

kbara commented Aug 24, 2016

lukego commented Aug 25, 2016

kbara commented Aug 25, 2016

eugeneia commented Aug 25, 2016 • edited Loading

eugeneia commented Aug 25, 2016

lukego commented Aug 26, 2016

lukego commented Aug 26, 2016

eugeneia commented Aug 26, 2016

lukego commented Aug 26, 2016

lukego commented Aug 26, 2016

kbara commented Aug 27, 2016

lukego commented Aug 29, 2016

kbara commented Aug 30, 2016

kbara commented Aug 23, 2016 •

edited

Loading

eugeneia commented Aug 25, 2016 •

edited

Loading