-
Notifications
You must be signed in to change notification settings - Fork 673
Fast datapath #1438
Fast datapath #1438
Conversation
|
||
type ShortIDPeers struct { | ||
// If we know about a single peer with the short id, this is | ||
// that peer. If there is a coliision, this is the peer with |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
Regarding the create/delete datapath & add datapath interface commands added to weaver:
We've started (for better or worse) to develop a stable of little helper executables under |
Where there is no functional overlap, I think having a separate helper program is natural. That's not the case here. And we probably do want coverage on those helper programs, too. |
I have a couple of thoughts regarding the short peer ID allocation code:
add/delete then move entries between the free/in-use maps; hopefully this would make the mechanism more 'obviously correct' and obviate the need for the complex unit tests. |
const PeerShortIDBits = 12 | ||
|
||
func randomPeerShortID() PeerShortID { | ||
return PeerShortID(randUint64() & (1<<PeerShortIDBits - 1)) |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
Add smoke tests for FDP? |
Such as? The existing smoke tests exercise FDP. What functional additions are there to test? |
Fallback behaviour. |
Already there: |
// less preferred forwarders are no longer needed | ||
fwd.stopFrom(index + 1) | ||
|
||
if !fwd.alreadyEstablished { |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
You're going to have to help me out - that test isn't modified in the PR, and I can't see how the current version tests fallback. Here's what I think will happen:
What am I missing? |
The test introduces a iptables rule to discard packets > 700 bytes. fastdp sends a heartbeat packet sized to the overlay MTU (i.e. 1410 bytes). This won't get through, and so fastdp will never report that it is established (and ultimately it will time out and report an error). But sleeve will report that it is established, and so be selected as the active overlay. The test then tests sleeve fragmentation behaviour as before. |
Argh I did check for that but I took the comment
literally and missed the Thanks! |
|
||
// The router needs to know which flows are active in order to | ||
// maintain its MAC->peer table. We do this by quering the router | ||
// without an actual packet being involed. Maybe it's |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
Now fixed. |
fwd := op.fastdp.forwarders[op.srcPeer.Name] | ||
op.fastdp.lock.Unlock() | ||
|
||
if fwd != nil || dec.IsSpecial() { |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
There seem to be some quite big coverage holes, e.g.
|
That's not the problem - in fact, only |
Because none of the smoke tests have nearly enough peers to cause short ID collisions.
Because we don't run the smoke tests for long enough for flows to be expired (in the same way we never see MAC cache expiries). At 74% and 78%, the coverage of the two big new files (fastdp.go and overlay_switch.go) is about average. |
Can you think of an easy way to test InvalidateShortIDs? |
No, nothing that isn't a case of Volkswagening. |
And note that:
So it seems reasonable to check the code in InvalidateShortIDs by inspection. |
These are needed to support vxlan encapsulation. In a vxlan packet, the only place we can straightforwardly hold the source and destination peer identifiers used by the weave router is in the 24-bit VNI (vxlan network identifier) field. This gives us 12 bits to indicate a peer. The resulting limit of 4096 peers on a weave network does not seem like a problem in the near future. But it does mean that decentralized allocation makes collisions fairly likely even in weave networks of sizes that are common today. Fortunately, the problem is much more tractable than IPAM because we can detect when collisions occur and take action to resolve them.
It's more robust to signal the conditions asynchronously via channels rather than callbacks.
This means: * The 'weave' netdev is an OVS datapath * The weave router container runs with --net=host in order to access the datapath. * The MTU is set to a smaller value, to allow vxlan encapsulation. Setting the WEAVE_NO_FASTDP environment variable disables these things, effectively running the router without fast datapath. It is an environment variable rather than an option to 'weave launch' because it takes effect in create_bridge, which is called from many subcommands, and it doesn't seem practical to add option processing to all of them.
Previously, the proxy would run the weave script in ways that did not run the weaver executable. Now they can. This is a problem because of the contortions involved in using golang's code coverage support in the integration tests, which require the weaver command line args to be slightly mangled.
Partly for the side effect of restoring coverage of router/ethernet_decoder.go
merge it |
Smoke tests pass (modulo the one that is broken on master), so I think this is fit for a review pass.
Things still to do:
weave
script changes are still a bit grotty. But...--net=host
, and the MTU change.