-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backend/vxlan: simplify vxlan processing #785
Conversation
@ayaz I plan to review it in the next few days, then it will need some testing before it's merged into master |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good, just a couple of comments.
backend/vxlan/vxlan.go
Outdated
// 2) Create a static ARP entry for the remote flannel host IP address (and the VTEP MAC) | ||
// 3) Create an FDB entry with the VTEP MAC and the public IP of the remote flannel daemon. | ||
// | ||
// In this scheme the scaling of table entries in linear to the number of remote hosts - 1 route, 1 arp entry and 1 FDB entry per host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entries is linear
backend/vxlan/vxlan.go
Outdated
return nil, err | ||
// Ensure that the device has a /32 address so that no broadcast routes are created. | ||
// This IP is just used as a source address for host to workload traffic (so | ||
// the return path for the traffic has a decent address to use as the destination) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean by a "decent address" here?
backend/vxlan/vxlan_network.go
Outdated
case subnet.EventAdded: | ||
log.V(2).Infof("Adding subnet: %s PublicIP: %s VtepMAC: %s", event.Lease.Subnet, event.Lease.Attrs.PublicIP, net.HardwareAddr(attrs.VtepMAC)) | ||
if err := nw.dev.AddArp(neighbor{IP: event.Lease.Subnet.IP, MAC: net.HardwareAddr(attrs.VtepMAC)}); err != nil { | ||
log.Error("AddArp failed: ", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wonder if we should return here when this fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning would just mean that we don't try to process the rest of the events in the batch. I think it's still worth trying to process those events. What's your reasoning for why we shouldn't try?
backend/vxlan/vxlan_network.go
Outdated
if err != nil { | ||
log.Error("Delete L2 failed: ", err) | ||
if err := nw.dev.AddFdb(neighbor{IP: event.Lease.Attrs.PublicIP, MAC: net.HardwareAddr(attrs.VtepMAC)}); err != nil { | ||
log.Error("AddFdb failed: ", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and return here with cleanup from whatever we did in the previous step line 110? wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha, I thing maybe you're suggesting a "continue" rather than "return"? And advocating for cleaning up the failed FDB or arp entries? It sounds like a reasonable suggestion to me (though I don't think it really brings much benefit to the user and increases the code complexity).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've addressed this now.
backend/vxlan/device.go
Outdated
func (dev *vxlanDevice) AddL2(n neighbor) error { | ||
log.V(4).Infof("calling NeighAdd: %v, %v", n.IP, n.MAC) | ||
return netlink.NeighAdd(&netlink.Neigh{ | ||
func (dev *vxlanDevice) AddFdb(n neighbor) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better name would be AddFDB
since FDB is an acronym
See: https://github.com/golang/go/wiki/CodeReviewComments#initialisms (also applies to a bunch of other method names in this file.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
backend/vxlan/device.go
Outdated
@@ -177,8 +163,8 @@ func (dev *vxlanDevice) AddL2(n neighbor) error { | |||
}) | |||
} | |||
|
|||
func (dev *vxlanDevice) DelL2(n neighbor) error { | |||
log.V(4).Infof("calling NeighDel: %v, %v", n.IP, n.MAC) | |||
func (dev *vxlanDevice) DelFdb(n neighbor) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, a better name would be DelFDB
backend/vxlan/device.go
Outdated
@@ -188,77 +174,28 @@ func (dev *vxlanDevice) DelL2(n neighbor) error { | |||
}) | |||
} | |||
|
|||
func (dev *vxlanDevice) AddL3(n neighbor) error { | |||
log.V(4).Infof("calling NeighSet: %v, %v", n.IP, n.MAC) | |||
func (dev *vxlanDevice) AddArp(n neighbor) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a better name would be AddARP
backend/vxlan/vxlan_network.go
Outdated
} else { | ||
log.V(2).Infof("L3 miss: AddL3 for %s succeeded", miss.IP) | ||
default: | ||
log.Error("Internal error: unknown event type: ", int(event.Type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why you have some error logs with log.Error
and uppercase error message vs log.Errorf
and a lowercase message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no reason 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update them all to lowercase
@tomdee LGTM! I haven't had a chance to manually test this. I'm assuming you have done some testing? lmk if you want me to do some manual testing before the merge. I will likely not have any time this week tho |
I have done some testing, but I would love for people to do more testing before this gets released. You can try it by using the image |
No need for any netlink monitoring anymore
This means vxlan should work with older kernels (Centos 6)
and it should perform betteron heavily loaded systems.
This change is compatible with older versions of flannel
so it should be OK to do rolling upgrades to it.
Some notes on how vxlan worked before and how this changes it can be found in the code https://github.com/coreos/flannel/pull/785/files#diff-2fcdad83a560522bb17d06e480b6ed76R17
Fixes #779
Fixes #592
Fixes #617
Fixes #654
Fixes #604
Fixes #414
Fixes #427