iptables contention when creating MASQUERADE rule #988
Comments
The log indicates that flanneld is being killed/stopped somehow, which would cause some goroutines to return something strange. I do not think it is about iptables. Most likely, your pod was killed during startup.
Based on the timestamps, the error message showed up about 30 minutes after flanneld started to shut down. In my experience this could indicate a resource shortage on your system (e.g. a fork bomb).
There are no error messages in syslog indicating a resource shortage, and everything else in the system seems to be functional. The flannel DaemonSet pods are still running as expected. Once that error shows up, flannel never recovers without bouncing the pods. We have some other code that is managing iptables rules, and we suspect both it and flannel are trying to make changes at the same time (hence why I pointed to #935 as a workaround).
@dtshepherd sorry about my unthoughtful statements above. I run flannel via systemd, and the log "Waiting for all goroutines to exit" would not be printed until flanneld breaks out of its loop, and if flanneld is killed right after it starts, I always notice error logs about iptables. I rushed to the conclusion that what I experienced was similar to yours. But after checking the code, it turns out that when --kube-subnet-mgr is used, the code for the lease monitoring loop is skipped and that log line is always printed. Checking the flannel code, it seems it should retry every "iptables-resync" seconds; the default is 5s (see the sketch below).
Sorry again... Just tried to help...
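For readers following along, here is a minimal sketch of the kind of periodic resync loop described above. It is not flannel's code: the ensureMasqueradeRule helper, the rule spec, the 10.244.0.0/16 subnet, and the hard-coded 5-second interval are all assumptions used for illustration.

```go
package main

import (
	"log"
	"os/exec"
	"time"
)

// ensureMasqueradeRule checks whether the MASQUERADE rule exists (-C) and
// appends it (-A) if not. The rule spec and subnet are placeholders, not
// flannel's actual rule.
func ensureMasqueradeRule() error {
	spec := []string{"-s", "10.244.0.0/16", "!", "-d", "10.244.0.0/16", "-j", "MASQUERADE"}

	check := append([]string{"-w", "-t", "nat", "-C", "POSTROUTING"}, spec...)
	if exec.Command("iptables", check...).Run() == nil {
		return nil // rule already present
	}

	add := append([]string{"-w", "-t", "nat", "-A", "POSTROUTING"}, spec...)
	return exec.Command("iptables", add...).Run()
}

func main() {
	// Re-check the rule every resync interval (5s mirrors the
	// --iptables-resync default discussed in this thread).
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		if err := ensureMasqueradeRule(); err != nil {
			// Log and keep going; the next tick retries instead of giving up.
			log.Printf("iptables resync failed: %v", err)
		}
	}
}
```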
No problem! I thought it should retry as well, but the pod just hangs and doesn't do anything. I haven't had a chance to dig into the code to find out why it isn't working.
We saw similar problems with what I wouldn't consider a large cluster yet, but one that had fairly frequent pod activity changing the service IP landscape. In our investigation, we found that a few core Kubernetes services in our cluster (kube-proxy and Calico) had bad pod mount configuration, and that there was a general incompatibility between the host iptables command and the one inside the flannel container. We run Container Linux as our host OS. In the most recent version, 1967.3.0, the host iptables is very outdated: it is version 1.4.21, released in 2013, with the first iteration of locking that used abstract unix domain sockets. Later in 2015 this locking mechanism changed to the current flock on /run/xtables.lock. Our main issue came down to missing mounts for /run/xtables.lock.
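To make the locking incompatibility concrete: newer iptables serializes concurrent callers by taking an exclusive flock on /run/xtables.lock, so the host binary and a containerized binary only coordinate if they see the very same file. A rough sketch of that mechanism (illustrative only, not the iptables source):

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	// iptables opens /run/xtables.lock and takes an exclusive flock on it
	// before touching any tables. If the file inside a container is not the
	// same inode as the host's, two iptables processes can race each other
	// even though each one "holds" a lock.
	f, err := os.OpenFile("/run/xtables.lock", os.O_CREATE|os.O_RDWR, 0600)
	if err != nil {
		fmt.Fprintln(os.Stderr, "open lock file:", err)
		os.Exit(1)
	}
	defer f.Close()

	// LOCK_EX blocks until the lock is free, roughly what `iptables -w` does;
	// without -w, iptables instead fails immediately when the lock is held.
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		fmt.Fprintln(os.Stderr, "flock:", err)
		os.Exit(1)
	}
	fmt.Println("holding the xtables lock; safe to modify rules")

	// ... modify rules here ...

	syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
}
```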
@cehoffman Wow, good catch, we are using
I'm pretty sure the YAML for installing flannel needs to be updated to include the /run/xtables.lock mount. It appears kube-proxy has the mount, so it should handle any iptables contention with the host, but the flannel DaemonSet fails to mount the same file.
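For illustration, the change being suggested amounts to adding a hostPath volume for /run/xtables.lock plus a matching volumeMount on the flannel container. It is sketched here with the Kubernetes core/v1 Go types rather than manifest YAML; the xtables-lock name is an assumption, and this is not the official flannel manifest.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// xtablesLockVolume returns the hostPath volume and matching volumeMount that
// would need to be added to the flannel DaemonSet pod spec so the container
// shares the host's iptables lock file.
func xtablesLockVolume() (corev1.Volume, corev1.VolumeMount) {
	hostPathType := corev1.HostPathFileOrCreate
	vol := corev1.Volume{
		Name: "xtables-lock", // name is illustrative
		VolumeSource: corev1.VolumeSource{
			HostPath: &corev1.HostPathVolumeSource{
				Path: "/run/xtables.lock",
				Type: &hostPathType,
			},
		},
	}
	mount := corev1.VolumeMount{
		Name:      "xtables-lock",
		MountPath: "/run/xtables.lock",
	}
	return vol, mount
}

func main() {
	vol, mount := xtablesLockVolume()
	fmt.Printf("volume: %+v\nmount: %+v\n", vol, mount)
}
```

The equivalent YAML change is a hostPath entry under the pod's volumes and a corresponding entry under the flannel container's volumeMounts, mirroring what kube-proxy already mounts.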
This prevents iptables contention with kube-proxy and the host OS. Fixes flannel-io#988.
Random flannel pods keep locking up when trying to ensure the MASQUERADE iptables rule is in place.
Expected Behavior
Flannel should retry if adding the iptables rule fails or iptables returns an error code.
Current Behavior
Flannel stops working on the node with iptables contention until the pod is forcefully restarted.
Possible Solution
Maybe #935 helps work around the problem; however, flannel doesn't have a release that includes this fix yet. Also, it would be nice if flannel retried with exponential backoff to check/ensure the iptables rule is in place (a sketch follows below). Right now, it seems like once it fails, flannel on that specific Kubernetes node won't recover.
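A rough sketch of the exponential-backoff retry suggested above (illustrative only; ensureRule, the 1-second starting interval, and the 2-minute cap are assumptions, not flannel's implementation):

```go
package main

import (
	"log"
	"time"
)

// ensureRule is a stand-in for the real iptables check/add call; it is assumed
// to return an error when iptables is locked or otherwise fails.
func ensureRule() error {
	// ... call iptables here ...
	return nil
}

// ensureWithBackoff keeps retrying instead of giving up after the first
// failure, doubling the wait between attempts up to a maximum.
func ensureWithBackoff() {
	wait := 1 * time.Second
	const maxWait = 2 * time.Minute
	for {
		err := ensureRule()
		if err == nil {
			return
		}
		log.Printf("ensuring iptables rule failed, retrying in %s: %v", wait, err)
		time.Sleep(wait)
		wait *= 2
		if wait > maxWait {
			wait = maxWait
		}
	}
}

func main() {
	ensureWithBackoff()
}
```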
Steps to Reproduce (for bugs)
Not easily reproducible, as it seems to be a race condition. Maybe create a second process that is also locking/modifying iptables fairly often (see the sketch below)?
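One possible way to approximate that second process, assuming iptables takes its lock even for read-only list operations: hammer iptables in a loop from the host or another pod so it competes with flannel for the lock. A hypothetical, unverified reproduction aid:

```go
package main

import (
	"log"
	"os/exec"
	"time"
)

func main() {
	// Repeatedly run a harmless iptables command. Each invocation briefly
	// grabs the iptables lock; omitting -w means contending callers fail
	// fast, which increases the chance of racing another iptables user
	// such as flannel.
	for {
		out, err := exec.Command("iptables", "-t", "nat", "-L", "-n").CombinedOutput()
		if err != nil {
			log.Printf("iptables failed (possibly lock contention): %v: %s", err, out)
		}
		time.Sleep(100 * time.Millisecond)
	}
}
```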
Context
Reliably recover flannel network without user intervention. Isolated clusters need to self-recover without manually deleting pods.
Your Environment