aws-node CNI unstably curl nodeport #591
Comments
Sorry, I forgot to mention the cluster version.
Attached are the sysctl settings for the k8s worker node.

# Kubernetes Settings
vm.max_map_count = 262144
kernel.softlockup_panic = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
# Increase the number of connections
net.core.somaxconn = 32768
# Maximum Socket Receive Buffer
net.core.rmem_max = 16777216
# Default Socket Send Buffer
net.core.wmem_max = 16777216
# Increase the maximum total buffer-space allocatable
net.ipv4.tcp_wmem = 4096 12582912 16777216
# Increase the number of outstanding syn requests allowed
net.ipv4.tcp_max_syn_backlog = 8096
# For persistent HTTP connections
net.ipv4.tcp_slow_start_after_idle = 0
# Increase the tcp-time-wait buckets pool size to prevent simple DOS attacks
net.ipv4.tcp_tw_reuse = 1
# Max number of packets that can be queued on interface input.
# If the kernel is receiving packets faster than they can be processed, this queue increases.
net.core.netdev_max_backlog = 16384
# Increase size of file handles and inode cache
fs.file-max = 2097152
# Max number of inotify instances and watches for a user.
# Since dockerd runs as a single user, the default instances value of 128 per user is too low.
# e.g. uses of inotify: nginx ingress controller, kubectl logs -f
fs.inotify.max_user_instances = 8192
# AWS settings, see issue #23395
net.ipv4.neigh.default.gc_thresh1=0
# Prevent docker from changing iptables: kubernetes/kubernetes#40182
net.ipv4.ip_forward=1
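For reference, a minimal sketch of how settings like these are typically applied and verified on a node (the assumption is that they live in a file under /etc/sysctl.d/, which is not stated in the issue):

# Reload every file under /etc/sysctl.d/ and /etc/sysctl.conf
sysctl --system
# Confirm the value that is actually in effect for one of the settings above
sysctl net.core.somaxconn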
I used tcpdump to capture the network packets. I found that the packet is retransmitted as shown below, but the request finally times out.
17 4.310309 172.31.127.10 172.31.73.171 TCP 66 45072 > 30093 [ACK] Seq=84 Ack=240 Win=28032 Len=0 TSval=1179362178 TSecr=411319205
Any idea? Regards,
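For anyone trying to reproduce this, a hedged sketch of the kind of capture described above (the interface name eth0 and the output file name are assumptions, not taken from the issue):

# Capture NodePort traffic on the affected worker node and write it to a file for later inspection
tcpdump -i eth0 -nn 'port 30093' -w nodeport.pcap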
Hi @williamyao1982, thanks for reporting the issue. What instance type are you using? Also, could you try a newer version of the CNI? Preferably v1.5.3, but if that shows the same issue, could you try v1.4.1? |
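A quick way to see which CNI version a cluster is currently running (a sketch; it assumes the standard aws-node DaemonSet installed by the amazon-vpc-cni-k8s manifests):

# Print the image (and therefore the CNI version) used by the aws-node DaemonSet
kubectl -n kube-system get daemonset aws-node -o jsonpath='{.spec.template.spec.containers[0].image}'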
Thanks @mogren. I am using the AWS China cloud. The instance type is r5.xlarge. I can't try a newer version, because it is a risk for us if it fails. Any idea how to fix this kind of issue?
Is this still an issue? Please try with the latest CNI. Also, this could be related to https://tech.xing.com/a-reason-for-unexplained-connection-timeouts-on-kubernetes-docker-abd041cf7e02 |
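The linked article describes SYN packets being dropped by a conntrack insertion race; a minimal check for that symptom on a node (assumes the conntrack CLI from conntrack-tools is installed):

# Non-zero insert_failed counters can indicate the race described in the article
conntrack -S | grep -i insert_failed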
@mogren Thanks for your support. We have already switched the CNI to flannel.
Dear Team,
We used KOPS to deploy a Kubernetes cluster in the AWS cloud with the aws-node CNI. The strange thing is that we have two worker nodes: one works fine, while the other one cannot curl the NodePort successfully every time.
I only disabled SELinux manually on both of them and installed the CloudWatch agent on them.
For example (no response until timeout, but sometimes it does get a 200 response):
/ # curl http://172.31.73.171:30093
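A slightly more verbose variant of the same check, with an explicit timeout so a hang is easy to distinguish from a slow response (the 10-second limit is an arbitrary choice):

# -v shows whether the TCP connection is established at all; --max-time bounds the hang
curl -v --max-time 10 http://172.31.73.171:30093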
Below is the content of sysctls.out.
[root@ip-172-31-73-171 aws-routed-eni]# less sysctls.out
================== sysctls ==================
/proc/sys/net/ipv4/conf/all/rp_filter = 1
/proc/sys/net/ipv4/conf/default/rp_filter = 1
/proc/sys/net/ipv4/conf/eth0/rp_filter = 1
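Since sysctls.out only shows three interfaces, a quick way to list the full rp_filter state so the working and failing worker nodes can be compared (a diagnostic sketch only):

# List reverse-path-filter settings for every interface on the node
sysctl -a 2>/dev/null | grep '\.rp_filter'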
Regards,
William Yao