Handle nodes with iptables FORWARD DROP better #39823

Closed
ravishivt opened this issue Jan 12, 2017 · 15 comments · Fixed by #52569
Labels
area/kube-proxy sig/network

Comments

@ravishivt

ravishivt commented Jan 12, 2017

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version): 1.5.1

Environment:

  • Cloud provider or hardware configuration: 3x Ubuntu 16.04 nodes running in VMware vSphere
  • OS (e.g. from /etc/os-release): Ubuntu 16.04
  • Kernel (e.g. uname -a): 4.4.0-57-generic
  • Install tools: kubeadm
  • Others: CNI: Canal (flannel + calico)

What happened:

I originally opened #39658 to make the Canal CNI work when the node's FORWARD chain default policy is set to DROP. However, I realized that the DROP policy was the root cause of my other CNI tests failing as well. When either the Weave or Flannel CNI is used and kube-proxy has --masquerade-all enabled, pod-to-pod and pod-to-service communication fails. Specifically, kubernetes-dashboard and tiller-deploy fail to reach the kube-apiserver and enter a crash loop. The kube-dns pod also crashes.

kubectl get pods --all-namespaces -o wide
deploy@ravi-kube196:~$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                    READY     STATUS             RESTARTS   AGE       IP               NODE
kube-system   dummy-2088944543-flctq                  1/1       Running            0          23m       10.163.148.197   ravi-kube197
kube-system   etcd-ravi-kube196                       1/1       Running            1          23m       10.163.148.196   ravi-kube196
kube-system   kube-apiserver-ravi-kube196             1/1       Running            1          23m       10.163.148.196   ravi-kube196
kube-system   kube-controller-manager-ravi-kube196    1/1       Running            1          23m       10.163.148.196   ravi-kube196
kube-system   kube-discovery-1769846148-qdzsv         1/1       Running            0          23m       10.163.148.196   ravi-kube196
kube-system   kube-dns-2924299975-gz34l               3/4       Running            18         23m       10.96.2.7        ravi-kube197
kube-system   kube-flannel-ds-2f9tc                   2/2       Running            0          23m       10.163.148.196   ravi-kube196
kube-system   kube-flannel-ds-mgkk9                   2/2       Running            0          23m       10.163.148.198   ravi-kube198
kube-system   kube-flannel-ds-s8ktw                   2/2       Running            0          23m       10.163.148.197   ravi-kube197
kube-system   kube-proxy-3gxs7                        1/1       Running            0          23m       10.163.148.196   ravi-kube196
kube-system   kube-proxy-ftdp9                        1/1       Running            0          23m       10.163.148.198   ravi-kube198
kube-system   kube-proxy-rc0wv                        1/1       Running            0          23m       10.163.148.197   ravi-kube197
kube-system   kube-scheduler-ravi-kube196             1/1       Running            1          23m       10.163.148.196   ravi-kube196
kube-system   kubernetes-dashboard-3203831700-5vzdq   0/1       CrashLoopBackOff   8          23m       10.96.2.8        ravi-kube197
kube-system   tiller-deploy-2885612843-jm143          0/1       CrashLoopBackOff   13         23m       10.96.1.12       ravi-kube198
kubectl logs --namespace=kube-system kubernetes-dashboard-3203831700-5vzdq
deploy@ravi-kube196:~$ kubectl logs --namespace=kube-system kubernetes-dashboard-3203831700-5vzdq
Using HTTP port: 9090
Creating API server client for https://10.96.0.1:443
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout
Refer to the troubleshooting guide for more information: https://github.com/kubernetes/dashboard/blob/master/docs/user-guide/troubleshooting.md
kubectl describe --namespace=kube-system pod kubernetes-dashboard-3203831700-5vzdq
deploy@ravi-kube196:~$ kubectl describe --namespace=kube-system pod kubernetes-dashboard-3203831700-5vzdq
Name:           kubernetes-dashboard-3203831700-5vzdq
Namespace:      kube-system
Node:           ravi-kube197/10.163.148.197
Start Time:     Thu, 12 Jan 2017 20:18:59 +0000
Labels:         app=kubernetes-dashboard
                pod-template-hash=3203831700
Status:         Running
IP:             10.96.2.8
Controllers:    ReplicaSet/kubernetes-dashboard-3203831700
Containers:
  kubernetes-dashboard:
    Container ID:       docker://5718b319fe4af33b97115a14cc17083bab078d868556cb4e2c6f93363da7a462
    Image:              gcr.io/google_containers/kubernetes-dashboard-amd64:v1.5.1
    Image ID:           docker-pullable://gcr.io/google_containers/kubernetes-dashboard-amd64@sha256:46a09eb9c611e625e7de3fcf325cf78e629d002e57dc80348e9b0638338206b5
    Port:               9090/TCP
    State:              Running
      Started:          Thu, 12 Jan 2017 20:44:34 +0000
    Last State:         Terminated
      Reason:           Error
      Exit Code:        1
      Started:          Thu, 12 Jan 2017 20:38:52 +0000
      Finished:         Thu, 12 Jan 2017 20:39:22 +0000
    Ready:              True
    Restart Count:      9
    Liveness:           http-get http://:9090/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zfzvp (ro)
    Environment Variables:      <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         True
  PodScheduled  True
Volumes:
  default-token-zfzvp:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-zfzvp
QoS Class:      BestEffort
Tolerations:    dedicated=master:Equal:NoSchedule
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath                           Type            Reason          Message
  ---------     --------        -----   ----                    -------------                           --------        ------          -------
  25m           25m             1       {default-scheduler }                                            Normal          Scheduled       Successfully assigned kubernetes-dashboard-3203831700-5vzdq to ravi-kube197
  25m           25m             1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal          Created         Created container with docker id 8cee6d8c1fcd; Security:[seccomp=unconfined]
  25m           25m             1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal          Started         Started container with docker id 8cee6d8c1fcd
  25m           25m             1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal          Started         Started container with docker id 1a9b25a74c94
  25m           25m             1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal          Created         Created container with docker id 1a9b25a74c94; Security:[seccomp=unconfined]
  24m           24m             2       {kubelet ravi-kube197}                                          Warning         FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 10s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-3203831700-5vzdq_kube-system(5a050ca2-d904-11e6-bec0-0050568a2a2f)"

  24m   24m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Started         Started container with docker id 2da001604ef9
  24m   24m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Created         Created container with docker id 2da001604ef9; Security:[seccomp=unconfined]
  23m   23m     2       {kubelet ravi-kube197}                                          Warning FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 20s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-3203831700-5vzdq_kube-system(5a050ca2-d904-11e6-bec0-0050568a2a2f)"

  23m   23m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Started         Started container with docker id 257a758141b0
  23m   23m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Created         Created container with docker id 257a758141b0; Security:[seccomp=unconfined]
  23m   22m     4       {kubelet ravi-kube197}                                          Warning FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 40s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-3203831700-5vzdq_kube-system(5a050ca2-d904-11e6-bec0-0050568a2a2f)"

  22m   22m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Started         Started container with docker id 3eb69d8169d6
  22m   22m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Created         Created container with docker id 3eb69d8169d6; Security:[seccomp=unconfined]
  21m   20m     7       {kubelet ravi-kube197}                                          Warning FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-3203831700-5vzdq_kube-system(5a050ca2-d904-11e6-bec0-0050568a2a2f)"

  20m   20m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Created         Created container with docker id 6244b99877be; Security:[seccomp=unconfined]
  20m   20m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Started         Started container with docker id 6244b99877be
  20m   20m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Warning Unhealthy       Liveness probe failed: Get http://10.96.2.8:9090/: dial tcp 10.96.2.8:9090: getsockopt: connection refused
  19m   17m     13      {kubelet ravi-kube197}                                          Warning FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-3203831700-5vzdq_kube-system(5a050ca2-d904-11e6-bec0-0050568a2a2f)"

  17m   17m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Created         Created container with docker id 39a9bf9798a8; Security:[seccomp=unconfined]
  17m   17m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Started         Started container with docker id 39a9bf9798a8
  11m   11m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Created         Created container with docker id 24a59311513f; Security:[seccomp=unconfined]
  11m   11m     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Started         Started container with docker id 24a59311513f
  5m    5m      1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Started         Started container with docker id b2970a8b9df9
  5m    5m      1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Created         Created container with docker id b2970a8b9df9; Security:[seccomp=unconfined]
  24m   29s     98      {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Warning BackOff         Back-off restarting failed docker container
  16m   29s     70      {kubelet ravi-kube197}                                          Warning FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-3203831700-5vzdq_kube-system(5a050ca2-d904-11e6-bec0-0050568a2a2f)"

  25m   15s     10      {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Pulling pulling image "gcr.io/google_containers/kubernetes-dashboard-amd64:v1.5.1"
  25m   15s     10      {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Pulled  Successfully pulled image "gcr.io/google_containers/kubernetes-dashboard-amd64:v1.5.1"
  15s   15s     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Created (events with common reason combined)
  15s   15s     1       {kubelet ravi-kube197}  spec.containers{kubernetes-dashboard}   Normal  Started (events with common reason combined)
kubectl describe --namespace=kube-system pod kube-dns-2924299975-gz34l
deploy@ravi-kube196:~$ kubectl describe --namespace=kube-system pod kube-dns-2924299975-gz34l
Name:           kube-dns-2924299975-gz34l
Namespace:      kube-system
Node:           ravi-kube197/10.163.148.197
Start Time:     Thu, 12 Jan 2017 20:18:56 +0000
Labels:         component=kube-dns
                k8s-app=kube-dns
                kubernetes.io/cluster-service=true
                name=kube-dns
                pod-template-hash=2924299975
                tier=node
Status:         Running
IP:             10.96.2.7
Controllers:    ReplicaSet/kube-dns-2924299975
Containers:
  kube-dns:
    Container ID:       docker://1827435a310c7b343c28be9f46e69155bc1da45bfa97376da1b39436b9580a58
    Image:              gcr.io/google_containers/kubedns-amd64:1.9
    Image ID:           docker-pullable://gcr.io/google_containers/kubedns-amd64@sha256:3d3d67f519300af646e00adcf860b2f380d35ed4364e550d74002dadace20ead
    Ports:              10053/UDP, 10053/TCP, 10055/TCP
    Args:
      --domain=cluster.local
      --dns-port=10053
      --config-map=kube-dns
      --v=2
    Limits:
      memory:   170Mi
    Requests:
      cpu:              100m
      memory:           70Mi
    State:              Waiting
      Reason:           CrashLoopBackOff
    Last State:         Terminated
      Reason:           Error
      Exit Code:        137
      Started:          Thu, 12 Jan 2017 20:41:46 +0000
      Finished:         Thu, 12 Jan 2017 20:43:16 +0000
    Ready:              False
    Restart Count:      9
    Liveness:           http-get http://:8080/healthz-kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:          http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zfzvp (ro)
    Environment Variables:
      PROMETHEUS_PORT:  10055
  dnsmasq:
    Container ID:       docker://c4cbdc7d245b412046dd7f71df3f9b78402d1d1d8bb686461e1d79a2fa9d622c
    Image:              gcr.io/google_containers/kube-dnsmasq-amd64:1.4
    Image ID:           docker-pullable://gcr.io/google_containers/kube-dnsmasq-amd64@sha256:a722df15c0cf87779aad8ba2468cf072dd208cb5d7cfcaedd90e66b3da9ea9d2
    Ports:              53/UDP, 53/TCP
    Args:
      --cache-size=1000
      --no-resolv
      --server=127.0.0.1#10053
      --log-facility=-
    Requests:
      cpu:              150m
      memory:           10Mi
    State:              Waiting
      Reason:           CrashLoopBackOff
    Last State:         Terminated
      Reason:           Error
      Exit Code:        137
      Started:          Thu, 12 Jan 2017 20:42:26 +0000
      Finished:         Thu, 12 Jan 2017 20:43:56 +0000
    Ready:              False
    Restart Count:      9
    Liveness:           http-get http://:8080/healthz-dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zfzvp (ro)
    Environment Variables:      <none>
  dnsmasq-metrics:
    Container ID:       docker://6d40d421a883ed08b08c90db87a7886edd5f1ecbd8fad555cee698392aeca74e
    Image:              gcr.io/google_containers/dnsmasq-metrics-amd64:1.0
    Image ID:           docker-pullable://gcr.io/google_containers/dnsmasq-metrics-amd64@sha256:4063e37fd9b2fd91b7cc5392ed32b30b9c8162c4c7ad2787624306fc133e80a9
    Port:               10054/TCP
    Args:
      --v=2
      --logtostderr
    Requests:
      memory:           10Mi
    State:              Running
      Started:          Thu, 12 Jan 2017 20:18:58 +0000
    Ready:              True
    Restart Count:      0
    Liveness:           http-get http://:10054/metrics delay=60s timeout=5s period=10s #success=1 #failure=5
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zfzvp (ro)
    Environment Variables:      <none>
  healthz:
    Container ID:       docker://6be2f1379179a74fdaa823bfef08c46d57b32a04fa4e12afba9b8b53715a4008
    Image:              gcr.io/google_containers/exechealthz-amd64:1.2
    Image ID:           docker-pullable://gcr.io/google_containers/exechealthz-amd64@sha256:503e158c3f65ed7399f54010571c7c977ade7fe59010695f48d9650d83488c0a
    Port:               8080/TCP
    Args:
      --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
      --url=/healthz-dnsmasq
      --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
      --url=/healthz-kubedns
      --port=8080
      --quiet
    Limits:
      memory:   50Mi
    Requests:
      cpu:              10m
      memory:           50Mi
    State:              Running
      Started:          Thu, 12 Jan 2017 20:18:58 +0000
    Ready:              True
    Restart Count:      0
    Volume Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zfzvp (ro)
    Environment Variables:      <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  default-token-zfzvp:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-zfzvp
QoS Class:      Burstable
Tolerations:    dedicated=master:NoSchedule
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath                           Type            Reason          Message
  ---------     --------        -----   ----                    -------------                           --------        ------          -------
  27m           27m             1       {default-scheduler }                                            Normal          Scheduled       Successfully assigned kube-dns-2924299975-gz34l to ravi-kube197
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Started         Started container with docker id af0b015ac580
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Created         Created container with docker id af0b015ac580; Security:[seccomp=unconfined]
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Created         Created container with docker id 8ef0eccb7a5f; Security:[seccomp=unconfined]
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Started         Started container with docker id 8ef0eccb7a5f
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq-metrics}        Normal          Pulled          Container image "gcr.io/google_containers/dnsmasq-metrics-amd64:1.0" already present on machine
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq-metrics}        Normal          Created         Created container with docker id 6d40d421a883; Security:[seccomp=unconfined]
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{healthz}                Normal          Started         Started container with docker id 6be2f1379179
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{healthz}                Normal          Created         Created container with docker id 6be2f1379179; Security:[seccomp=unconfined]
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq-metrics}        Normal          Started         Started container with docker id 6d40d421a883
  27m           27m             1       {kubelet ravi-kube197}  spec.containers{healthz}                Normal          Pulled          Container image "gcr.io/google_containers/exechealthz-amd64:1.2" already present on machine
  25m           25m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Killing         Killing container with docker id af0b015ac580: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "dnsmasq" is unhealthy, it will be killed and re-created.
  24m           24m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Created         Created container with docker id 4fa958a4757b; Security:[seccomp=unconfined]
  24m           24m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Killing         Killing container with docker id 8ef0eccb7a5f: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "kube-dns" is unhealthy, it will be killed and re-created.
  24m           24m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Created         Created container with docker id bd28687cc3a3; Security:[seccomp=unconfined]
  24m           24m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Started         Started container with docker id bd28687cc3a3
  24m           24m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Started         Started container with docker id 4fa958a4757b
  23m           23m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Killing         Killing container with docker id 4fa958a4757b: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "kube-dns" is unhealthy, it will be killed and re-created.
  23m           23m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Created         Created container with docker id b6692369e2e9; Security:[seccomp=unconfined]
  23m           23m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Started         Started container with docker id b6692369e2e9
  22m           22m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Started         Started container with docker id dbc73eb48e9a
  22m           22m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Killing         Killing container with docker id bd28687cc3a3: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "dnsmasq" is unhealthy, it will be killed and re-created.
  22m           22m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Created         Created container with docker id dbc73eb48e9a; Security:[seccomp=unconfined]
  21m           21m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Killing         Killing container with docker id b6692369e2e9: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "kube-dns" is unhealthy, it will be killed and re-created.
  21m           21m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Created         Created container with docker id d961e82157ba; Security:[seccomp=unconfined]
  21m           21m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Started         Started container with docker id d961e82157ba
  21m           21m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Killing         Killing container with docker id dbc73eb48e9a: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "dnsmasq" is unhealthy, it will be killed and re-created.
  20m           20m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Killing         Killing container with docker id d961e82157ba: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "kube-dns" is unhealthy, it will be killed and re-created.
  19m           19m             1       {kubelet ravi-kube197}  spec.containers{dnsmasq}                Normal          Killing         Killing container with docker id 8ee9a9b16e2e: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "dnsmasq" is unhealthy, it will be killed and re-created.
  18m           18m             1       {kubelet ravi-kube197}  spec.containers{kube-dns}               Normal          Killing         Killing container with docker id d035ec298418: pod "kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)" container "kube-dns" is unhealthy, it will be killed and re-created.
  15m           15m             2       {kubelet ravi-kube197}                                          Warning         FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "kube-dns" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kube-dns pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"

  15m   15m     2       {kubelet ravi-kube197}          Warning FailedSync      Error syncing pod, skipping: [failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=dnsmasq pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"
, failed to "StartContainer" for "kube-dns" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kube-dns pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"
]
  15m   13m     9       {kubelet ravi-kube197}          Warning FailedSync      Error syncing pod, skipping: [failed to "StartContainer" for "kube-dns" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kube-dns pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"
, failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=dnsmasq pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"
]
  13m   12m     4       {kubelet ravi-kube197}          Warning FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=dnsmasq pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"

  6m    5m      4       {kubelet ravi-kube197}          Warning FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"

  27m   4m      10      {kubelet ravi-kube197}  spec.containers{kube-dns}       Normal  Pulled          Container image "gcr.io/google_containers/kubedns-amd64:1.9" already present on machine
  27m   4m      10      {kubelet ravi-kube197}  spec.containers{dnsmasq}        Normal  Pulled          Container image "gcr.io/google_containers/kube-dnsmasq-amd64:1.4" already present on machine
  21m   4m      13      {kubelet ravi-kube197}  spec.containers{dnsmasq}        Normal  Created         (events with common reason combined)
  21m   4m      13      {kubelet ravi-kube197}  spec.containers{dnsmasq}        Normal  Started         (events with common reason combined)
  27m   3m      100     {kubelet ravi-kube197}  spec.containers{kube-dns}       Warning Unhealthy       Readiness probe failed: Get http://10.96.2.7:8081/readiness: dial tcp 10.96.2.7:8081: getsockopt: connection refused
  11m   3m      4       {kubelet ravi-kube197}                                  Warning FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "kube-dns" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-dns pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"

  26m   3m      28      {kubelet ravi-kube197}  spec.containers{dnsmasq}        Warning Unhealthy       Liveness probe failed: HTTP probe failed with statuscode: 503
  18m   2m      11      {kubelet ravi-kube197}  spec.containers{dnsmasq}        Normal  Killing         (events with common reason combined)
  10m   54s     11      {kubelet ravi-kube197}                                  Warning FailedSync      Error syncing pod, skipping: [failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"
, failed to "StartContainer" for "kube-dns" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-dns pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"
]
  10m   11s     24      {kubelet ravi-kube197}          Warning FailedSync      Error syncing pod, skipping: [failed to "StartContainer" for "kube-dns" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-dns pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"
, failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-2924299975-gz34l_kube-system(582a3424-d904-11e6-bec0-0050568a2a2f)"
]
  15m   11s     106     {kubelet ravi-kube197}  spec.containers{kube-dns}       Warning BackOff Back-off restarting failed docker container
sudo iptables -t filter -v -L FORWARD --line-numbers (197 currently has the dashboard pod scheduled. Note the 2 dropped packets.)
deploy@ravi-kube197:~$ sudo iptables -t filter -v -L FORWARD --line-numbers
Chain FORWARD (policy DROP 2 packets, 164 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1     1007 67570 DOCKER-ISOLATION  all  --  any    any     anywhere             anywhere
2        0     0 DOCKER     all  --  any    docker0  anywhere             anywhere
3        0     0 ACCEPT     all  --  any    docker0  anywhere             anywhere             ctstate RELATED,ESTABLISHED
4        0     0 ACCEPT     all  --  docker0 !docker0  anywhere             anywhere
5        0     0 ACCEPT     all  --  docker0 docker0  anywhere             anywhere
6     1007 67570 ufw-before-logging-forward  all  --  any    any     anywhere             anywhere
7     1007 67570 ufw-before-forward  all  --  any    any     anywhere             anywhere
8     1007 67570 ufw-after-forward  all  --  any    any     anywhere             anywhere
9     1007 67570 ufw-after-logging-forward  all  --  any    any     anywhere             anywhere
10    1007 67570 ufw-reject-forward  all  --  any    any     anywhere             anywhere
11    1007 67570 ufw-track-forward  all  --  any    any     anywhere             anywhere

What you expected to happen:

Either it should be better documented that the default FORWARD chain policy should be ACCEPT, or rules should be added to prevent packets from hitting the default policy. As a note, Ubuntu 16.04 defaults the FORWARD chain to DROP. I assume --masquerade-all is why the packets are going through the FORWARD chain in the first place, but even without --masquerade-all, the dashboard still cannot access the kube-apiserver.

One option that I suggested in #39658 is to have kube-proxy add an iptables FORWARD rule to ACCEPT packets that have been marked by KUBE-MARK-MASQ.
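
A minimal sketch of that rule, assuming kube-proxy's default masquerade mark of 0x4000/0x4000 (i.e. --iptables-masquerade-bit=14); the mark value is configurable, so treat it as a placeholder:

```bash
# Accept forwarded packets that kube-proxy has already tagged via KUBE-MARK-MASQ.
# 0x4000/0x4000 is the default mark; adjust if --iptables-masquerade-bit is overridden.
iptables -A FORWARD -m mark --mark 0x4000/0x4000 \
  -m comment --comment "accept packets marked by KUBE-MARK-MASQ" -j ACCEPT
```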

How to reproduce it (as minimally and precisely as possible):

Create a cluster with:

  • Weave or Flannel CNI enabled
  • iptables -P FORWARD DROP on all nodes
  • kube-proxy set to --masquerade-all
  • kubernetes-dashboard enabled. kubernetes-dashboard will crash often.

Anything else we need to know:

I closed #39658 because I thought the issue was specific to Canal and opened projectcalico/canal#31 instead. However, this looks to be a larger issue that Kubernetes itself should address.

@bboreham
Contributor

Weave issue: weaveworks/weave#2758

@jbeda
Contributor

jbeda commented Feb 2, 2017

I think this is going to hit a lot of people as Docker 1.13 is now setting the FORWARD chain to drop by default if ip_forward is set. See moby/moby#28257
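
On an affected node, the policy change is easy to confirm, and the blanket workaround discussed later in this thread can be applied manually as an interim measure (it may be reverted when the Docker daemon restarts):

```bash
# Show the filter FORWARD chain policy; Docker 1.13+ sets it to DROP.
iptables -S FORWARD | head -n 1   # "-P FORWARD DROP" on affected nodes
# Blunt interim workaround, not a proper fix:
iptables -P FORWARD ACCEPT
```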

@jbeda
Contributor

jbeda commented Feb 2, 2017

See also #40182

@dcbw added the area/kube-proxy and sig/network labels Mar 20, 2017
@thockin added and removed the sig/network label May 16, 2017
@thockin
Member

thockin commented May 31, 2017

overlap with #35069 ?

@tmjd
Contributor

tmjd commented Sep 8, 2017

I think a solution for this issue or #39658 (which is closer to what I feel a proper solution would be) should be considered for inclusion in K8s 1.8, since it will officially support Docker 1.13.

The problem as I see it is that kube-proxy sets up rules to forward traffic for services (specifically, I have seen the problem with NodePorts), yet it does not ensure, or to my knowledge even require, that the forwarded traffic is allowed when it is forwarded to a host other than the one that received it. I observed this when using K8s v1.7.5 installed with kubeadm v1.7.5 and Calico v2.5.

It can be remedied by adding iptables -P FORWARD ACCEPT, but hopefully there is a better answer than doing that manually, and one more specific to what is actually needed.

I saw the comment on the Docker 1.13 validation issue, #42926 (comment), linking to the configuration script used in the GCE Docker images, which is why the e2e tests are not hitting this.

@thockin
Member

thockin commented Sep 8, 2017 via email

@tmjd
Contributor

tmjd commented Sep 13, 2017

I've been doing some testing with the first two options and chatted with @fasaxc and @caseydavenport about these potential options for adding some forwarding rules. They are all certainly a lot more specific than the previous blanket allow-all.

  1. Forward any traffic sourced from or destined for the clusterCIDR:
     iptables -A FORWARD -s <clusterCIDR> -j ACCEPT
     iptables -A FORWARD -d <clusterCIDR> -j ACCEPT
  2. Forward traffic that has the KUBE-MARK-MASQ mark set, plus conntrack rules to accept the follow-on traffic (see the concrete sketch below):
     iptables -A FORWARD -m mark --mark <KUBE-MARK-MASQ> -j ACCEPT
     iptables -A FORWARD -s <clusterCIDR> -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
     iptables -A FORWARD -d <clusterCIDR> -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
  3. Add a more specific mark that is only set on traffic known to need forwarding, and match on that (this option would probably also need the conntrack rules from option 2).
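
A concrete rendering of option 2, purely illustrative: the cluster CIDR (10.244.0.0/16) and the mark (0x4000/0x4000, kube-proxy's default masquerade mark) are placeholders for whatever a given cluster actually uses:

```bash
# Option 2 with placeholder values substituted for <clusterCIDR> and <KUBE-MARK-MASQ>.
iptables -A FORWARD -m mark --mark 0x4000/0x4000 -j ACCEPT
iptables -A FORWARD -s 10.244.0.0/16 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -d 10.244.0.0/16 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
```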

@thockin, do any of those options appeal to you or do you see any problems with them?

@fasaxc
Contributor

fasaxc commented Sep 14, 2017

@tmjd If we add some more rules, we shouldn't put them in the top-level chain since it makes them hard to manage/clean up. Best to add them to a chain that kube-proxy owns that is jumped to from the top-level chain.
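
The pattern described here would look roughly like the following; the chain name KUBE-FORWARD is an assumption in this sketch (it matches the name the later kube-proxy change ended up using):

```bash
# Create a kube-proxy-owned chain and jump to it from the top-level FORWARD chain,
# so the rules can be flushed and rebuilt without touching anyone else's FORWARD rules.
iptables -N KUBE-FORWARD
iptables -A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
# kube-proxy's forwarding rules then live inside KUBE-FORWARD, for example:
iptables -A KUBE-FORWARD -m mark --mark 0x4000/0x4000 -j ACCEPT
```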

@bboreham
Contributor

@tmjd kube-proxy doesn't necessarily know the ClusterCIDR. I could see your proposal making sense in conjunction with #46508.

@tmjd
Contributor

tmjd commented Sep 14, 2017

Very good point @bboreham, I should have mentioned that the options I presented assume clusterCIDR is set. If clusterCIDR is not defined, then the first option is not possible (short of the wide-open rule these options are trying to avoid). I guess for options 2 & 3, if it is not set, a conntrack rule could be added without the clusterCIDR match. I would also appreciate feedback on the desired behavior when clusterCIDR is not set.

k8s-github-robot pushed a commit that referenced this issue Nov 14, 2017
Automatic merge from submit-queue (batch tested with PRs 55009, 55532, 55601, 52569, 55533). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Kube-proxy adds forward rules to ensure NodePorts work

**What this PR does / why we need it**:
Updates kube-proxy to set up proper forwarding so that NodePorts work with docker 1.13 without depending on iptables FORWARD being changed manually/externally.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #39823

**Special notes for your reviewer**:
@thockin I used option number 2 that I mentioned in the #39823 issue, please let me know what you think about this change.  If you are happy with the change then I can try to add tests but may need a little direction about what and where to add them.

**Release note**:

```release-note
Add iptables rules to allow Pod traffic even when default iptables policy is to reject.
```
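
For anyone verifying the change on a node after upgrading, the new rules can be inspected directly; the exact contents depend on --cluster-cidr, and the KUBE-FORWARD chain name is taken from the discussion above:

```bash
# Confirm the top-level FORWARD chain jumps into kube-proxy's dedicated chain,
# then list the rules kube-proxy programs there.
iptables -t filter -S FORWARD | grep -i kube
iptables -t filter -S KUBE-FORWARD
```
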
k8s-github-robot pushed a commit that referenced this issue Dec 2, 2017
…9-upstream-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #52569 upstream release 1.8

```release-note
Kube-proxy adds forward rules to ensure NodePorts work
```

Backport of #52569 @tmjd 

xref #39823
@mxey
Contributor

mxey commented Jan 25, 2018

The PR that was merged solved the issue for pods inside clusterCIDR. But it doesn't handle hostNetwork pods that use the node IP, like when you want to expose the API server as a load-balanced service externally.

I'm not sure why the “kubernetes forwarding conntrack pod source rule” has to be limited to the pod CIDR, since --ctstate RELATED,ESTABLISHED should already ensure it does not allow traffic that wasn't already allowed by another rule.

@euank
Contributor

euank commented Jan 26, 2018

I'd argue that any iptables/firewalling for a pod using hostNetwork isn't the responsibility of Kubernetes, @mxey.

There's no reasonable way for k8s to actually know what interfaces a host-network container binds to, what ports it listens on, or what expectations it has. The cluster operator should probably be responsible for that networking case.

@mxey
Contributor

mxey commented Jan 26, 2018

I meant explicitly regarding services where those pods are the endpoints of a service, so Kubernetes does know the address and the port.

@skbly7

skbly7 commented Jan 17, 2019

Due to the limitation of the RELATED,ESTABLISHED state match in the FORWARD rule (i.e. the current implementation), apps relying on unidirectional UDP traffic won't work, because packets (except the first) aren't ESTABLISHED from iptables's perspective (no reply from the endpoint, no handshake).

Shouldn't this be something like -m state ! --state NEW -j ACCEPT rather than -m state --state RELATED,ESTABLISHED -j ACCEPT?

@calder

calder commented Jan 20, 2019

To add to @skbly7's comment, bidirectional UDP apps are brittle as well. Dropping or failing to reply to the first packet will prevent the app from ever receiving a second.

More discussion on projectcalico/calico#2113.
