
Requests time out when externalTrafficPolicy is set to Local #2582

Closed
helloagain-dev opened this issue May 30, 2018 · 2 comments

Comments

@helloagain-dev

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.):
I am not sure whether this is a bug or whether it is configured incorrectly. I have already asked on the #kubernetes-users Slack channel three times, but nobody responded.

What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.):
externalTrafficPolicy local timeout


Is this a BUG REPORT or FEATURE REQUEST? (choose one): Bug Report

NGINX Ingress controller version: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.15.0

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2", GitCommit:"bdaeafa71f6c7c04636251031f93464384d54963", GitTreeState:"clean", BuildDate:"2017-10-24T19:48:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9+", GitVersion:"v1.9.7-gke.0", GitCommit:"1883ce4eb0e057cfc2439ebeb9822da0a9d40405", GitTreeState:"clean", BuildDate:"2018-04-19T17:08:34Z", GoVersion:"go1.9.3b4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: Google Compute Engine (GCE)
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

What happened:
We have a GCE setup with 1 node and a load balancer, as well as several services behind it.
This works fine. We now need to track the client's IP address, so according to this I changed the externalTrafficPolicy from Cluster to Local.
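For reference, the change itself can be applied with a one-line patch (service name and namespace as in the config below):

kubectl patch service nginx -n nginx-ingress \
  -p '{"spec":{"externalTrafficPolicy":"Local"}}'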

The IP address is actually passed correctly, but around 1/3 of the requests are not fulfilled; they simply time out.
If I scale the nginx replicas to 3 it happens a lot less, but still occasionally.

This is the config of my load balancer (external IP changed):

{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "nginx",
    "namespace": "nginx-ingress",
    "selfLink": "/api/v1/namespaces/nginx-ingress/services/nginx",
    "uid": "414a1a82-fde7-11e6-a779-42010af0015f",
    "resourceVersion": "65561390",
    "creationTimestamp": "2017-02-28T18:53:55Z",
    "annotations": {
      "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"name\":\"nginx\",\"namespace\":\"nginx-ingress\"},\"spec\":{\"externalTrafficPolicy\":\"Cluster\",\"ports\":[{\"name\":\"http\",\"port\":80},{\"name\":\"https\",\"port\":443}],\"selector\":{\"app\":\"nginx\"},\"type\":\"LoadBalancer\"}}\n"
    }
  },
  "spec": {
    "ports": [
      {
        "name": "http",
        "protocol": "TCP",
        "port": 80,
        "targetPort": 80,
        "nodePort": 32387
      },
      {
        "name": "https",
        "protocol": "TCP",
        "port": 443,
        "targetPort": 443,
        "nodePort": 30003
      }
    ],
    "selector": {
      "app": "nginx"
    },
    "clusterIP": "10.95.242.28",
    "type": "LoadBalancer",
    "sessionAffinity": "None",
    "externalTrafficPolicy": "Local",
    "healthCheckNodePort": 32648
  },
  "status": {
    "loadBalancer": {
      "ingress": [
        {
          "ip": "22.17.59.39"
        }
      ]
    }
  }
}

What you expected to happen:
The IP address is passed to the backend as a header and everything runs smoothly.
No requests time out.

How to reproduce it (as minimally and precisely as possible):
Set up the following (a minimal manifest sketch for the load-balancer Service follows the list):

  • LoadBalancer with externalTrafficPolicy set to Local
  • Ingress with HTTPS setup
  • Service
  • Pod running an HTTP service
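For the load-balancer piece, a minimal sketch reconstructed from the last-applied-configuration annotation above (the Ingress, backend Service, and Pod are left out):

apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: nginx-ingress
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: nginx
  ports:
    - name: http
      port: 80
    - name: https
      port: 443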

Anything else we need to know:
I did a tcpdump and found that some packets are retransmitted when this happens.
If it helps, I can upload the tcpdump.
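A capture along these lines should show the retransmissions (run on the node itself; the node ports are taken from the service config above):

tcpdump -i any -nn -w nodeport.pcap 'tcp port 32387 or tcp port 30003'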

@helloagain-dev
Author

I found out that I was accidentally running a second node pool with 1 node. When I deleted it, everything started to work normally again.
Why is this the case?
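A plausible explanation, sketched with the values from the config above: with externalTrafficPolicy set to Local, kube-proxy only accepts traffic on nodes that actually run an nginx pod, and the cloud load balancer discovers those nodes by probing spec.healthCheckNodePort (32648 here). A node in the second pool with no ingress pod answers 503 on that port, and any request the load balancer still routes to it times out. This can be checked per node from inside the network:

curl -s http://NODE_INTERNAL_IP:32648/healthz
# a node without a local nginx pod should report "localEndpoints": 0 with HTTP 503
# (NODE_INTERNAL_IP is a placeholder for each node's internal address)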

@aledbf
Member

aledbf commented May 30, 2018

@helloagain-dev please check the cloud controller manager logs. The ingress controller itself does not create or change cloud resources. That is done by Kubernetes.
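On GKE the controller manager runs on the managed master, so one indirect way to see what the service controller did is to read the Service's events, e.g.:

kubectl describe service nginx -n nginx-ingress
# the Events section lists load-balancer sync activity such as EnsuringLoadBalancer / UpdatedLoadBalancer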
