Rolling update puts nodes into "not ready" #4946
In our case manually restarting the kubelet helped. Do you have the logs of an affected node?
Not from the current test runs (the cluster has been deleted and recreated a couple of times), but I can recreate this, keep the logs, and attach them to this issue. Nevertheless, I would be interested in how to prevent this from happening. We need to make some changes to a production cluster, where restarting a kubelet seems rather inappropriate.
Understandable. We also started seeing this after upgrading to Kubernetes 1.8 (1.8.10 currently), and I'm currently debugging what could cause it. It looks like in our case the kubelet tries to connect to an old API server IP, so either it is somehow caching the DNS resolution for too long (the TTL should only allow 60s) or the record wasn't updated correctly.
Thanks for pointing this out. I will check whether this is the same for us first thing in the morning. My TZ is CEST.
Thanks for reporting & sorry about the problem. Was this with a gossip DNS (.k8s.local) or a "real" Route53 DNS name?
In our case a "real" Route53 record. Also, restarting the kubelet almost immediately fixed the issue, while just waiting took up to 15 minutes for the node to be marked as Ready again. We are running kops 1.9.0-beta.2.
Real Route53 DNS name.
@johanneswuerbach it seems to be the same for us: the kubelet trying to connect to an old API server IP. Trying to verify this now.
Could you check whether the internal master DNS contains the IPs of the new masters or is still returning an old one?
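One way to check is to resolve the internal API name and compare the result against the current master IPs. A minimal sketch in Python, assuming a hypothetical cluster domain and master IPs (replace both with your own values):

```python
# Sketch: compare what the resolver currently returns for the internal API name
# against the IPs of the new masters. Hostname and IPs are hypothetical examples.
import socket

API_HOST = "api.internal.cluster.example.com"              # hypothetical internal API name
NEW_MASTER_IPS = {"10.0.1.10", "10.0.2.10", "10.0.3.10"}   # hypothetical new master IPs

def resolve_a_records(host: str) -> set:
    """Return the IPv4 addresses the system resolver currently returns for `host`."""
    infos = socket.getaddrinfo(host, 443, family=socket.AF_INET, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}

resolved = resolve_a_records(API_HOST)
stale = resolved - NEW_MASTER_IPS
print("resolved:", sorted(resolved))
if stale:
    print("record still contains IPs that are not new masters:", sorted(stale))
else:
    print("record only contains the new master IPs")
```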
We also hit this again on another node:
The IP is the IP of the node itself.
That is exactly the error message I see. It starts to appear when the old IP address is removed from the A record for api.internal.xxx and the new IP address of the new master is added, sometimes after the first master, sometimes after the second.
It's probably due to kubernetes/kubernetes#41916 (comment), where the kubelet caches the IP of the old master nodes. That's why a restart fixes it.
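For illustration, this is roughly what that looks like from a client's point of view; a minimal sketch, not kubelet code, and the hostname is a hypothetical placeholder. A long-lived client resolves the name once when it connects and then keeps reusing that TCP connection, so it never notices that the A record has moved to the new masters:

```python
# Sketch: a long-lived HTTPS client pins the IP it resolved at connect time.
# Until the connection is torn down (e.g. by restarting the process), it keeps
# talking to that IP even after the DNS record points at new masters.
# The hostname below is a hypothetical placeholder.
import http.client
import socket

API_HOST = "api.internal.cluster.example.com"  # hypothetical

# Resolution happens once, when the connection is established.
conn = http.client.HTTPSConnection(API_HOST, 443, timeout=5)
conn.connect()
pinned_ip = conn.sock.getpeername()[0]

# ... rolling update replaces a master, the Route53 record is updated ...

# A fresh resolution now returns the new master IPs,
fresh_ips = {info[4][0] for info in socket.getaddrinfo(API_HOST, 443, proto=socket.IPPROTO_TCP)}
# but the existing connection is still pinned to the IP resolved earlier:
print("connection still points at", pinned_ip, "while DNS now returns", fresh_ips)
```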
I had the same issue today with 1.9.0-beta-2.
I think the best practice is to set up an internal ELB that fronts the masters and have the API URL point to that, the same way it's done for the external API. Is that possible with kops right now?
I don't think so. The load balancer type for the API in the spec refers to the client (kubectl) AFAIK; at least in a quick test I still got DNS round-robin entries for the API that the kubelet used. I also think an ELB for the kubelet to connect to would be the "right" way to go. At least that is how kubeadm does HA nowadays (even if still manually). The ELB would detect that a master is gone through the health check, break the connection and force the kubelet to reconnect, wouldn't it? What would be needed? How much work would it be? Are there any pointers to start? I wouldn't mind giving it a try, but I would need instructions.
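To illustrate the reconnect behaviour such an LB would force (a minimal sketch with a hypothetical endpoint, not kops or kubelet code): clients keep talking to one stable DNS name; when the backend master behind it fails its health check, the LB drops the connection and the client simply reconnects to the same name and lands on a healthy master.

```python
# Sketch: a client polling a single stable LB endpoint. When the LB resets the
# connection (because the backend master became unhealthy), the client just
# reconnects to the same name and is routed to a healthy master.
# The hostname is a hypothetical placeholder.
import http.client
import time

LB_ENDPOINT = "api.internal.cluster.example.com"  # hypothetical LB DNS name

def watch_api_health() -> None:
    while True:
        conn = http.client.HTTPSConnection(LB_ENDPOINT, 443, timeout=10)
        try:
            while True:
                conn.request("GET", "/healthz")
                resp = conn.getresponse()
                print("healthz:", resp.status)
                resp.read()  # drain the body so the connection can be reused
                time.sleep(10)
        except (http.client.HTTPException, OSError):
            # The LB (or network) broke the connection -- reconnect and carry on.
            conn.close()
            time.sleep(1)

watch_api_health()
```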
Just did a rolling update from 1.9.0-beta-2 to 1.9.0 and hit the same issue: all of my nodes go from Ready to not ready. @chrislovecnm @justinsb have you tried a master rolling update with 1.9.0 on AWS by chance? The first time I ever noticed this issue was with 1.9.0-beta-2, and all nodes going into Not Ready takes down every service in the cluster.
I can confirm it lasts for 15 minutes or until the kubelet is restarted.
15m is what is expected for the issue with kubelet caching IPs: kubernetes/kubernetes#41916 (comment)
Just realised that the same problem affects the kube-proxy, btw.
Our current hypothesis for a workaround is to create new temporary nodes and lock them to only one of the masters by overriding the DNS name of the API server in /etc/hosts, then migrate all the pods to these new temp nodes by draining the old nodes. This frees up two of the masters for a rolling update without causing interruptions due to the old nodes becoming "not ready". Once those two masters are done, the old nodes can be locked to the two new masters and the pods moved back to them, freeing up master 3 for a rolling update. Finally the temp nodes can be deleted. Cumbersome and ugly... but it works. Nevertheless, we should consider adding the LB for the node-to-master communication, as it is also the recommended way of doing HA with kubeadm nowadays, for example.
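The /etc/hosts pinning step of that workaround, as a minimal sketch (the hostname and master IP are hypothetical placeholders; run as root on the node, and remove the entry again once the rolling update is finished):

```python
# Sketch: pin the internal API DNS name to a single master by appending an
# /etc/hosts entry on a node. Hostname and IP are hypothetical placeholders.
API_HOST = "api.internal.cluster.example.com"   # hypothetical internal API name
PINNED_MASTER_IP = "10.0.1.10"                  # hypothetical IP of the master to pin to

def pin_api_host(hosts_file: str = "/etc/hosts") -> None:
    """Append a pin entry unless the host is already present in the file."""
    with open(hosts_file, "r+", encoding="utf-8") as f:
        if any(API_HOST in line for line in f):
            return  # already pinned
        f.write(f"\n{PINNED_MASTER_IP} {API_HOST}\n")

if __name__ == "__main__":
    pin_api_host()
```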
Ran into this today with a beta environment deploy -- thankfully nothing broke in our production env, but certainly not a good sign... Any detailed fixes for this?
Just to add another occasion where this can happen: when "updating" from kops 1.8 to kops 1.9 and performing the required rolling update. Since all masters are restarted/recreated first, the nodes can become not ready if kubelet/kube-proxy was talking to the corresponding, restarting master.
We are also being hit by this and it's causing our own APIs to have downtime when the masters come back up from a termination. I do think that putting an ELB on the internal API endpoint would help in this case as well.
Looks like a fix has been merged and a PR is open to backport it to 1.9.
Approved and cherry-picked as well.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@fejta-bot: Closing this issue.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Kops version 1.8.0
Kubernetes version 1.8.6
AWS (3 masters and 3 nodes)
kops edit followed by kops update and kops rolling-update. kops edit to add configuration flags for the apiserver (dex related). Also tried kops rolling-update --instance-group <master...> to only update one master at a time.
Nodes become "not ready" in an unpredictable way. Sometimes no node is affected. Sometimes one node becomes "not ready" and recovers after a few minutes. Sometimes all nodes are "not ready" for a longer period, up to 15 minutes, while the masters report ready. During this time the workload on the cluster is not accessible.
Expected: nothing unusual, just a non-breaking rolling update that does not affect the nodes or the workload.
Starting config: https://gist.github.com/recollir/9e9b4b0b426ef77014083f1839c123d6
Added via kops edit before the rolling-update: https://gist.github.com/recollir/da9fd8a123b58f555f2e4321093e9d46
https://gist.github.com/recollir/5b19d543adaa50b1889aabafeb77b847
A couple of times I observed that after the rolling update the ELB for the API server was missing attached AZs.