
Node deletion does not clear up the IPs #3372

Closed
alok87 opened this issue Aug 8, 2018 · 19 comments

@alok87
Contributor

alok87 commented Aug 8, 2018

When a node is deleted/terminated by the autoscaler, or terminated manually, its pod IPs should be cleared up. But that is not happening.

Related - #2797 (comment)

Versions:

  • kubernetes version: 1.9.8
  • provisioned using kops: 1.9.2
  • weave version which kops provisioned: weaveworks/weave-kube:2.3.0
  • Node AMI is the default Debian AMI kops provides (1.9 image): kope.io/k8s-1.9-debian-jessie-amd64-hvm-ebs-2018-03-11
  • Kernel version:
admin@ip-10-0-21-54:~$ uname -a
Linux ip-10-0-21-54 4.4.78-k8s #1 SMP Fri Jul 28 01:28:39 UTC 2017 x86_64 GNU/Linux
  • Kubectl version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.8", GitCommit:"c138b85178156011dc934c2c9f4837476876fb07", GitTreeState:"clean", BuildDate:"2018-05-21T19:01:12Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.6", GitCommit:"4bc5e7f9a6c25dc4c03d4d656f2cefd21540e28c", GitTreeState:"clean", BuildDate:"2017-09-14T06:36:08Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

What did you expect to happen?

I expected the dead nodes' IPs to get cleared:

$ kubectl get nodes
NAME                                             STATUS    ROLES     AGE       VERSION
ip-10-0-20-119.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-20-155.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-20-172.ap-southeast-1.compute.internal   Ready     master    1d        v1.9.8
ip-10-0-20-186.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-20-203.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-20-207.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-20-67.ap-southeast-1.compute.internal    Ready     node      1d        v1.9.8
ip-10-0-21-119.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-21-120.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-21-142.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-21-165.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-21-242.ap-southeast-1.compute.internal   Ready     master    1d        v1.9.8
ip-10-0-21-252.ap-southeast-1.compute.internal   Ready     node      1d        v1.9.8
ip-10-0-21-59.ap-southeast-1.compute.internal    Ready     node      1d        v1.9.8
ip-10-0-40-135.ap-southeast-1.compute.internal   Ready     master    1d        v1.9.8
admin@ip-10-0-21-252:~$ curl -s 'http://127.0.0.1:6784/status/ipam' | grep 'unreachable\!$'
12:26:5c:08:f3:b4(ip-10-0-21-72.ap-southeast-1.compute.internal)   131072 IPs (06.2% of total) - unreachable!

I expected the IPs to get cleared for 10.0.21.72, which got deleted (scaled down and terminated).
We have to manually clean the node's IPs using curl -H "Accept: application/json" -X DELETE 'http://localhost:6784/peer/ip-10-0-21-72.ap-southeast-1.compute.internal'

What happened?

The nodes were deleted, but their IPs still showed up as not cleared.

How to reproduce it?

  1. Delete a node.
  2. Check whether its IPs get cleared.

Anything else we need to know?

CloudProvider: aws

@bboreham I would like to contribute to fixing the problem here. Let me know if I can take this up.

@redi-vinogradov

I'm also seeing this issue with weave 2.4.0, kops 1.9.1, k8s 1.9.9

@murali-reddy murali-reddy added this to the 2.5 milestone Aug 9, 2018
@murali-reddy
Contributor

@alok87 If you have a fix in mind, please go ahead and raise a PR. Otherwise I am happy to pick this up and get it fixed for the next release.

@alok87
Contributor Author

alok87 commented Aug 9, 2018

@murali-reddy Cool, I can take this up in a few days.
One question - what impact do these unreachable IPs have on requests being routed in a cluster with this issue, where in total 86% of the nodes' IPs have become unreachable?

@murali-reddy
Contributor

murali-reddy commented Aug 9, 2018

@alok87 I don't think there is any impact.

They are not exactly unreachable. Once a node goes down, depending on the nature of the deployment, pods will get rescheduled to other nodes. Applications will continue to work.

Once a new node joins, there will be a readjustment (the unused/unreachable IP address range is reclaimed) so that 100% of the subnet is usable for pods.

For example, with a kops-provisioned cluster whose auto-scale group is set to a minimum of 3 instances, this is the ipam status I see once I delete a node, and then after the node is re-provisioned:

admin@ip-172-20-51-168:~$ curl http://127.0.0.1:6784/status/ipam
9e:b9:85:2c:70:1b(ip-172-20-51-168.us-west-2.compute.internal)   524288 IPs (25.0% of total) (5 active)
b6:d8:57:4e:85:3d(ip-172-20-57-116.us-west-2.compute.internal)   786432 IPs (37.5% of total)
d2:3c:82:07:18:da(ip-172-20-81-221.us-west-2.compute.internal)   524288 IPs (25.0% of total) - unreachable!
da:97:1b:c4:96:6b(ip-172-20-43-182.us-west-2.compute.internal)   262144 IPs (12.5% of total)
admin@ip-172-20-51-168:~$ curl http://127.0.0.1:6784/status/ipam
9e:b9:85:2c:70:1b(ip-172-20-51-168.us-west-2.compute.internal)   524288 IPs (25.0% of total) (5 active)
b6:d8:57:4e:85:3d(ip-172-20-57-116.us-west-2.compute.internal)   786432 IPs (37.5% of total)
52:97:88:4a:50:36(ip-172-20-65-39.us-west-2.compute.internal)   524288 IPs (25.0% of total)
da:97:1b:c4:96:6b(ip-172-20-43-182.us-west-2.compute.internal)   262144 IPs (12.5% of total)

Do you see anything problematic?

@redi-vinogradov

From our experience, there is an impact. In the case where pods have already been allocated IPs from some subnet block and the node responsible for that subnet then goes down, those pods are not able to communicate with others anymore.
For now, we are using a shell script which just deletes the unreachable nodes, so that their subnet range gets associated with the node which executed the delete command.
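
For reference, a minimal sketch of such a cleanup script, assuming the Weave router's HTTP API is listening on 127.0.0.1:6784 as in the outputs above; peer names are parsed from the ipam status output, so adjust to your install:

#!/bin/sh
# Remove every peer the local Weave router reports as unreachable.
# Assumes the router's HTTP API is on 127.0.0.1:6784 (the default used elsewhere in this thread).
for peer in $(curl -s http://127.0.0.1:6784/status/ipam | awk '/unreachable!$/ {print $1}' | sed 's/.*(\(.*\))/\1/'); do
  echo "removing dead peer: $peer"
  curl -s -H "Accept: application/json" -X DELETE "http://127.0.0.1:6784/peer/$peer"
done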

@bboreham
Contributor

bboreham commented Aug 9, 2018

@redi-vinogradov this should not happen. Weave Net implements communication at Layer 2 - MAC to MAC, not via subnets.

Please open another issue giving details of your install and log files from the time you experienced a communication problem.

@alok87
Contributor Author

alok87 commented Aug 9, 2018

@murali-reddy What about this #2797

In a situation such as a regularly expanding and contracting auto-scale group, the IPAM ring will eventually become clogged with peers that have gone away.
  1. Won't this be a problem for the routing of requests?
  2. I have never really understood what the role of Weave is in request routing. Does it come into the path of the request when requests are routed using iptables?

@bboreham
Contributor

If the clean-up does run when a new node starts, this is the same as #3171.
If it doesn't, please post the logs of the weave container on that new node.

@alok87
Contributor Author

alok87 commented Aug 12, 2018

@bboreham Sure, I will get back with more info if I find a node provisioned with the same IP whose IPs are not reclaimed.

But isn't it a problem to have such nodes lying in the IPAM ring? Isn't it a problem if other peers cannot connect to so many dead nodes in their peer-to-peer connection list?

/home/weave # ./weave --local status ipam | grep unrea | grep 21.5
9a:0d:e4:d6:dd:a7(ip-10-0-21-54.ap-southeast-1.compute.internal)    49152 IPs (02.3% of total) - unreachable!

/home/weave # ./weave --local status connections | grep failed | grep 21.5
-> 10.0.21.167:6783      failed      dial tcp4 :0->10.0.21.167:6783: getsockopt: connection timed out, retry: 2018-08-12 19:21:58.008282969 +0000 UTC

No solid evidence yet, but what we have observed is that when we cross 50 nodes and there are many unreachable nodes, the overhead of the Kubernetes network tends to increase by around 20ms.
It reduces when we clear this pool of IPs from the dead nodes, and also when the number of nodes goes down.

I will come back with real evidence for this observation.
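
For anyone trying to reproduce the same observation, a quick (hedged) way to count how many peers the local router currently considers unreachable, using the same ipam endpoint shown above:

# Count the peers the local router reports as unreachable
curl -s http://127.0.0.1:6784/status/ipam | grep -c 'unreachable!$'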

@bboreham
Contributor

isn't it a problem to have such nodes lying in the IPAM ring

No.

Isn't it a problem if other peers cannot connect to so many dead nodes in their peer-to-peer connection list?

You showed one node, not "so many".
Not being able to connect is expected, in a distributed system. Weave Net can cope, so long as it doesn't run out of free IP addresses entirely.

the overhead of the Kubernetes network tends to increase by around 20ms

I know of no mechanism to connect unreachable nodes to packet latency.

I'll close this for now; please re-open or open a new issue when you have evidence of a problem.

@alok87
Contributor Author

alok87 commented Aug 21, 2018

The root cause of our latency problem was that many nodes switched back to sleeve mode and never returned to fastdp - #1737
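
In case anyone else wants to check for the same thing, connections that have fallen back to sleeve show up in the status output used earlier in this thread (run inside the weave container):

# List connections that are using sleeve instead of fastdp
./weave --local status connections | grep sleeve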

As @bboreham said, node deletion does not clear IPs but reclaims them when a new node comes with same IP, and it has no impact on performance.

Thank you all

@bboreham
Contributor

Re-opening because we don't actually seem to have an issue that duplicates this (#3171 is a PR)

@bboreham
Contributor

bboreham commented Aug 28, 2018

reclaims them when a new node comes with same IP

Sorry, I don't know where you got "with same IP" from; it doesn't form part of the story that I recognize. Reclaim happens when any Weave Net pod starts.
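
A hedged way to see this in practice: restart one Weave Net pod and re-check ipam once it is running again. The label selector name=weave-net below is taken from the standard weave-kube DaemonSet and may differ in other installs.

# Restart one Weave Net pod; the DaemonSet recreates it and reclaim runs when the new pod starts
kubectl -n kube-system get pods -l name=weave-net -o wide
kubectl -n kube-system delete pod <one-of-the-weave-net-pods>
# Once the replacement pod is Running, previously unreachable ranges should no longer appear:
curl -s http://127.0.0.1:6784/status/ipam | grep 'unreachable!$'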

@alok87
Contributor Author

alok87 commented Aug 29, 2018

@bboreham OK - that is what I meant.

@bboreham
Contributor

#3386 is a specific case which matches the title of this issue.

@bboreham bboreham added the bug label Sep 11, 2018
@alok87
Contributor Author

alok87 commented Oct 19, 2018

We were facing 4-5 hours of increased latency in the Kubernetes network and spent a couple of hours fighting it.

  • We upgraded weave to 2.4.1 from 2.3.0 - did not help.
  • We moved our pods to different nodes/new nodes - did not help.
  • Could not find any weave pod using a sleeve connection - all were fastdp.
  • We moved the pod to a different cluster - request queuing dropped (confirming the problem was in the current cluster).
  • We moved the pod back to the current cluster and removed the unreachable dead nodes by using curl -H "Accept: application/json" -X DELETE 'http://localhost:6784/peer/<IP>' - Request queuing dropped in the current cluster

It looks like deleted nodes, if not removed from Weave, result in network latency. Not really sure whether it should or shouldn't, but removing them did work for us.

@murali-reddy
Contributor

It looks like deleted nodes, if not removed from Weave, result in network latency

How did you come to this conclusion? Is this something easily reproducible, and at what scale?

In general, dealing with deleted nodes is a control-plane aspect of Weave; I don't see any reason why it should have any impact on the data plane.

@alok87
Contributor Author

alok87 commented Oct 22, 2018

@murali-reddy We have faced this issue of request queuing increasing multiple times. As soon as we removed the dead nodes from the Weave network, latency dropped to the old values.

@bboreham
Contributor

bboreham commented Nov 1, 2018

Fixed by #3399

(the originally described issue is fixed, not any other symptoms mentioned in comments)
