This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

watch for Kubernetes node delete events and reclaim removed peers IP space on delete event #3399

Merged
merged 8 commits into from
Nov 1, 2018

Conversation

murali-reddy
Contributor

Changes to invoke reclaimRemovedPeers in kube-utils on Kubernetes node delete event

@bboreham
Contributor

One thought: triggering a reclaim on all peers immediately will give a "thundering herd" effect with lots of contention over the ConfigMap we use as a lock.

Maybe have each peer sleep a random number of seconds before beginning, to avoid this?

@murali-reddy
Contributor Author

Maybe have each peer sleep a random number of seconds before beginning, to avoid this?

Agree. I will add random delay.

@murali-reddy
Contributor Author

Following up on the comments from the slack conversation
https://weave-community.slack.com/archives/C9N5HME4B/p1536572237000100

It proposes to run another process in the background to listen to Kubernetes events, and I wondered if it would be better to put it in a sidecar instead (i.e. a third container in the DaemonSet). Originally I was worried about defunct processes, but now I think that's ok because the shell will be the parent (in #3399).
Another downside of just firing off the process in the background is that if it hits a problem and dies, this might not be noticed.

I'm in favor of moving the functionality into weaver. I'm afraid that over time weave-kube/launch.sh will become a new weave (shell script).

@brb Are you suggesting moving kube-utils functionality into weaver, or the complete Kubernetes bits (launch.sh + weave-kube)? That would be significant work. The changes in this PR are very contained, and I was wondering if we should go with the sidecar approach for now, for immediate functional benefit?

@brb
Contributor

brb commented Sep 12, 2018

@murali-reddy

Are you suggesting moving kube-utils functionality into weaver?

Just the reclaim parts from kube-utils. It shouldn't require a lot of effort: move it to a separate pkg and start the reclaiming process in a goroutine.
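Started from inside weaver, the suggestion amounts to something like the sketch below: no extra process or sidecar, just a goroutine in the existing process. The `watchAndReclaim` function is a hypothetical stand-in for the reclaim logic that currently lives in kube-utils.

```go
package main

import (
	"fmt"
	"time"
)

// watchAndReclaim stands in for the reclaim logic from kube-utils;
// here it just signals on done once it is up (hypothetical).
func watchAndReclaim(done chan<- string) {
	// ... set up the node informer, handle delete events, reclaim IP space ...
	done <- "reclaimer running"
}

func main() {
	done := make(chan string, 1)
	// The point of the suggestion: run the reclaimer as a goroutine
	// inside the existing weaver process.
	go watchAndReclaim(done)
	select {
	case msg := <-done:
		fmt.Println(msg)
	case <-time.After(time.Second):
		fmt.Println("timed out")
	}
}
```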

@bboreham
Contributor

Chatting with Murali IRL yesterday, I felt the sidecar idea is bad because we will have to ask users to fetch two logs for troubleshooting.
I dislike the idea of weaver getting 35MB larger, too.

So, right now, I'm ok with running it as a background process from the launch.sh shell.

@brb
Contributor

brb commented Sep 12, 2018

If so, then we need to ask k8s to pass --init to the weave Docker container (not sure whether it's possible), as the shell script is not a perfect init system.

@bboreham
Contributor

the shell script is not a perfect init system

Can you expand? We already run stuff in the background in that shell script, and I hadn't noticed an issue.

@brb
Contributor

brb commented Sep 12, 2018

The "A simple init system" section in https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/ describes it well.

@bboreham
Contributor

That identifies a problem where the parent dies and leaves the child. Since in our case the parent is the main process of the container, if it dies the entire container will go away.

Or did I miss something?

@brb
Contributor

brb commented Sep 12, 2018

If the shell process (parent) is terminated, the child will receive SIGKILL which is unhandled (no chance to do a proper cleanup).

@murali-reddy
Contributor Author

At least in the context of the work this background process does, graceful shutdown is not critical.

@murali-reddy murali-reddy changed the title WIP: watch for Kubernetes node delete events and reclaim removed peers on delete event WIP: watch for Kubernetes node delete events and reclaim removed peers IP space on delete event Sep 18, 2018
@murali-reddy murali-reddy changed the title WIP: watch for Kubernetes node delete events and reclaim removed peers IP space on delete event watch for Kubernetes node delete events and reclaim removed peers IP space on delete event Sep 18, 2018
@murali-reddy
Contributor Author

I will add weave forget for the deleted node as a separate PR (need to add a Go weave client API call for forget). Let me know if we have consensus on the approach to go ahead.

Testing done:

On an AWS cluster, deleted a couple of instances in the ASG and verified that on node delete the IP space allocated for the node is reclaimed, for all deleted nodes.

Let me know if it's worth writing an integration test that does kubectl delete node and verifies the IP space is reclaimed.

@bboreham bboreham added this to the 2.5 milestone Sep 19, 2018
@brb
Contributor

brb commented Sep 21, 2018

Thanks for the PR.

Let me know if we have consensus on the approach to go ahead.

I still think that we should do it properly and run the reclaimer from the weaver process.

```go
common.Log.Debugln("registering for updates for node delete events")
nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
	DeleteFunc: func(obj interface{}) {
		node := obj.(*v1core.Node)
```

```go
	common.Log.Fatalf("[kube-peers] Could not make Kubernetes connection: %v", err)
}
cml := newConfigMapAnnotations(configMapNamespace, configMapName, client)
weave := weaveapi.NewClient(os.Getenv("WEAVE_HTTP_ADDR"), common.Log)
```


Contributor

@bboreham bboreham left a comment


couple more thoughts

```go
	}
}
common.Log.Debugln("[kube-peers] Nodes deleted:", nodeObj.Name)
config, err := rest.InClusterConfig()
```


```go
	common.Log.Fatalf("[kube-peers] Tombstone contained object that is not a Node: %#v", obj)
}
}
common.Log.Debugln("[kube-peers] Nodes deleted:", nodeObj.Name)
```
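The fragments above show the usual informer delete-handler concern: the object handed to DeleteFunc may be a tombstone (in client-go, cache.DeletedFinalStateUnknown) rather than a *v1core.Node, so both cases have to be unwrapped. Stripped of the client-go types, the type switch looks roughly like this; the types here are local stand-ins, for illustration only:

```go
package main

import "fmt"

// Stand-ins for *v1core.Node and cache.DeletedFinalStateUnknown.
type Node struct{ Name string }
type DeletedFinalStateUnknown struct{ Obj interface{} }

// nodeName unwraps either a direct Node or a tombstone wrapping one,
// mirroring the DeleteFunc logic in the fragments above.
func nodeName(obj interface{}) (string, bool) {
	switch o := obj.(type) {
	case *Node:
		return o.Name, true
	case DeletedFinalStateUnknown:
		if n, ok := o.Obj.(*Node); ok {
			return n.Name, true
		}
	}
	return "", false // tombstone contained an object that is not a Node
}

func main() {
	name, _ := nodeName(&Node{Name: "ip-10-0-0-1"})
	fmt.Println("deleted:", name)
	name, _ = nodeName(DeletedFinalStateUnknown{Obj: &Node{Name: "ip-10-0-0-2"}})
	fmt.Println("deleted:", name)
	_, ok := nodeName(DeletedFinalStateUnknown{Obj: "garbage"})
	fmt.Println("node recovered:", ok)
}
```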


@murali-reddy
Contributor Author

I have included weave connect --replace as well now. Tested by deleting/adding nodes from the ASG on AWS.

On node delete now

  • rmpeer is done, so the IP addresses of deleted nodes are reclaimed and we have a clean weave status ipam
  • connect --replace with an empty peer list: a deleted node is no longer connected and we have a clean weave status connections

Please take a look.

3 participants