It is common to run a number of system pods (usually as DaemonSets) on each node in a Kubernetes cluster to provide basic functionality. For instance, you might run `kube2iam` to control AWS IAM access for the services in your cluster, or a logging agent to automatically ship logs to a central location. Whatever your use case, you would expect these components to run on all nodes, ideally before "normal" services are scheduled onto them.
By default in Kubernetes a node is considered `Ready`/`NotReady` based on the node's health, independent of which system pods might be scheduled on the node. `kube-node-ready-controller` adds a layer on top to indicate whether a node is ready for workloads, based on a list of system pods which must be running on the node before it is considered ready.
The controller is configured with a list of pod selectors (namespace + labels), and for each node it checks whether the matching pods are scheduled and have status `Ready`. If all expected pods are ready, it makes sure the node doesn't have the taint `node.alpha.kubernetes.io/notReady-workload`. If some expected pods aren't ready, it makes sure the taint is set on the node.
The `kube-node-ready-controller` can be run as a deployment in the cluster. See `deployment.yaml`.
To deploy it to your cluster, modify the `--pod-selector` args to match your system pods. The format for the selector is `<namespace>:<labelKey>=<labelValue>,<labelKey2>=<labelValue2>`.
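As a sketch, assuming the flag can be repeated once per selector, and with purely hypothetical namespaces and labels, the container args might look like this:

```yaml
# Hypothetical selectors; adjust the namespaces and labels to match the
# system pods in your cluster. Assumes --pod-selector can be given multiple
# times, one selector per flag.
args:
  - --pod-selector=kube-system:application=kube-dns
  - --pod-selector=kube-system:application=logging-agent
```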
Alternatively you can set the flag `--pod-selector-configmap` and use a ConfigMap to configure the selectors (full example):
```yaml
selectors:
- namespace: kube-system
  labels:
    foo: bar
```
With this approach you can change the selectors at runtime simply by updating the ConfigMap.
Once configured, deploy it by running:
```sh
$ kubectl apply -f docs/deployment.yaml
```
Note that we set the following toleration on the pod:
```yaml
tolerations:
- key: node.alpha.kubernetes.io/notReady-workload
  operator: Exists
  effect: NoSchedule
```
This is done to ensure that it can be scheduled even on nodes that are not ready.
You must add the same toleration to all the system pods that should be scheduled before the node is considered ready. If you fail to add the toleration, the pod won't get scheduled and the node will thus never become ready.
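As an illustration, assuming a hypothetical logging-agent DaemonSet, the toleration goes into its pod template:

```yaml
# Sketch only: the DaemonSet name, namespace, labels and image are made up.
# The relevant part is the tolerations entry in the pod spec.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logging-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      application: logging-agent
  template:
    metadata:
      labels:
        application: logging-agent
    spec:
      tolerations:
      - key: node.alpha.kubernetes.io/notReady-workload
        operator: Exists
        effect: NoSchedule
      containers:
      - name: logging-agent
        image: registry.example.org/logging-agent:v1
```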
Lastly you must configure the nodes to have the `notReady-workload` taint when they register with the cluster. This can be done by setting the flag `--register-with-taints=node.alpha.kubernetes.io/notReady-workload=:NoSchedule` on the `kubelet`.
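One possible way to wire this up, assuming a kubeadm-style setup where the kubelet reads extra flags from an environment file (the exact path and mechanism depend on your distribution and provisioning tooling):

```sh
# Sketch only: /etc/default/kubelet and KUBELET_EXTRA_ARGS are a common
# kubeadm convention, not a universal one; adapt to your node provisioning.
echo 'KUBELET_EXTRA_ARGS=--register-with-taints=node.alpha.kubernetes.io/notReady-workload=:NoSchedule' \
  | sudo tee -a /etc/default/kubelet
sudo systemctl restart kubelet
```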
You can also add the taint manually with `kubectl` to test it:
```sh
$ kubectl taint nodes <nodename> "node.alpha.kubernetes.io/notReady-workload=:NoSchedule"
```
As an extra feature, `kube-node-ready-controller` has optional support for triggering hooks when a node is marked as ready.
Trigger an AWS Autoscaling Group lifecycle hook when a node becomes ready. This can be used to signal the Autoscaling Group that the node is in service.
Enable the hook with the flag `--asg-lifecycle-hook=<hook-name>`. This assumes you have a hook with the defined name on the Autoscaling Groups of all the nodes managed by the controller.
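For reference, a lifecycle hook like this could be created with the AWS CLI roughly as follows; the group name `my-node-asg` and hook name `node-ready-hook` are hypothetical, and the hook name must match the value passed to `--asg-lifecycle-hook`:

```sh
# Hypothetical names and timeout; the launch lifecycle hook uses the
# EC2_INSTANCE_LAUNCHING transition so new nodes wait to be put in service.
aws autoscaling put-lifecycle-hook \
  --auto-scaling-group-name my-node-asg \
  --lifecycle-hook-name node-ready-hook \
  --lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
  --heartbeat-timeout 600 \
  --default-result ABANDON
```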
- Make it possible to configure pod selectors via a config map.
- Instead of long polling the node list, add a Watch feature.