
Add node lifecycle documentation #45074

Open
thockin opened this issue Jan 17, 2023 · 32 comments
Assignees
Labels
kind/documentation Categorizes issue or PR as related to documentation. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@thockin
Member

thockin commented Jan 17, 2023

As far as I can tell, we don't have a comprehensive doc which covers the expected lifecycle of nodes in Kubernetes.

Specifically, we have lots of intersecting, async things which involve nodes. For example:

  • Many environments have VMs "behind" Nodes. Those VMs can be deleted without telling k8s. Then someone comes along and deletes the node "in response", but this is racy and causes confusion.
  • Many environments have subsystems which cross-reference things which need to coordinate with node lifecycle. E.g. the service controller puts VMs into LBs, but does so by enumerating Nodes (ignorant of the VM lifecycle).
  • Some components manage nodes directly (e.g. Cluster Autoscaler, Karpenter).

For an example of things that I think are "weird" for lack of docs, look at kubernetes/autoscaler#5201 (comment). ClusterAutoscaler defines a taint which it uses to prevent work from landing on "draining" nodes (even though we have the unschedulable field already). The service LB controller currently uses that taint to manage LBs. Cluster autoscaler removes the VM from the cloud, and leaves the Node object around for someone else to clean up.

The discussion is about the MEANING of the taint, when it happens, and how to be more graceful. What we want is a clear signal that "this node is going away" and a way for 3rd parties to indicate they have work to do when that happens. It strikes me that we HAVE such a mechanism - delete and finalizers. But CA doesn't do that. I don't know why, but I suspect there are reasons. Should it evolve?
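The delete-and-finalizers mechanism mentioned above can be sketched with a simplified model of object metadata. This is an illustration of the general Kubernetes pattern, not any component's actual implementation; the `Node` struct fields and the `example.io/drain-lb` finalizer name are made up for the example (in the real API, "deletion requested" is `metadata.deletionTimestamp` being set).

```go
package main

import "fmt"

// Hypothetical, simplified model of Kubernetes object metadata: a Node is
// "going away" once deletion has been requested, but the API server keeps
// the object around until every finalizer has been removed.
type Node struct {
	Name              string
	DeletionRequested bool     // stands in for metadata.deletionTimestamp != nil
	Finalizers        []string // e.g. "example.io/drain-lb" (made-up name)
}

// Terminating is the clear "this node is going away" signal third parties
// could watch for.
func (n *Node) Terminating() bool { return n.DeletionRequested }

// RemoveFinalizer is what a controller would do once its cleanup work
// (draining pods, detaching the node from LBs, deleting the VM) is done.
func (n *Node) RemoveFinalizer(name string) {
	out := n.Finalizers[:0]
	for _, f := range n.Finalizers {
		if f != name {
			out = append(out, f)
		}
	}
	n.Finalizers = out
}

// GarbageCollectable mirrors the API server rule: the object only actually
// goes away when deletion was requested and no finalizers remain.
func (n *Node) GarbageCollectable() bool {
	return n.DeletionRequested && len(n.Finalizers) == 0
}

func main() {
	n := &Node{Name: "node-1", Finalizers: []string{"example.io/drain-lb"}}
	n.DeletionRequested = true // someone ran "kubectl delete node node-1"
	fmt.Println(n.Terminating(), n.GarbageCollectable()) // true false
	n.RemoveFinalizer("example.io/drain-lb")             // cleanup finished
	fmt.Println(n.Terminating(), n.GarbageCollectable()) // true true
}
```

The point of the sketch: deletion is the unambiguous signal, and finalizers are the standing mechanism for third parties to say "I have work to do before this node is gone".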

I'd like to see a sig-node (or sig-arch?) owned statement of the node lifecycle. E.g. if the "right" way to signal "this node is going away" is to delete the node, this would say that. Then we can at least say that we think CA should adopt that pattern. If we think it needs to be more sophisticated (aka complicated) then we should express that.

@thockin thockin added sig/node Categorizes an issue or PR as relevant to SIG Node. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. labels Jan 17, 2023
@k8s-ci-robot
Contributor

@thockin: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jan 17, 2023
@bobbypage
Member

/cc

@sftim
Contributor

sftim commented Jan 18, 2023

Once that's stated, consider transferring the issue to k/website to track putting that detail in the public docs.

/kind documentation

@k8s-ci-robot k8s-ci-robot added the kind/documentation Categorizes issue or PR as related to documentation. label Jan 18, 2023
@sftim
Contributor

sftim commented Jan 18, 2023

A bit related: am I right that the Cluster Autoscaler is using an unregistered taint - not listed in https://kubernetes.io/docs/reference/labels-annotations-taints/ - for marking node drains?

@thockin
Member Author

thockin commented Jan 19, 2023

@sftim seems so - is that list generated from an in-git source other than the website?

@x13n FYI

@sftim
Contributor

sftim commented Jan 19, 2023

Right now there's no autogeneration for https://kubernetes.io/docs/reference/labels-annotations-taints/

Autogenerated pages should have a footer to mark this, eg see https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/

(related to your question, @thockin , I've considered making each entry be its own source file, which might be relevant to any future thoughts on more automation)

@x13n
Member

x13n commented Jan 19, 2023

Can we document CA taints on that website given they don't follow .*.kubernetes.io/.* pattern?

@sftim
Contributor

sftim commented Jan 19, 2023

Can we document CA taints on that website given they don't follow .*.kubernetes.io/.* pattern?

The usual approach is two phase:

  • switch whatever component to use a valid label / annotation / taint
    • optionally, retain support for the old label / annotation / taint but mark it deprecated
  • register and document the new thing
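The naming convention behind this (keys registered on the labels-annotations-taints reference page use a prefix ending in `kubernetes.io` or `k8s.io`) can be sketched as a small check. This is only an illustration of the convention being discussed, not an official validation routine; `ToBeDeletedByClusterAutoscaler` is the CA taint key at issue, while the other keys are examples.

```go
package main

import (
	"fmt"
	"strings"
)

// hasRegisteredPrefix reports whether a label/annotation/taint key uses a
// prefix that could be registered on the reference page, e.g.
// "node.kubernetes.io/unschedulable". Hypothetical helper for illustration.
func hasRegisteredPrefix(key string) bool {
	slash := strings.Index(key, "/")
	if slash < 0 {
		return false // no prefix part at all
	}
	prefix := key[:slash]
	return prefix == "kubernetes.io" || prefix == "k8s.io" ||
		strings.HasSuffix(prefix, ".kubernetes.io") ||
		strings.HasSuffix(prefix, ".k8s.io")
}

func main() {
	// The Cluster Autoscaler's current taint key has no prefix at all:
	fmt.Println(hasRegisteredPrefix("ToBeDeletedByClusterAutoscaler")) // false
	// A registered, documented key:
	fmt.Println(hasRegisteredPrefix("node.kubernetes.io/unschedulable")) // true
	// A prefixed key outside the kubernetes.io / k8s.io namespaces:
	fmt.Println(hasRegisteredPrefix("example.com/some-taint")) // false
}
```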

@x13n
Member

x13n commented Jan 19, 2023

That makes sense, created kubernetes/autoscaler#5433 to track this.

@thockin
Member Author

thockin commented Jan 19, 2023 via email

@aojea
Member

aojea commented Feb 5, 2023

/cc

@tzneal
Contributor

tzneal commented Feb 9, 2023

I'd like to see a sig-node (or sig-arch?) owned statement of the node lifecycle. E.g. if the "right" way to signal "this node is going away" is to delete the node, this would say that. Then we can at least say that we think CA should adopt that pattern.

FWIW, Karpenter currently puts a finalizer on nodes and uses that to allow users to control the node by just deleting it. We've had some discussions on if this is a good pattern and would welcome an official pattern recommendation.

@sftim
Contributor

sftim commented Feb 9, 2023

/sig docs

too

@k8s-ci-robot k8s-ci-robot added the sig/docs Categorizes an issue or PR as relevant to SIG Docs. label Feb 9, 2023
@ellistarn

We've been reverse engineering node lifecycle assumptions for the past 2 years in https://github.com/aws/karpenter.

Some hiccups we've had:

  • We add a finalizer to the node. This hooks into our cordon/drain logic, and ensures that the underlying machine is deleted before the node is gone. Customers love this feature, as it enables things like kubectl delete node -l topology.kubernetes.io/zone=us-west-2a. It's worked well so far, but we worry about training users to do this in non-karpenter environments.
  • We used to create the node object, since it allowed us to be more stateless. This caused a bunch of problems with gpu registration and assumptions of other systems. See: https://github.com/aws/karpenter/blob/main/designs/node-ownership.md
  • We had to implement a feature called "startup taints", which communicates to our scheduler that we expect the taint to be removed, so we should stop launching nodes and instead wait for the taints to come off. Support Startup Taints aws/karpenter-provider-aws#628 (comment)
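The startup-taints idea in the last bullet can be sketched as a wait-vs-launch decision. This is a hypothetical simplification of the behavior described above, not Karpenter's actual API; the function name and the cilium taint key used in the example are illustrative.

```go
package main

import "fmt"

// shouldWaitForNode sketches the "startup taints" decision: the provisioner
// declares which taints it expects add-ons to remove after node
// initialization. If every taint on a not-yet-ready node is a declared
// startup taint, the autoscaler should wait for the node to initialize
// rather than launch another one. Hypothetical helper for illustration.
func shouldWaitForNode(nodeTaints []string, startupTaints map[string]bool) bool {
	for _, t := range nodeTaints {
		if !startupTaints[t] {
			// An unexpected taint: nothing promises it will ever clear,
			// so the scheduler should not count on this node.
			return false
		}
	}
	return true
}

func main() {
	startup := map[string]bool{"node.cilium.io/agent-not-ready": true}

	// Declared startup taint still present: wait, don't launch more capacity.
	fmt.Println(shouldWaitForNode([]string{"node.cilium.io/agent-not-ready"}, startup)) // true

	// Unknown taint: treat the node as unusable and launch elsewhere.
	fmt.Println(shouldWaitForNode([]string{"example.com/unknown"}, startup)) // false
}
```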

I'd love to come together with the group and sort out some of the pain points, and find a path forward that simplifies and documents node lifecycle assumptions. I'm not confident that a shutdown taint is the right direction.

@sftim
Contributor

sftim commented Feb 14, 2023

What should the kubelet do if it observes its Node being deleted?

(eg: stop all local Pods, remove the kubelet's finalizer, quit)

@kerthcet
Member

We had to implement a feature called "startup taints", which communicates to our scheduler that we expect the taint to be removed, so we should stop launching nodes and instead wait for the taints to come off. aws/karpenter-provider-aws#628 (comment)

Hi @ellistarn , I'm considering adding something like daemonJob for node setup/cleanup work. Do you think it would help with your situation here? kubernetes/kubernetes#115716

@ellistarn

Not that I'm aware of @kerthcet. The problem is more that add-ons like cilium use a custom taint to communicate node readiness to the kube scheduler, on top of existing node readiness mechanisms.

@kerthcet
Member

One thing to highlight: we could add resources like labels/taints before running the job and, after it succeeds, remove them if you want. I suppose cilium uses its custom taint only during initialization, is that right? @ellistarn

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 21, 2023
@thockin thockin removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 30, 2023
@sftim
Contributor

sftim commented Jul 7, 2023

/retitle Add node lifecycle documentation

I think the primary audience is contributors and cluster operators are a key secondary audience. Have I got that right?

@k8s-ci-robot k8s-ci-robot changed the title Node lifecycle documentation ? Add node lifecycle documentation Jul 7, 2023
@tzneal
Contributor

tzneal commented Jul 7, 2023

I would say direct contributors and ecosystem contributors. E.g. if you develop some monitoring tool, you probably care deeply about this topic as well.

@dims
Member

dims commented Sep 7, 2023

long-term-issue (note to self)

@pacoxu
Member

pacoxu commented Nov 10, 2023

Decouple TaintManager from NodeLifecycleController enhancements#3902 may be in scope of this issue.


@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 8, 2024
@sftim
Contributor

sftim commented Feb 9, 2024

/transfer website

@k8s-ci-robot k8s-ci-robot transferred this issue from kubernetes/kubernetes Feb 9, 2024
@sftim
Contributor

sftim commented Feb 9, 2024

/remove-lifecycle stale
/lifecycle frozen
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 9, 2024
@sftim
Contributor

sftim commented Feb 9, 2024

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Feb 9, 2024
@dchen1107
Member

cc/ @wangzhen127 Can you work with @sftim and others on this?

@wangzhen127
Member

Sure. I am still ramping up in this space. Will follow up.

@sftim
Contributor

sftim commented Feb 29, 2024

work with @sftim and others on this?

The SIG to work with is SIG Docs (#sig-docs in Kubernetes' Slack workspace).

@sftim
Contributor

sftim commented Mar 25, 2024

#45197 has helped with this

@sftim
Contributor

sftim commented Mar 25, 2024

@wangzhen127 what help would you like here?
