Add a blog post about decoupled taint eviction controller #43676
/retitle [WIP] Add a blog post about decoupled taint eviction manager
Hi @yuanchen8911, Communication Team 1.29 here. The deadline for the feature blog to be ready for review is this Friday, Nov 17th; the proposed publication date is Dec 19th. cc: @a-mccarthy @kcmartin @James-Quigley @kubernetes/sig-docs-blog-owners. Blog scheduled: Dec 19th, Publication Order No.: 8
Force-pushed: 02e7f10 to 8d35bce
Force-pushed: 8d35bce to 80b263f
Force-pushed: 80b263f to 7ad71cb
/cc @atosatto

It may slightly increase the communication overhead from applying node taints to performing pod eviction.

**Will enabling/using this feature result in a non-negligible increase in resource usage (CPU, RAM, disk, IO, ...) in any components?**
non-negligible ... can we avoid this double-negative term?
Removed the word `non-negligible`.
Force-pushed: 7ad71cb to 114621e
A new feature gate, `SeparateTaintManager`, has been added. To enable the new feature, users can use the following flags:

- kube-apiserver: `--feature-gates=SeparateTaintManager=true`
- kube-controller-manager: `--controllers=kube-eviction-controller`
This is actually going to be enabled by default, as the `SeparateTaintManager` feature flag is in beta.
Clarified it.
- kube-apiserver: `--feature-gates=SeparateTaintManager=true`
- kube-controller-manager: `--controllers=kube-eviction-controller`

For compatibility, the legacy `TaintManager` can still be used with the `--taint-manager` flag.
This is not correct. If users want to use the old taint-manager, they need to set the feature flag to false (e.g. `SeparateTaintManager=false`).
What this feature gives users is the ability to disable taint-based eviction by setting `SeparateTaintManager=true` and `--controllers=-kube-eviction-controller`.
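For illustration, the `-name` syntax this comment relies on can be sketched as below. This is a minimal sketch, not kube-controller-manager's code: the function names are hypothetical, and it assumes the documented `--controllers` rule that `*` enables all on-by-default controllers, `foo` enables `foo`, and `-foo` disables `foo`. The `*,` prefix in the sample value is also an assumption, added so that the remaining controllers stay enabled.

```python
def parse_controllers_flag(value):
    """Split a comma-separated --controllers value into a list of entries."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def is_controller_enabled(name, controllers, disabled_by_default=frozenset()):
    """Simplified model of how a --controllers list selects one controller.

    Hypothetical helper for illustration only.
    """
    # An explicit "name" or "-name" entry takes precedence over the wildcard.
    if name in controllers:
        return True
    if "-" + name in controllers:
        return False
    # "*" enables every controller that is on by default.
    if "*" in controllers:
        return name not in disabled_by_default
    return False


# Controller names are taken from this thread; the flag value is an assumed example.
flag = parse_controllers_flag("*,-kube-eviction-controller")
print(is_controller_enabled("kube-eviction-controller", flag))
print(is_controller_enabled("node-lifecycle-controller", flag))
```

With this value, the eviction controller is reported as disabled while the node lifecycle controller stays enabled, which matches the "disable taint-based eviction only" use described above.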
Updated it.
- Adding a pre-defined set of `NoExecute` taints to nodes based on the node conditions.
- Performing pod eviction on NoExecute taints

With the latest release, the taint-based eviction implementation has been moved out of the `node-lifecycle-controller` into a separate and independent component called `TaintEvictionController`. This separation aims to disentangle code, enhance code maintainability, and facilitate future extensions to either component.
I'd add something highlighting the benefits and the potential use cases where this could be useful, as done in the KEP. I'd also mention the new metrics that have been introduced with this change.
Added the following:

Use Cases

This new feature will allow cluster administrators to extend and enhance the default `TaintEvictionController` and even replace the default `TaintEvictionController` with a custom implementation to meet different needs, e.g., better support of stateful workloads that use `PersistentVolume` on local disks.
Added the metrics description as follows.

As part of the change, additional metrics are introduced to monitor taint-based pod evictions.

- `pod_deletion_duration_seconds` measures the latency between the time when a taint effect has been activated for the Pod and its deletion via `TaintEvictionController`.
- `pod_deletions_total` reports the total number of Pods deleted by `TaintEvictionController` since its start.
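As a rough illustration of how these two metrics might be read together, here is a minimal sketch that derives a mean eviction latency from scraped values. It assumes `pod_deletion_duration_seconds` is exposed as a histogram with `_sum` and `_count` series in a Prometheus-style text format, and that the series carry no labels or timestamps; the helper names and the sample numbers are made up for the example.

```python
def parse_samples(text):
    """Parse unlabeled 'name value' lines from a Prometheus-style text dump."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines such as "# HELP" / "# TYPE".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def mean_deletion_latency(samples):
    """Return the mean latency in seconds, or None if nothing was deleted."""
    count = samples.get("pod_deletion_duration_seconds_count", 0.0)
    if count == 0:
        return None
    return samples["pod_deletion_duration_seconds_sum"] / count


# Made-up numbers for illustration only.
SAMPLE = """
# Hypothetical scrape output
pod_deletion_duration_seconds_sum 12.5
pod_deletion_duration_seconds_count 5
pod_deletions_total 5
"""

samples = parse_samples(SAMPLE)
print(mean_deletion_latency(samples))  # 2.5
```

A real scrape would likely include labels and metric name prefixes, so a proper Prometheus client or query would be a better fit outside of a quick check like this.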
**Will enabling/using this feature result in an increase in the time taken by any operations covered by existing SLIs/SLOs?**

It may slightly increase the communication overhead from applying node taints to performing pod eviction.
I don't think this is correct, as setting taints and evicting Pods are already done via separate reconcilers in the old implementation. We strove to ensure the changes do not impact the customer experience in any meaningful way.
Changed it to `No`.
As with any Kubernetes feature, multiple community members have contributed, from writing the KEP to implementing the new controller and reviewing the KEP and code. Special thanks to:

- Aldo Culquicondor (@alculquicondor)
I'd like us to add here @atiratree and @logicalhan. Their feedback on the implementation, alongside @soltysh and @Huang-Wei is what made this change possible! Thank you 🙇♂️!
Added the names.
Force-pushed: c078374 to d7fbb67
/lgtm
## How to use the new feature?

A new feature gate, `SeparateTaintEvictionController`, has been added. The feature is enabled by default as Beta in Kubernetes 1.29.
Please refer to the [feature gate document](/content/en/docs/reference/command-line-tools-reference/feature-gates.md).
I think this should reference the relative path on the website and not on GitHub, as suggested here: #43676 (comment)
Good point.
/lgtm cancel
Replaced it with the short URL.
Force-pushed: d7fbb67 to f021c1b
Thanks
/lgtm
Add authors
Update
Fix errors and address Andrea's comments
Removed duplication
Update 2023-10-24-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Update 2023-10-24-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Update 2023-10-24-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Update 2023-10-24-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Update 2023-10-24-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Address comment and rename the file
Fix format
Address comments
Add the feature gate reference
Fix typos and format issues
Fix reference link error
Fix a borken link
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Tim Bannister <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Tim Bannister <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Tim Bannister <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Tim Bannister <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Tim Bannister <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Tim Bannister <[email protected]>
node to Node
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Tim Bannister <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Tim Bannister <[email protected]>
Fix the reference link
Update content/en/blog/_posts/2023-12-19-taint-eviction-controller.md Co-authored-by: Ritika <[email protected]>
Force-pushed: baa5038 to fe882f1
/lgtm
/lgtm cancel
Requires new date (release of v1.29 has been postponed)
The new publication date is 2023-12-19; please adjust the article.
Thanks. Do not merge until Kubernetes v1.29 is released.
LGTM label has been added. Git tree hash: 2b872fda0ba6c964d2b70dc06dc0f42758e71c56
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: sftim
1.29 is released
Add a blog for KEP 3902