Pod and namespace stuck in terminating state #647
Comments
I think that's a valid point. We're making a change to only mutate pods that have labels (ref: #601), but even with that, if the pod is terminating I don't think we need to handle that request. Thanks for the detailed issue. I'll include this update in the next release.
Closed with #652.
Hi @aramase, I think we are still seeing something similar in AKS:
k patch pod podName -n namespace -p '{"metadata":{"finalizers":null}}'
Error from server: admission webhook "mutation.azure-workload-identity.io" denied the request: serviceaccounts "podName" not found
This issue can lead to the […]
The webhook should gracefully tolerate the absence of a service account when the pod is being modified to remove a finalizer, instead of failing with […]
The webhook should not mutate the pod when a finalizer is being removed. See kubernetes/kubernetes#121828 (comment).
To work around the issue, you can temporarily re-create the service account the […]
The fix in #652 (included in v0.15.0) makes the webhook skip firing on […]
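One way to recognise the "finalizer is being removed" case from the comment above is to compare the finalizer lists of the old and new objects in an update. The sketch below is illustrative only: `onlyRemovesFinalizers` is a hypothetical helper, it looks at finalizers alone rather than the full object diff, and it is not taken from the webhook's source.

```go
package main

import "fmt"

// onlyRemovesFinalizers reports whether newF is shorter than oldF and
// contains no finalizer that was absent from oldF, i.e. the update
// dropped at least one finalizer and added none.
func onlyRemovesFinalizers(oldF, newF []string) bool {
	if len(newF) >= len(oldF) {
		return false
	}
	seen := make(map[string]bool, len(oldF))
	for _, f := range oldF {
		seen[f] = true
	}
	for _, f := range newF {
		if !seen[f] {
			return false
		}
	}
	return true
}

func main() {
	old := []string{"batch.kubernetes.io/job-tracking"}

	// A patch with "finalizers": null leaves the new list empty.
	fmt.Println(onlyRemovesFinalizers(old, nil))

	// An update that leaves the finalizers untouched is not a removal.
	fmt.Println(onlyRemovesFinalizers(old, old))
}
```

A handler could allow such an update without looking up the ServiceAccount, so a missing ServiceAccount would no longer block cleanup.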
Describe the bug
Pods with finalizers (e.g. "batch.kubernetes.io/job-tracking" when created by the Job controller in Kubernetes >= 1.23) are stuck in Terminating when deleting the Pod's namespace.
Deleting the namespace will delete the Job, the Pod and the ServiceAccount. Because of the finalizer, the Pod is only flagged as deleted. When the Job controller tries to remove the finalizer from the Pod, the update is rejected by the Azure Workload Identity mutating webhook because the ServiceAccount no longer exists. We have to stop the webhook to allow the Job controller to remove the finalizer. Once the finalizer has been removed, we can start the webhook again.
Steps To Reproduce
Create a namespace:
kubectl create ns awi
Create a ServiceAccount:
Create a Job:
Delete the namespace before the sleep 300 expires:
kubectl delete ns awi
The Pod and Namespace are now stuck in Terminating until the webhook is stopped.
Expected behavior
Not sure what the best/correct way to handle situations like this is. Perhaps ignore validation for pods with a deletionTimestamp?
Logs
Environment
AKS - version 1.23.8