Unexpected Load Balancer Deletion with delete_protection=true When Using IngressGroup in AWS Load Balancer Controller #3817
Labels
triage/accepted
triage/unresolved
Describe the bug
We use the AWS Load Balancer Controller with a single Ingress that handles multiple requests, and we decided to split it into multiple Ingresses using IngressGroup. All our load balancers are marked with the annotation delete_protection=true. However, after we added the groupname annotation to the Ingress, the controller deleted the load balancer and created a new one. We did not expect this, since delete_protection=true was set.
If I delete the Ingress, the load balancer is not deleted. When I change annotations, however, the controller attempts to delete the load balancer and receives an OperationNotPermittedException error. After this error, the controller disables delete protection and retries the deletion, which then succeeds.
This issue is critical because it causes downtime in production environments: DNS must be updated, all targets must be re-registered, and more.
The issue seems to originate in load_balancer_synthesizer.go. Specifically, this section appears to be responsible for the unexpected behavior, and it may be necessary to remove or modify this logic to prevent downtime and customer-facing issues.
Steps to reproduce
Expected outcome
When delete_protection=true is set, the controller should not delete the load balancer, regardless of any changes to annotations. The controller should also never disable delete_protection in order to delete the load balancer.
Environment
AWS Load Balancer controller version: v2.8.2
Kubernetes version: 1.28
Using EKS: yes (1.28)
Additional Context
Adding an IngressGroupName should not cause downtime. Currently, however, the only way to add an IngressGroupName seems to be to create a new Ingress and load balancer and then switch traffic over, which is complicated and introduces downtime. It would help to have a way to set a default IngressGroupName, or to otherwise improve the process so that downtime is avoided.
Relevant Issues
Issue #1: #2271
Issue #2: #3034