eks.AlbController - helm error "UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress" #27641
Comments
This is an underlying Kubernetes issue rearing its head: your Helm chart is never finishing deployment. This description lines up very nicely with this third-party discussion: https://medium.com/nerd-for-tech/kubernetes-helm-error-upgrade-failed-another-operation-install-upgrade-rollback-is-in-progress-52ea2c6fcda9
StackOverflow shows that the error comes from somewhere deep in Kubernetes/Helm, as the issue appears on Azure as well: https://stackoverflow.com/questions/71599858/upgrade-failed-another-operation-install-upgrade-rollback-is-in-progress
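For readers hitting the same message: Helm rejects a new operation while the most recent revision of a release is still in one of its pending statuses, which is what "never finishing deployment" looks like from Helm's side. The following is a minimal sketch of that check, not code from this thread. It assumes the input is the parsed output of `helm history <release> -o json`, and the type and function names (`HelmRevision`, `latestRevision`, `isReleaseStuck`) are hypothetical.

```typescript
// Sketch only: decide whether a Helm release is blocked by an unfinished operation.
// Assumption: `history` is the parsed array from `helm history <release> -o json`;
// only the two fields used here are modelled.
type HelmRevision = {
  revision: number;
  status: string; // e.g. "deployed", "failed", "pending-install"
};

// Helm's pending statuses. While the latest revision is in one of these,
// a new upgrade is rejected with "another operation ... is in progress".
const PENDING_STATUSES = new Set(["pending-install", "pending-upgrade", "pending-rollback"]);

// Return the revision with the highest revision number, if any.
function latestRevision(history: HelmRevision[]): HelmRevision | undefined {
  return history.reduce<HelmRevision | undefined>(
    (latest, rev) => (latest === undefined || rev.revision > latest.revision ? rev : latest),
    undefined,
  );
}

// True when the latest revision never left a pending status.
function isReleaseStuck(history: HelmRevision[]): boolean {
  const latest = latestRevision(history);
  return latest !== undefined && PENDING_STATUSES.has(latest.status);
}
```

A stuck result says only that an earlier operation did not complete. It does not say why, so the pods the chart created still need to be inspected.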
@indrora Thanks, that does seem to be the problem. The catch is that in a CDK environment, Helm charts are executed by the cluster layer (e.g., I'm attempting to circumvent this Helm issue by installing the I'm wondering if there's a way to utilize By the way, am I the only one encountering this issue on AWS? I've recreated numerous stacks and clusters, but I consistently run into this problem. I'm curious about how others are managing to create their ALBControllers with CDK. I must be overlooking something.
Good news: I've identified the problem. It turns out that the node resources were insufficient. The I've made the adjustments below, and now everything is working perfectly.

import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as eks from 'aws-cdk-lib/aws-eks';
import { Construct } from 'constructs';
import { KubectlV27Layer } from '@aws-cdk/lambda-layer-kubectl-v27';

// CLUSTER_NAME is a string constant defined elsewhere in the project.
export class CdkEksXp05Stack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    const stack = this;

    // Two t3.large nodes gave the controller enough room to schedule.
    const cluster = new eks.Cluster(stack, CLUSTER_NAME, {
      clusterName: CLUSTER_NAME,
      version: eks.KubernetesVersion.V1_27,
      defaultCapacity: 2,
      defaultCapacityInstance: new ec2.InstanceType('t3.large'),
      kubectlLayer: new KubectlV27Layer(stack, 'kubectl'),
    });

    // #region --- ALB
    const albController = new eks.AlbController(stack, 'AlbController', {
      cluster,
      version: eks.AlbControllerVersion.V2_5_1,
    });
    // #endregion --- ALB
  }
}

Note: As a precaution, I initially deployed with the ALB section commented out, then uncommented it and deployed a second time. This was to prevent the entire cluster from rolling back in case of an issue. However, I expect that everything should work in a single deploy.

Additional Note: I realized there was a resource issue when I switched the LBC installation to the Kubernetes method via CDK. The process halted during the LBC deployment because of insufficient resources, which suggested that the problem was not Helm itself but a resource limitation.

This looks like either a documentation issue or a need for more precise error messaging (though the latter might not be an easy fix).

From my side, we can close this issue. (Not sure if I should be the one doing it.)

Thanks, @indrora, for your input.
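One way to confirm a resource shortage like the one described above is to look for pods that the scheduler could not place. The following is a rough sketch rather than the method used in this thread. It assumes the input is the `items` array from `kubectl get pods -n kube-system -o json` (the namespace is an assumption), and the names `Pod`, `PodCondition` and `findUnschedulablePods` are hypothetical.

```typescript
// Sketch only: list pods the scheduler could not place on any node.
// Assumption: `pods` is the `items` array of `kubectl get pods -o json` output;
// only the fields read below are modelled.
type PodCondition = { type: string; status: string; reason?: string; message?: string };
type Pod = {
  metadata: { name: string; namespace?: string };
  status?: { phase?: string; conditions?: PodCondition[] };
};
type UnschedulablePod = { name: string; message: string };

function findUnschedulablePods(pods: Pod[]): UnschedulablePod[] {
  const result: UnschedulablePod[] = [];
  for (const pod of pods) {
    // Only pods still waiting to start are of interest.
    if (pod.status?.phase !== "Pending") continue;
    // The scheduler records its failure on the PodScheduled condition.
    const cond = pod.status?.conditions?.find(
      (c) => c.type === "PodScheduled" && c.status === "False" && c.reason === "Unschedulable",
    );
    if (cond) result.push({ name: pod.metadata.name, message: cond.message ?? "" });
  }
  return result;
}
```

The `message` field of each match carries the scheduler's own explanation. If it mentions insufficient CPU, memory or pod slots, a larger instance type or more nodes (as in the stack above) is the likely fix.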
I still get this error even with large capacity :(
Describe the bug
When creating a simple eks.Cluster with a simple eks.AlbController, I get this error:
CREATE_FAILED | Custom::AWSCDK-EKS-HelmChart ... UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
Expected Behavior
AlbController created.
Current Behavior
I am getting this error:
Reproduction Steps
Even with the albController property set inside the cluster, we get this error.
I tried many options, with and without the default capacity and with and without the kubectlLayer, but still got the same error.
Possible Solution
No response
Additional Information/Context
This appears to mirror a previously closed issue/discussion: #19705.
Additionally, I encountered a failure when attempting to create the cluster in one go with the .albController property.
Subsequently, I first set up the cluster without the ALB, and that was successful. However, upon adding the const albController..., I faced the same error again.
CDK CLI Version
2.102.0 (build 2abc59a)
Framework Version
No response
Node.js Version
v20.8.1
OS
Mac
Language
TypeScript
Language Version
TypeScript version 5.2.2
Other information
eks: 1.27