We had a small issue in our cluster with KEDA no longer autoscaling pods. After some research, it turned out we had set a wrong value for the role-arn annotation in EKS.
2023-11-21 14:40:21 | log="E1121 14:40:21.150843 1 provider.go:124] keda_metrics_adapter/provider \"msg\"=\"error getting metric for scaler\" \"error\"=\"WebIdentityErr: failed to retrieve credentials\\ncaused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/1234567890\\n\\tstatus code: 400, request id: xxxxx\" \"scaledObject.Name\"=\"redacted-aws-sqs-queue-scaledobject\" \"scaledObject.Namespace\"=\"redacted\" \"scaler\"={}\n"
This was hard to catch because the pod kept running without erroring.
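For context, the annotation in question sits on the service account the KEDA pods run under. A minimal sketch of the setup, with placeholder names and account ID (keda-operator is the Helm chart's default service account name; adjust if overridden):

```yaml
# Sketch of the IRSA annotation on KEDA's service account.
# The account ID and role name are placeholders; the failure mode in
# this issue was the account ID not matching the account that owns
# the EKS cluster's OIDC provider.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-operator        # Helm chart default
  namespace: keda
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/keda-role
```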
Expected Behavior
I would have expected the pod to go into CrashLoopBackOff.
Actual Behavior
The pod keeps running and only emits error logs:
2023-11-21 14:40:21 | log="E1121 14:40:21.150843 1 provider.go:124] keda_metrics_adapter/provider \"msg\"=\"error getting metric for scaler\" \"error\"=\"WebIdentityErr: failed to retrieve credentials\\ncaused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/1234567890\\n\\tstatus code: 400, request id: xxxx\" \"scaledObject.Name\"=\"redacted-aws-sqs-queue-scaledobject\" \"scaledObject.Namespace\"=\"redacted\" \"scaler\"={}\n"
Steps to Reproduce the Problem
Deploy KEDA in AWS EKS through the Helm chart
Set the EKS service account annotation with a wrong account ID: eks.amazonaws.com/role-arn: arn:aws:iam::wrong-account-id:role/keda-role
Set up a ScaledJob (or ScaledObject) on SQS and make sure it triggers scaling (see the sketch after these steps)
Now the errors should appear, but the pod will not crash
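For the SQS step, a minimal sketch of a ScaledObject (queue URL, names, and the scale target are hypothetical; identityOwner: operator makes KEDA authenticate with its own annotated service account, which is how the bad annotation gets exercised):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: aws-sqs-queue-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: my-consumer                # hypothetical Deployment to scale
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.eu-west-1.amazonaws.com/111122223333/my-queue
        queueLength: "5"
        awsRegion: "eu-west-1"
        identityOwner: operator      # use KEDA's own (annotated) service account
```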
Logs from KEDA operator
No response
KEDA Version
2.8.1
Kubernetes Version
1.23
Platform
Amazon Web Services
Scaler Details
AWS SQS
Anything else?
No response
Hello @nielstenboom,
KEDA shouldn't go into a crash loop, because AWS auth is just one of the possible scalers/auths working in the cluster. KEDA handles the error and surfaces a message in the log and also in the exported metrics, so I wouldn't say it's a silent failure.
KEDA 2.8.1 is quite an old version, and we have improved observability a lot during the last year, but even v2.8 has some useful metrics: keda.sh/docs/2.8/operate/prometheus
You should see the errors in the keda_metrics_adapter_scaler_errors metric.
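For example, a Prometheus alerting rule along these lines would have surfaced this failure (a sketch, assuming the metrics adapter is being scraped; the rule name, threshold, and label usage are illustrative):

```yaml
groups:
  - name: keda-scaler-errors
    rules:
      - alert: KedaScalerErrors
        # Fires when any KEDA scaler has reported errors in the last 5 minutes
        expr: increase(keda_metrics_adapter_scaler_errors[5m]) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "KEDA scaler {{ $labels.scaler }} is erroring for {{ $labels.scaledObject }}"
```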