ScaledJob: index out of range with one broken External scaler #1669

Closed
fjmacagno opened this issue Mar 12, 2021 · 4 comments · Fixed by #1672

fjmacagno commented Mar 12, 2021

Report

When the KEDA operator processes a ScaledJob that has a misconfigured scaler address (though any error with a ScaledJob may trigger it), and there is only one ScaledJob, the operator crashes with the error: panic: runtime error: index out of range [0] with length 0.

Expected Behavior

Operator logs error and continues without crashing.

Actual Behavior

Operator crashes.

Steps to Reproduce the Problem

  1. Deploy a ScaledJob with an incorrect scaler address.

Logs from KEDA operator

I0312 21:33:59.840327       1 request.go:655] Throttling request took 1.042572281s, request: GET:https://172.20.0.1:443/apis/autoscaling/v2beta1?timeout=32s
2021-03-12T21:33:59.843Z	INFO	controller-runtime.metrics	metrics server is starting to listen	{"addr": ":8080"}
2021-03-12T21:33:59.844Z	INFO	controllers.ScaledObject	Running on Kubernetes 1.18+	{"version": "v1.18.9-eks-d1db3c"}
2021-03-12T21:33:59.845Z	INFO	setup	Starting manager
2021-03-12T21:33:59.845Z	INFO	setup	KEDA Version: 2.1.0
2021-03-12T21:33:59.845Z	INFO	setup	Git Commit: 4866ce69c4897df532b43390bafe4477275bf65a
2021-03-12T21:33:59.845Z	INFO	setup	Go Version: go1.15.6
2021-03-12T21:33:59.845Z	INFO	setup	Go OS/Arch: linux/amd64
I0312 21:33:59.845296       1 leaderelection.go:243] attempting to acquire leader lease keda/operator.keda.sh...
2021-03-12T21:33:59.845Z	INFO	controller-runtime.manager	starting metrics server	{"path": "/metrics"}
I0312 21:34:17.252946       1 leaderelection.go:253] successfully acquired lease keda/operator.keda.sh
2021-03-12T21:34:17.253Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "source": "kind source: /, Kind="}
2021-03-12T21:34:17.253Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob", "source": "kind source: /, Kind="}
2021-03-12T21:34:17.353Z	INFO	controller	Starting Controller	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob"}
2021-03-12T21:34:17.353Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "source": "kind source: /, Kind="}
2021-03-12T21:34:17.353Z	INFO	controller	Starting workers	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob", "worker count": 1}
2021-03-12T21:34:17.353Z	INFO	controllers.ScaledJob	Reconciling ScaledJob	{"ScaledJob.Namespace": "vineyard", "ScaledJob.Name": "vineyard-consumer-test-resource"}
2021-03-12T21:34:17.353Z	INFO	controllers.ScaledJob	Detected ScaleType = Job	{"ScaledJob.Namespace": "vineyard", "ScaledJob.Name": "vineyard-consumer-test-resource"}
2021-03-12T21:34:17.453Z	INFO	controller	Starting Controller	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject"}
2021-03-12T21:34:17.453Z	INFO	controller	Starting workers	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "worker count": 1}
2021-03-12T21:34:17.453Z	INFO	controllers.ScaledJob	Deleting jobs owned by the previous version of the scaledJob	{"ScaledJob.Namespace": "vineyard", "ScaledJob.Name": "vineyard-consumer-test-resource", "Number of jobs to delete": 8}
2021-03-12T21:34:17.453Z	INFO	controllers.ScaledJob	Initializing Scaling logic according to ScaledObject Specification	{"ScaledJob.Namespace": "vineyard", "ScaledJob.Name": "vineyard-consumer-test-resource"}
2021-03-12T21:34:17.620Z	ERROR	external_scaler	error	{"error": "rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: lookup vineyard-scaler-service.default.svc.cluster.local on 172.20.0.10:53: no such host\""}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132
github.com/kedacore/keda/v2/pkg/scalers.(*externalScaler).GetMetricSpecForScaling
	/workspace/pkg/scalers/external_scaler.go:152
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScaledJobScalers
	/workspace/pkg/scaling/scale_handler.go:236
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
	/workspace/pkg/scaling/scale_handler.go:198
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop
	/workspace/pkg/scaling/scale_handler.go:130
panic: runtime error: index out of range [0] with length 0
goroutine 353 [running]:
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScaledJobScalers(0xc000bd2940, 0x2a098a0, 0xc00004cf80, 0xc000db7a40, 0x1, 0x1, 0xc000bfa380, 0x0, 0xc000256780, 0x0)
	/workspace/pkg/scaling/scale_handler.go:238 +0x99f
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers(0xc000bd2940, 0x2a098a0, 0xc00004cf80, 0x2563f20, 0xc000bfa380, 0x29d4f20, 0xc000518ef8)
	/workspace/pkg/scaling/scale_handler.go:198 +0x23f
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop(0xc000bd2940, 0x2a098a0, 0xc00004cf80, 0xc0004fe500, 0x2563f20, 0xc000bfa380, 0x29d4f20, 0xc000518ef8)
	/workspace/pkg/scaling/scale_handler.go:130 +0x205
created by github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).HandleScalableObject
	/workspace/pkg/scaling/scale_handler.go:98 +0x385

KEDA Version

2.1.0

Kubernetes Version

1.18

Platform

Amazon Web Services

Scaler Details

Custom

@fjmacagno added the bug (Something isn't working) label Mar 12, 2021
@zroubalik self-assigned this Mar 15, 2021
@zroubalik added this to the v2.2 milestone Mar 15, 2021
@zroubalik
Member

@fjmacagno This is happening only if you use ExternalScaler, right?

@fjmacagno
Author

Yes

@coderanger
Contributor

Unfortunately this is not limited to any one scaler. There's another issue open for it already, but it's a general thing. I'm 99% sure it's because we don't deepcopy the objects before launching the background scaler goroutines, but I haven't had cycles to check because of crunch time at work. It happens to me about once an hour using the RabbitMQ scaler.
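
As a rough illustration of the deep-copy idea mentioned in this comment (not KEDA's actual code): the sketch below hands a background goroutine its own copy of an object, so a later change by the caller cannot alter what the goroutine reads. The `ScaledJob` and `Trigger` types, the `DeepCopy` method, and `startScaleLoop` are simplified, hypothetical stand-ins written for this example only.

```go
package main

import (
	"fmt"
	"sync"
)

// Trigger and ScaledJob are hypothetical, simplified stand-ins for this sketch.
type Trigger struct {
	Type     string
	Metadata map[string]string
}

type ScaledJob struct {
	Name     string
	Triggers []Trigger
}

// DeepCopy returns a copy that shares no slices or maps with the original.
func (s *ScaledJob) DeepCopy() *ScaledJob {
	out := &ScaledJob{Name: s.Name, Triggers: make([]Trigger, len(s.Triggers))}
	for i, t := range s.Triggers {
		md := make(map[string]string, len(t.Metadata))
		for k, v := range t.Metadata {
			md[k] = v
		}
		out.Triggers[i] = Trigger{Type: t.Type, Metadata: md}
	}
	return out
}

// startScaleLoop (hypothetical) takes a private snapshot before starting the
// goroutine, so the goroutine never reads memory the caller may still change.
func startScaleLoop(job *ScaledJob, wg *sync.WaitGroup, report func(int)) {
	snapshot := job.DeepCopy()
	wg.Add(1)
	go func() {
		defer wg.Done()
		report(len(snapshot.Triggers))
	}()
}

func main() {
	job := &ScaledJob{
		Name: "example",
		Triggers: []Trigger{
			{Type: "rabbitmq", Metadata: map[string]string{"queueName": "q"}},
		},
	}

	var wg sync.WaitGroup
	startScaleLoop(job, &wg, func(n int) {
		fmt.Println("triggers seen by loop:", n)
	})

	// The caller changes its object after the hand-off; the snapshot is unaffected.
	job.Triggers = nil
	wg.Wait()
}
```

Without the snapshot, the goroutine and the caller would share one `Triggers` slice, and the goroutine could observe it shrinking underneath it.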

@zroubalik changed the title from "index out of range with one broken scaler" to "ScaledJob: index out of range with one broken External scaler" Mar 15, 2021
@zroubalik
Member

This is not the same issue. This one is specific to ScaledJob, when there is one External Scaler with incorrect metadata. In that case the scaler fails and no metricSpecs are returned, and indexing into the empty result can make the Operator panic.
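
A minimal sketch of the failure mode described here and one way to guard against it, assuming the panic comes from indexing element 0 of an empty metric spec slice (consistent with the "index out of range [0] with length 0" message in the logs). This is not the actual KEDA fix; `MetricSpec`, `Scaler`, `brokenScaler`, `healthyScaler`, and `firstTargetValue` are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// MetricSpec is a hypothetical stand-in for the metric spec a scaler returns.
type MetricSpec struct {
	Name        string
	TargetValue int64
}

// Scaler is a hypothetical, simplified scaler interface for this sketch.
type Scaler interface {
	// GetMetricSpecForScaling returns no specs when the scaler cannot be reached.
	GetMetricSpecForScaling() []MetricSpec
}

// brokenScaler simulates an external scaler whose address does not resolve.
type brokenScaler struct{}

func (brokenScaler) GetMetricSpecForScaling() []MetricSpec { return nil }

// healthyScaler simulates a scaler that returns one metric spec.
type healthyScaler struct{}

func (healthyScaler) GetMetricSpecForScaling() []MetricSpec {
	return []MetricSpec{{Name: "queueLength", TargetValue: 5}}
}

var errNoMetricSpecs = errors.New("scaler returned no metric specs")

// firstTargetValue checks the slice length before indexing, so a broken
// scaler yields an error instead of an index-out-of-range panic.
func firstTargetValue(s Scaler) (int64, error) {
	specs := s.GetMetricSpecForScaling()
	if len(specs) == 0 {
		return 0, errNoMetricSpecs
	}
	return specs[0].TargetValue, nil
}

func main() {
	for _, s := range []Scaler{brokenScaler{}, healthyScaler{}} {
		target, err := firstTargetValue(s)
		if err != nil {
			// Log and move on to the next scaler rather than crashing.
			fmt.Println("skipping scaler:", err)
			continue
		}
		fmt.Println("target value:", target)
	}
}
```

With a guard like this, a single misconfigured scaler produces a logged error and the loop continues, which matches the expected behavior stated in the report.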
