Connections leak in scalers defined in ScaledJob #4011

Closed
jkroepke opened this issue Dec 13, 2022 · 2 comments · Fixed by #4012

@jkroepke

Report

I used a ScaledJob with a postgresql trigger to scale jobs based on table entries in PostgreSQL.

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: ai-studio
spec:
  jobTargetRef:
    parallelism: 1
    completions: 1
    activeDeadlineSeconds: 86400
    template:
      spec:
        containers:
          - name: ai-studio
            image: python
            imagePullPolicy: IfNotPresent
            env:
              - name: POSTGRESQL_DSN
                value: postgresql://{{ $.Values.global.postgresql.auth.username }}:{{ $.Values.global.postgresql.auth.password }}@{{ $.Release.Name }}-postgresql.{{ $.Release.Namespace }}.svc/{{ $.Values.global.postgresql.auth.database }}?sslmode=disable
            command:
              - /bin/bash
              - -c
              - >-
                pip3 install -r /app/requirements.txt && python3 /app/main.py
            volumeMounts:
              - name: app
                mountPath: /app/
        volumes:
          - name: app
            configMap:
              name: ai-studio-code
        restartPolicy: Never
    backoffLimit: 0

  pollingInterval: 5                          # Optional. Default: 30 seconds
  minReplicaCount: 0                          # Optional. Default: 0
  maxReplicaCount: 100                        # Optional. Default: 100

  rollout:
    strategy: gradual
    propagationPolicy: foreground

  triggers:
    - type: postgresql
      metadata:
        connectionFromEnv: POSTGRESQL_DSN
        query: "SELECT COUNT(*)::decimal FROM jobs WHERE state='queued'"
        targetQueryValue: "0"

On my local environment, pollingInterval is set to 5.

Expected Behavior

No connection leaks

Actual Behavior

KEDA opens a new connection every 5 seconds and never closes it.

# SELECT COUNT(*), state, query FROM pg_stat_activity GROUP BY state, query;
 count | state  |                                   query                                    
-------+--------+----------------------------------------------------------------------------
     1 | active | SELECT COUNT(*), state, query FROM pg_stat_activity GROUP BY state, query;
    78 | idle   | SELECT COUNT(*)::decimal FROM jobs WHERE state='queued'
     5 |        | 
(3 rows)

Steps to Reproduce the Problem

  1. Set up a PostgreSQL database
  2. Set up a ScaledJob (the example above can be used)
  3. Wait until the connection limit is reached

Logs from KEDA operator

2022-12-13T17:28:56Z	INFO	scaleexecutor	Scaling Jobs	{"scaledJob.Name": "ai-studio", "scaledJob.Namespace": "ai-studio-demo", "Number of running Jobs": 0}
2022-12-13T17:28:56Z	INFO	scaleexecutor	Scaling Jobs	{"scaledJob.Name": "ai-studio", "scaledJob.Namespace": "ai-studio-demo", "Number of pending Jobs ": 0}
2022-12-13T17:29:01Z	ERROR	postgresql_scaler	Found error pinging postgreSQL: pq: remaining connection slots are reserved for non-replication superuser connections	{"type": "ScaledJob", "namespace": "ai-studio-demo", "name": "ai-studio", "error": "pq: remaining connection slots are reserved for non-replication superuser connections"}
github.com/kedacore/keda/v2/pkg/scalers.getConnection
	/workspace/pkg/scalers/postgresql_scaler.go:154
github.com/kedacore/keda/v2/pkg/scalers.NewPostgreSQLScaler
	/workspace/pkg/scalers/postgresql_scaler.go:48
github.com/kedacore/keda/v2/pkg/scaling.buildScaler
	/workspace/pkg/scaling/scale_handler.go:656
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).buildScalers.func1
	/workspace/pkg/scaling/scale_handler.go:532
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).buildScalers
	/workspace/pkg/scaling/scale_handler.go:536
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).performGetScalersCache
	/workspace/pkg/scaling/scale_handler.go:266
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).GetScalersCache
	/workspace/pkg/scaling/scale_handler.go:190
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
	/workspace/pkg/scaling/scale_handler.go:341
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop
	/workspace/pkg/scaling/scale_handler.go:162

KEDA Version

2.9.0

Kubernetes Version

1.25

Platform

Other

Scaler Details

Postgresql

Anything else?

Slack: https://kubernetes.slack.com/archives/CKZJ36A5D/p1670952481987269

@jkroepke jkroepke added the bug Something isn't working label Dec 13, 2022
@tomkerkhove tomkerkhove moved this to Proposed in Roadmap - KEDA Core Dec 13, 2022
@jkroepke
Author

It seems to be a regression in 2.9.0; 2.8.1 does not have this issue:

postgres=# SELECT COUNT(*), state, query FROM pg_stat_activity GROUP BY state, query;
 count | state  |                                   query                                    
-------+--------+----------------------------------------------------------------------------
     1 | active | SELECT COUNT(*), state, query FROM pg_stat_activity GROUP BY state, query;
     1 | idle   | SELECT COUNT(*)::decimal FROM jobs WHERE state='queued'
     5 |        | 
(3 rows)

@zroubalik zroubalik changed the title connections leak in postgresql scaler with scaledJob Connections leak in scalers defined in ScaledJob Dec 14, 2022
@zroubalik
Member

Thanks for reporting. It seems the scalers cache is not handled properly for ScaledJobs.
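
To illustrate the kind of failure this would cause, here is a minimal, self-contained Go sketch. It is not KEDA's code and not the actual fix: `fakeDB`, `fakeScaler`, and `scalerCache` are hypothetical stand-ins. The sketch assumes a scaler holds one connection until `Close` is called, and compares rebuilding the scaler on every poll with reusing a cached one.

```go
package main

import "fmt"

// fakeDB stands in for a database server and counts open connections.
type fakeDB struct{ open int }

// fakeScaler is a hypothetical scaler that holds one connection until Close.
type fakeScaler struct {
	db     *fakeDB
	closed bool
}

// newFakeScaler simulates opening a connection when a scaler is built.
func newFakeScaler(db *fakeDB) *fakeScaler {
	db.open++
	return &fakeScaler{db: db}
}

// Close releases the connection; calling it twice is harmless.
func (s *fakeScaler) Close() {
	if !s.closed {
		s.closed = true
		s.db.open--
	}
}

// pollLeaky rebuilds the scaler on every poll and never closes the old one.
func pollLeaky(db *fakeDB, polls int) {
	for i := 0; i < polls; i++ {
		_ = newFakeScaler(db)
	}
}

// scalerCache keeps a single scaler alive between polls.
type scalerCache struct {
	db     *fakeDB
	scaler *fakeScaler
}

// get returns the cached scaler, building it only on first use.
func (c *scalerCache) get() *fakeScaler {
	if c.scaler == nil {
		c.scaler = newFakeScaler(c.db)
	}
	return c.scaler
}

// refresh closes the old scaler before building a replacement.
func (c *scalerCache) refresh() *fakeScaler {
	if c.scaler != nil {
		c.scaler.Close()
	}
	c.scaler = newFakeScaler(c.db)
	return c.scaler
}

// pollCached reuses one scaler across all polls.
func pollCached(db *fakeDB, polls int) {
	cache := &scalerCache{db: db}
	for i := 0; i < polls; i++ {
		_ = cache.get()
	}
}

func main() {
	leaky, cached := &fakeDB{}, &fakeDB{}
	pollLeaky(leaky, 78)
	pollCached(cached, 78)
	fmt.Println("leaky:", leaky.open, "cached:", cached.open) // leaky: 78 cached: 1
}
```

The two counts mirror the `pg_stat_activity` output above: 78 idle connections on 2.9.0 versus a single one on 2.8.1.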

@zroubalik zroubalik self-assigned this Dec 14, 2022
Repository owner moved this from Proposed to Ready To Ship in Roadmap - KEDA Core Dec 15, 2022
@JorTurFer JorTurFer moved this from Ready To Ship to Done in Roadmap - KEDA Core Dec 16, 2022