Thanos, Prometheus and Golang version used:
thanos, version 0.23.1 (branch: HEAD, revision: 5327cd8)
build user: root@0acc901868e9
build date: 20211005-12:08:29
go version: go1.16.8
platform: linux/amd64
Object Storage Provider:
S3
What happened:
The number of goroutines in the Thanos query component explodes as the number of requests grows.
Thanos query stacks up all incoming queries and never returns responses. No logs are emitted because the goroutines are blocked on a semaphore. See the goroutine pprof: pprof.thanos.goroutine.005.pb.gz
What you expected to happen:
All requests should be served and the number of goroutines should not grow without bound.
How to reproduce it (as minimally and precisely as possible):
It is difficult to reproduce reliably. The only way we found is to flood the query component with a large number of requests.
Full logs to relevant components:
No logs are emitted while the partial deadlock is in effect.
Anything else we need to know:
The issue is fixed by #4795
Can this fix be backported to v0.23.1 to avoid outages? This is critical for production environments. Thanks for your support, and thanks @GiedriusS for the fix 👍