Goroutine leak in Filebeat with Kubernetes autodiscovery #23658
Pinging @elastic/integrations (Team:Integrations)
Hello @crisdarocha, I've tested the same configuration with Filebeat version 7.10.2. There were no crashes due to the race condition for 15 hours, but goroutines keep piling up, so it doesn't fix the leak. I've collected some more debug information from the pod running 7.10.2 and am attaching it here, since the upload functionality on the support website didn't work for me.
Hi @spetrashov and thank you for reporting this! First of all, the fatal error reported should be fixed by #21880. Also, I don't think the fatal error is related to any possible leak or increase in the number of goroutines. From the heap graph you sent I see that the heap size is ~43MB, so I cannot see anything suspicious there. I'm putting the graphs you posted here in PNG format for quicker access. @jsoriano, I think you have dealt with some leak cases in the past; does this one look familiar?
The initially reported error is fixed by #21880, and the possible memory leak should not be related, so I'm closing this one; let's continue in a separate issue if needed.
+1, this possible leak is not related to this fatal error trace.
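For context, the reported "concurrent map read and map write" (see the trace in the issue description below) is Go's runtime detecting an unguarded map being read in GetKeystore while another goroutine writes to the same map. The sketch below is hypothetical, not the actual kubernetes_keystore.go code: the registry type and its keystores field are illustrative names, and it only shows the racy access pattern together with the sync.RWMutex guard that a fix of this kind typically adds.

package main

import "sync"

// registry is a hypothetical stand-in for a keystore registry whose map is
// accessed from several goroutines (e.g. autodiscover event handlers).
type registry struct {
	mu        sync.RWMutex      // guards keystores
	keystores map[string]string // namespace -> keystore (simplified)
}

// getKeystore reads the map under a read lock. Without mu, concurrent calls
// to getKeystore and addKeystore trigger Go's
// "fatal error: concurrent map read and map write".
func (r *registry) getKeystore(ns string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	ks, ok := r.keystores[ns]
	return ks, ok
}

// addKeystore writes the map under the write lock.
func (r *registry) addKeystore(ns, ks string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.keystores[ns] = ks
}

func main() {
	r := &registry{keystores: map[string]string{}}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); r.addKeystore("ns", "ks") }()
		go func() { defer wg.Done(); r.getKeystore("ns") }()
	}
	wg.Wait()
}

With the mutex in place this program runs cleanly; removing the lock calls reproduces the class of crash shown in the trace.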
Yes, these goroutines look related. In the metrics I see that harvesters are being started and stopped, yet the goroutine count keeps growing. So it actually seems that something is not being stopped when harvesters are. We will have to check whether this is a regression or something that escaped the fixes we did. It doesn't seem too serious if the memory usage is not very high, but we should still take a look.
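As a general way to confirm this kind of leak independently of Beats' internal metrics, any Go program can report its goroutine count and dump goroutine stacks through the standard runtime and runtime/pprof packages. The following is a minimal, self-contained sketch (not Filebeat code) that produces roughly the kind of data attached to this issue:

package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
	"time"
)

func main() {
	// Log the goroutine count periodically; a count that grows monotonically
	// while the workload stays steady is the usual symptom of a leak.
	go func() {
		for range time.Tick(30 * time.Second) {
			log.Printf("goroutines: %d", runtime.NumGoroutine())
		}
	}()

	// Dump all goroutine stacks (debug=2 gives full stacks), comparable to
	// the /debug/pprof/goroutine output attached above.
	time.Sleep(time.Minute)
	pprof.Lookup("goroutine").WriteTo(os.Stdout, 2)
}

Comparing two such dumps taken some time apart shows which goroutines are accumulating and where they were started.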
Let's keep this open; I will update the description.
I can also reproduce the goroutine leak with Docker autodiscover; it seems to have been happening since 7.8.0. What I see is that since this version, two additional …
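For illustration of the class of leak referenced here and in #12106 and #11263 (this is hypothetical code, not the Beats implementation): a per-harvester helper goroutine started without any stop signal keeps running after the harvester closes, while a variant that returns a stop function does not.

package main

import "time"

// leakyStart launches a ticker goroutine with no way to stop it: every call
// leaks one goroutine (and one ticker) after the caller is done with it.
func leakyStart(work func()) {
	go func() {
		t := time.NewTicker(time.Second)
		for range t.C {
			work()
		}
	}()
}

// start returns a stop function; the goroutine exits when stop is called,
// which is a common way to avoid this class of leak.
func start(work func()) (stop func()) {
	done := make(chan struct{})
	go func() {
		t := time.NewTicker(time.Second)
		defer t.Stop()
		for {
			select {
			case <-t.C:
				work()
			case <-done:
				return
			}
		}
	}()
	return func() { close(done) }
}

func main() {
	stop := start(func() {})
	time.Sleep(2 * time.Second)
	stop() // without this (or when using leakyStart), the goroutine count only grows
}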
For 3 days I've been testing, in a test environment, the latest Filebeat snapshot that includes a fix for this issue, and the number of goroutines remains stable 👍🏻
Dear Beats team!
We have a possible memory leak in Filebeat (identified in 7.9.3 at least).
We are running Filebeat as a DaemonSet in a Kubernetes cluster with autodiscovery, and there is a memory leak that causes the pod to be restarted due to concurrent map read and write errors.
Update: there actually seems to be a goroutine leak in Filebeat, possibly related to the family of leaks seen in #12106 and #11263, but it is not related to the race condition originally reported.
Check if this is a regression or something that escaped #11263.
Trace originally reported:
fatal error: concurrent map read and map write
goroutine 1234 [running]:
runtime.throw()
/usr/local/go/src/runtime/panic.go:1116
runtime.mapaccess2_faststr()
/usr/local/go/src/runtime/map_faststr.go:116
github.com/elastic/beats/v7/libbeat/common/kubernetes/k8skeystore.(*KubernetesKeystoresRegistry).GetKeystore()
/go/src/github.com/elastic/beats/libbeat/common/kubernetes/k8skeystore/kubernetes_keystore.go:79
github.com/elastic/beats/v7/libbeat/autodiscover.Builders.GetConfig()
/go/src/github.com/elastic/beats/libbeat/autodiscover/builder.go:102
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*Provider).publish()
/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go:148
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*Provider).publish-fm()
/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go:141
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).emitEvents()
/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:428
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).emit()
/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:270
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).OnUpdate.func1()
/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:142
runtime.goexit()
I hope this helps with tracking down the issue!