[Filebeat] Panic in K8s autodiscover #21843

Closed
boernd opened this issue Oct 15, 2020 · 6 comments · Fixed by #21880
Labels: bug, Team:Platforms

@boernd
Contributor

boernd commented Oct 15, 2020

After upgrading from 7.5.2 to 7.9.2 I observe sporadic container crashes with the following stacktrace:

fatal error: concurrent map read and map write

goroutine 5766 [running]:
runtime.throw(0x3d8cd28, 0x21)
	/usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc0009c9350 sp=0xc0009c9320 pc=0x16577d2
runtime.mapaccess2_faststr(0x38d5520, 0xc0016ef470, 0xc0016f6380, 0xe, 0xc00249cb40, 0x0)
	/usr/local/go/src/runtime/map_faststr.go:116 +0x47c fp=0xc0009c93c0 sp=0xc0009c9350 pc=0x163616c
github.com/elastic/beats/v7/libbeat/common/kubernetes/k8skeystore.(*KubernetesKeystoresRegistry).GetKeystore(0xc003adbc20, 0xc005fbade0, 0x0, 0x0)
	/go/src/github.com/elastic/beats/libbeat/common/kubernetes/k8skeystore/kubernetes_keystore.go:79 +0x110 fp=0xc0009c9498 sp=0xc0009c93c0 pc=0x2b15240
github.com/elastic/beats/v7/libbeat/autodiscover.Builders.GetConfig(0x0, 0x0, 0x0, 0x4255120, 0xc003adbc20, 0xc005fbade0, 0xc003adbc20, 0xc006850c30, 0x0)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/builder.go:102 +0x281 fp=0xc0009c9550 sp=0xc0009c9498 pc=0x1e7e0a1
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*Provider).publish(0xc000d44e70, 0xc006850c30)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go:148 +0x236 fp=0xc0009c9620 sp=0xc0009c9550 pc=0x2b17436
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*Provider).publish-fm(0xc006850c30)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go:141 +0x34 fp=0xc0009c9640 sp=0xc0009c9620 pc=0x2b1fd84
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).emitEvents(0xc001967c00, 0xc0001c5000, 0x3d392f8, 0x4, 0xc0016b02c0, 0x1, 0x1, 0xc00166a380, 0x1, 0x1)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:428 +0xa82 fp=0xc0009c9f50 sp=0xc0009c9640 pc=0x2b1be92
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).emit(0xc001967c00, 0xc0001c5000, 0x3d392f8, 0x4)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:270 +0x95 fp=0xc0009c9fb0 sp=0xc0009c9f50 pc=0x2b1b385
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).OnUpdate.func1()
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:142 +0x48 fp=0xc0009c9fe0 sp=0xc0009c9fb0 pc=0x2b1fb58
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc0009c9fe8 sp=0xc0009c9fe0 pc=0x168b5a1
created by time.goFunc
	/usr/local/go/src/time/sleep.go:168 +0x44

I use Kubernetes autodiscover with a lot of conditions, e.g.

  filebeat.autodiscover:
    providers:
      - type: kubernetes
        hints.enabled: false
        templates:
          - condition:
              or:
                - equals:
                    kubernetes.container.name: x
                - equals:
                    kubernetes.container.name: y
                - equals:
                    kubernetes.container.name: z
            config:
              - type: container
                ignore_older: 10m
                multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\s'
                multiline.negate: true
                multiline.match: after
                paths:
                  - /var/lib/docker/containers/${data.kubernetes.container.id}/*-json.log
                  - /var/lib/docker/containers/${data.kubernetes.container.id}/*-json.log.*
                fields:
                  cloud_region: "xyz"
                  event:
                    category: "xyz"
                fields_under_root: true
                tail_files: true
                clean_removed: true
                scan_frequency: "5s"
[...]

Filebeat Version: 7.9.2

Operating System: Official docker image running on OpenShift 3.11

Discuss Forum URL: https://discuss.elastic.co/t/filebeat-panic-in-k8s-autodiscover/252144

Steps to Reproduce: Not sure how to reproduce it at the moment. Since the rollout of the new version (~14 hour timeframe) it has happened three times (a total of 202 Filebeat pods are deployed and running as a daemonset).

@botelastic bot added the needs_team label Oct 15, 2020
@andresrc added the Team:Platforms label Oct 16, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@boernd
Contributor Author

boernd commented Oct 26, 2020

@ChrsMark Hm, after upgrading it still panics but the stacktrace is slightly different (libbeat/autodiscover/template.Mapper.GetConfig):

fatal error: concurrent map writes

goroutine 9932 [running]:
runtime.throw(0x3d6332a, 0x15)
	/usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc000a31338 sp=0xc000a31308 pc=0x1657892
runtime.mapassign_faststr(0x38d5a80, 0xc002239aa0, 0x3d42efb, 0x9, 0x60b16e0)
	/usr/local/go/src/runtime/map_faststr.go:211 +0x3f7 fp=0xc000a313a0 sp=0xc000a31338 pc=0x1636637
github.com/elastic/beats/v7/libbeat/common/kubernetes/k8skeystore.(*KubernetesKeystoresRegistry).GetKeystore(0xc0016d8800, 0xc003109710, 0x3b9ebe0, 0xc000ac7458)
	/go/src/github.com/elastic/beats/libbeat/common/kubernetes/k8skeystore/kubernetes_keystore.go:83 +0x22a fp=0xc000a31478 sp=0xc000a313a0 pc=0x2b1541a
github.com/elastic/beats/v7/libbeat/autodiscover/template.Mapper.GetConfig(0xc00056e400, 0x22, 0x40, 0x42afb20, 0xc0008eba00, 0x4255640, 0xc0016d8800, 0xc003109710, 0xc000b0a000, 0x7fbe18d0d108, ...)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/template/config.go:95 +0x413 fp=0xc000a31550 sp=0xc000a31478 pc=0x1e7fb63
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*Provider).publish(0xc002232420, 0xc003109710)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go:143 +0xae fp=0xc000a31620 sp=0xc000a31550 pc=0x2b1736e
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*Provider).publish-fm(0xc003109710)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/kubernetes.go:141 +0x34 fp=0xc000a31640 sp=0xc000a31620 pc=0x2b1fe44
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).emitEvents(0xc0023e6480, 0xc000184c00, 0x3d39878, 0x4, 0xc000ab8f20, 0x1, 0x1, 0xc00240e700, 0x1, 0x1)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:428 +0xa82 fp=0xc000a31f50 sp=0xc000a31640 pc=0x2b1bf52
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).emit(0xc0023e6480, 0xc000184c00, 0x3d39878, 0x4)
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:270 +0x95 fp=0xc000a31fb0 sp=0xc000a31f50 pc=0x2b1b445
github.com/elastic/beats/v7/libbeat/autodiscover/providers/kubernetes.(*pod).OnDelete.func1()
	/go/src/github.com/elastic/beats/libbeat/autodiscover/providers/kubernetes/pod.go:174 +0x58 fp=0xc000a31fe0 sp=0xc000a31fb0 pc=0x2b1fce8
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc000a31fe8 sp=0xc000a31fe0 pc=0x168b661
created by time.goFunc
	/usr/local/go/src/time/sleep.go:168 +0x44

Shall I open a new issue?

@ChrsMark
Member

Hi @boernd! What kind of upgrade did you perform? The fix should be in with 7.9.3 https://www.elastic.co/guide/en/beats/libbeat/7.9/release-notes-7.9.3.html

The line the error points to is the one that wrote to the map before the fix:

kr.kubernetesKeystores["namespace"] = k8sKeystore

@boernd
Contributor Author

boernd commented Oct 26, 2020

Strange, pod description gives me

    Image:         docker.elastic.co/beats/filebeat:7.9.3

Upgrade was just updating the Helm values.yaml but will check again tomorrow just to be sure.

@boernd
Contributor Author

boernd commented Oct 26, 2020

@ChrsMark The line mentioned above is still present in the v7.9.3 tag? I can find the commit in the 7.9 branch, but not in the tagged version.

@ChrsMark
Member

@boernd Sorry for the confusion here. I investigated this and found that the version was cut before the backport was merged so the fix didn't make it to 7.9.3. It's our fault that it was included in the release notes and we are looking into it. Sorry again for the confusion.

Not sure if you can wait for 7.10 (it should be out soonish), but you can also try one of the build candidates, such as docker.elastic.co/beats/filebeat:7.10.0-SNAPSHOT.
