Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Promtail: pipeline_stage with replace stages causes panic #3353

Closed
AmitBenAmi opened this issue Feb 17, 2021 · 5 comments · Fixed by #3381
Closed

Promtail: pipeline_stage with replace stages causes panic #3353

AmitBenAmi opened this issue Feb 17, 2021 · 5 comments · Fixed by #3381
Labels
type/bug Somehing is not working as expected

Comments

@AmitBenAmi
Copy link

Describe the bug
I'm using Loki with Promtail and wanted to add pipeline_stages to redact some sensitive information (PII logs).
I configured my helm chart to the latest Promtail version (v2.1.0), and set the configuration to have these pipeline_stages (from the values.yaml file):

promtail:
  config:
    snippets:
      pipelineStages:
        - cri: {}
        - replace:
          expression: (?P<Company>([A-Z&][\w,]* )+(I[Nn][Cc](orporated)?|C[Oo](rp(oration)?)?|LLP|llc|LLC|plc|gmbh)\.?(\b|$))
          replace: COMPANY_NAME
        - replace:
          expression: (?P<Email>([a-z0-9_\-.+]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-z0-9-]+\.?)+))([a-z]{2,4}|[0-9]{1,3})(]?))
          replace: EMAIL_ADDRESS

After deploying Promtail's DaemonSet, I saw that the pods are failing with the panic:

panic: runtime error: slice bounds out of range [7186:7166]
goroutine 372 [running]:
github.com/grafana/loki/pkg/logentry/stages.(*replaceStage).getReplacedEntry(0xc00079bea0, 0xc00024cb40, 0x1, 0xa, 0xc0018ac000, 0x246e, 0xc00054e750, 0xc0019c06c0, 0xc0001164b0, 0x13ca4cede73a2c65, ...)
	/src/loki/pkg/logentry/stages/replace.go:188 +0x525
github.com/grafana/loki/pkg/logentry/stages.(*replaceStage).Process(0xc00079bea0, 0xc000fdbd40, 0xc000fdbd70, 0xc0015089e0, 0xc000782f70)
	/src/loki/pkg/logentry/stages/replace.go:142 +0x6db
github.com/grafana/loki/pkg/logentry/stages.(*Pipeline).Process(0xc0007de980, 0xc000fdbd40, 0xc000fdbd70, 0xc0015089e0, 0xc000782f70)
	/src/loki/pkg/logentry/stages/pipeline.go:109 +0x213
github.com/grafana/loki/pkg/logentry/stages.(*Pipeline).Wrap.func1(0xc000fdbd40, 0xc00377dccb4c6adb, 0x13bfdff70, 0x413f820, 0xc0018ac000, 0x246e, 0x0, 0xffffffffffffffff)
	/src/loki/pkg/logentry/stages/pipeline.go:129 +0x110
github.com/grafana/loki/pkg/promtail/api.EntryHandlerFunc.Handle(0xc00079bec0, 0xc000fdbd40, 0xc00377dccb4c6adb, 0x13bfdff70, 0x413f820, 0xc0018ac000, 0x246e, 0x203000, 0xc000fdbd10)
	/src/loki/pkg/promtail/api/types.go:24 +0x65
github.com/grafana/loki/pkg/promtail/api.AddLabelsMiddleware.func1.1(0xc000fdbd10, 0xc00377dccb4c6adb, 0x13bfdff70, 0x413f820, 0xc0018ac000, 0x246e, 0x0, 0x0)
	/src/loki/pkg/promtail/api/types.go:45 +0xa3
github.com/grafana/loki/pkg/promtail/api.EntryHandlerFunc.Handle(0xc000a5cce0, 0xc000fdbd10, 0xc00377dccb4c6adb, 0x13bfdff70, 0x413f820, 0xc0018ac000, 0x246e, 0xc001857e58, 0x8)
	/src/loki/pkg/promtail/api/types.go:24 +0x65
github.com/grafana/loki/pkg/promtail/api.AddLabelsMiddleware.func1.1(0xc000fdbcb0, 0xc00377dccb4c6adb, 0x13bfdff70, 0x413f820, 0xc0018ac000, 0x246e, 0x2941bc0, 0xc000fdbcb0)
	/src/loki/pkg/promtail/api/types.go:45 +0xa3
github.com/grafana/loki/pkg/promtail/api.EntryHandlerFunc.Handle(0xc000a5cda0, 0xc000fdbcb0, 0xc00377dccb4c6adb, 0x13bfdff70, 0x413f820, 0xc0018ac000, 0x246e, 0x0, 0x0)
	/src/loki/pkg/promtail/api/types.go:24 +0x65
github.com/grafana/loki/pkg/promtail/targets/file.(*tailer).readLines(0xc000b84200)
	/src/loki/pkg/promtail/targets/file/tailer.go:150 +0x57c
created by github.com/grafana/loki/pkg/promtail/targets/file.newTailer
	/src/loki/pkg/promtail/targets/file/tailer.go:80 +0x53d

To Reproduce
Steps to reproduce the behavior:

  1. Loki version v2.1.0
  2. Promtail version v2.1.0
  3. Helm chart version 3.0.4

Logs for example:

{app=\"my-app\", container=\"app\", filename=\"/var/log/pods/default-app-568468b57c-kcz89_5f5d6552-2519-4a4c-96fa-c076a10c74c5/app/0.log\", job=\"default/my-app\", logging_level=\"warning\", logging_level_key=\"level\", namespace=\"default\", node_name=\"node1\", pod=\"my-app-568468b57c-kcz89\", pod_template_hash=\"568468b57c\", release=\"my-app\", version=\"4d130726cbaf0d77d33c0fcb75a1a6d0a4b2de77\"}" time=2021-02-17T19:01:07.285662962Z entry="{\"log\":\"10.0.0.8 - - [17/Feb/2021:16:53:03 +0000] \\\"GET /isAlive HTTP/1.1\\\" 200 4 \\\"-\\\" \\\"kube-probe/1.18+\\\"\\n\",\"stream\":\"stdout\",\"time\":\"2021-02-17T16:53:03.542721257Z\"}
{app=\"other-app\", container=\"other-app\", filename=\"/var/log/pods/default_other-app-5c9bbb5f59-lphl8_94e19a95-e7de-4d1e-9bd0-3515bc50fcc5/other-app/0.log\", job=\"default/other-app\", logging_level=\"warning\", logging_level_key=\"level\", namespace=\"default\", node_name=\"node2\", pod=\"other-app-5c9bbb5f59-lphl8\", pod_template_hash=\"5c9bbb5f59\", release=\"other-app\"}" time=2021-02-17T19:01:07.285293446Z entry="{\"log\":\"{\\\"level\\\":\\\"info\\\",\\\"extra\\\":{\\\"executionTime\\\":1},\\\"message\\\":\\\"batch process completed\\\"}\\n\",\"stream\":\"stdout\",\"time\":\"2021-02-17T16:53:13.560236943Z\"}

I'm not sure which log caused the panic, but there are an endless number of logs, so not sure if these 2 caused that (They are the last 2).

Expected behavior
The patterns should work and when the regex is matched, the replace should work

Environment:

  • EKS, Kubernetes v1.18.9-eks-d1db3c
  • Helm
@adityacs
Copy link
Contributor

adityacs commented Feb 22, 2021

@AmitBenAmi Is it possible to share the actual logs which contains Company name and Email Address ?. You can mask the real values. Just wanted to see the pattern. It would be helpful to debug.

@AmitBenAmi
Copy link
Author

@adityacs I have hundreds of logs, and not all of them contain these values (I redact specific information if it exist).
Do you need a log the has this kind of information in it?

@AmitBenAmi
Copy link
Author

AmitBenAmi commented Feb 22, 2021

@adityacs This is a log message for example (containing an email, but couldn't find Company Name, which led me to think that currently the problem is caused only by email regex):

{"level":"debug","message":"get.success","extra":{"expertId":"GUID","executionTime":993,"executionResult":{"email":"[email protected]","id":"ID","expertise":["direct:plus","frontier:general-conversation","general-conversation","plus","plus","plus","plus"],"name":"NAME","givenName":"NAME","picture":null,"roles":["ROLE"],"pools":[],"name":"NAME","someName":"all","lastActiveTimestamp":"2021-02-22T12:47:33.548Z","ranking":{"weight":{"replace":3,"service":4,"supervisor":6,"emails":5,"demo":8,"testing":9,"callback":7},"prioritizationWeight":{"replace":3,"service":4,"supervisor":6,"video":9,"savesDesk":3,"free":2,"plus":5,"*.plus":5,"*.my-plus":5,"*.program":5,"*.emails":5,"*.demo":8,"*.callback":7,"general-conversation":3,"sgeneral-conversation":5,"consumer":5,"splus":5,"conference":5,"welcome":1,"conversation":2,"ssconversation":1,"app":8,"coach":2,"mdpp":5,"mpp":5,"prepaid":5,"ccconversation":5},"strategy":"strategy"},"isActive":true,"availability":{"isAvailable":true,"changedAt":"2021-02-22T12:03:31.470Z","unavailableReason":"","limitReached":false},"assignmentType":"get-next","waitingForSession":true,"waitingForSessionTimestamp":"2021-02-22T12:03:33.520Z"}},"@timestamp":"2021-02-22T12:48:01.417Z"}

I used this configuration (Same as above):

promtail:
  config:
    snippets:
      pipelineStages:
        - cri: {}
        - replace:
          expression: (?P<Company>([A-Z&][\w,]* )+(I[Nn][Cc](orporated)?|C[Oo](rp(oration)?)?|LLP|llc|LLC|plc|gmbh)\.?(\b|$))
          replace: COMPANY_NAME
        - replace:
          expression: (?P<Email>([a-z0-9_\-.+]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-z0-9-]+\.?)+))([a-z]{2,4}|[0-9]{1,3})(]?))
          replace: EMAIL_ADDRESS

These are my logs end from Promtail:

level=info ts=2021-02-22T14:12:46.598608598Z caller=tailer.go:131 component=tailer msg="tail routine: exited" path=/var/log/pods/default_worker-5d7b57f466-bsmv9_9bfd6f8b-51e0-41ec-bded-a2c24d938949/container/0.log
level=info ts=2021-02-22T14:12:46.59862869Z caller=tailer.go:122 component=tailer msg="tail routine: started" path=/var/log/pods/api_prod-7664469498-qqb9b_5b1bec1d-e481-412e-af37-d0b4d575302d/init/0.log
ts=2021-02-22T14:12:46.598672075Z caller=log.go:124 component=tailer level=info msg="Seeked /var/log/pods/api_prod-7664469498-qqb9b_5b1bec1d-e481-412e-af37-d0b4d575302d/init/0.log - &{Offset:384 Whence:0}"
panic: runtime error: slice bounds out of range [258:234]

goroutine 735 [running]:
github.com/grafana/loki/pkg/logentry/stages.(*replaceStage).getReplacedEntry(0xc000b20de0, 0xc0005a41e0, 0x1, 0xa, 0xc001095800, 0x7ca, 0xc0015b72c0, 0xc0020a8280, 0x0, 0xc0050cd7a3a21be3, ...)
	/src/loki/pkg/logentry/stages/replace.go:188 +0x525
github.com/grafana/loki/pkg/logentry/stages.(*replaceStage).Process(0xc000b20de0, 0xc0015b71d0, 0xc0015b7200, 0xc000abdc60, 0xc000770a40)
	/src/loki/pkg/logentry/stages/replace.go:142 +0x6db
github.com/grafana/loki/pkg/logentry/stages.(*Pipeline).Process(0xc00085dd00, 0xc0015b71d0, 0xc0015b7200, 0xc000abdc60, 0xc000770a40)
	/src/loki/pkg/logentry/stages/pipeline.go:109 +0x213
github.com/grafana/loki/pkg/logentry/stages.(*Pipeline).Wrap.func1(0xc0015b71d0, 0xc0050cd7a3a179f0, 0x14e53928f, 0x413f820, 0xc001095800, 0x7ca, 0x0, 0xffffffffffffffff)
	/src/loki/pkg/logentry/stages/pipeline.go:129 +0x110
github.com/grafana/loki/pkg/promtail/api.EntryHandlerFunc.Handle(0xc000b20e00, 0xc0015b71d0, 0xc0050cd7a3a179f0, 0x14e53928f, 0x413f820, 0xc001095800, 0x7ca, 0x203000, 0xc0015b71a0)
	/src/loki/pkg/promtail/api/types.go:24 +0x65
github.com/grafana/loki/pkg/promtail/api.AddLabelsMiddleware.func1.1(0xc0015b71a0, 0xc0050cd7a3a179f0, 0x14e53928f, 0x413f820, 0xc001095800, 0x7ca, 0x0, 0x0)
	/src/loki/pkg/promtail/api/types.go:45 +0xa3
github.com/grafana/loki/pkg/promtail/api.EntryHandlerFunc.Handle(0xc0004e5b20, 0xc0015b71a0, 0xc0050cd7a3a179f0, 0x14e53928f, 0x413f820, 0xc001095800, 0x7ca, 0xc00252de58, 0x7)
	/src/loki/pkg/promtail/api/types.go:24 +0x65
github.com/grafana/loki/pkg/promtail/api.AddLabelsMiddleware.func1.1(0xc0015b7140, 0xc0050cd7a3a179f0, 0x14e53928f, 0x413f820, 0xc001095800, 0x7ca, 0x2941bc0, 0xc0015b7140)
	/src/loki/pkg/promtail/api/types.go:45 +0xa3
github.com/grafana/loki/pkg/promtail/api.EntryHandlerFunc.Handle(0xc0004e5b80, 0xc0015b7140, 0xc0050cd7a3a179f0, 0x14e53928f, 0x413f820, 0xc001095800, 0x7ca, 0x0, 0x0)
	/src/loki/pkg/promtail/api/types.go:24 +0x65
github.com/grafana/loki/pkg/promtail/targets/file.(*tailer).readLines(0xc001788180)
	/src/loki/pkg/promtail/targets/file/tailer.go:150 +0x57c
created by github.com/grafana/loki/pkg/promtail/targets/file.newTailer
	/src/loki/pkg/promtail/targets/file/tailer.go:80 +0x53d

I noticed that there is the first line in the logs containing: msg="tail routine: exited", which is the only one in my whole Promtail log I could find, and this is how I figured which pod is crashing my Promtail.
Hope that is enough data and logs

@cyriltovena
Copy link
Contributor

ping me @adityacs if you don't get the bottom of it seems like you need to add a small test to replicate the issue first.

If you can't then there might be a race.

@AmitBenAmi
Copy link
Author

@slim-bean do you know on which version will it be released?
Maybe a patch?

@chaudum chaudum added the type/bug Somehing is not working as expected label Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Somehing is not working as expected
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants