Report dataset metrics in monitoring log reporter #25727

Merged


andrewkroh
Member

What does this PR do?

This adds periodic reporting of "dataset" metrics to the logs. This can be disabled by configuring logging.metrics.namespaces. These are the same metrics available from the HTTP monitoring endpoint under the /dataset path. This primarily improves visibility of Filebeat and Metricbeat because they make use of the "dataset" namespace for input and metricset instance metrics.

Why is it important?

Prior to 7.10 the Filebeat input metrics were reported in the logged metrics, but this was lost when the metrics moved to the "dataset" namespace. It was a useful feature to have for remote debugging.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Logs

2021-05-16T14:20:17.885-0400    INFO    [monitoring]    log/log.go:183  Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":60,"time":{"ms":60}},"total":{"ticks":250,"time":{"ms":250},"value":250},"user":{"ticks":190,"time":{"ms":190}}},"info":{"ephemeral_id":"caa6d941-b4ad-44fa-a5b2-6b07e41427d2","uptime":{"ms":30055}},"memstats":{"gc_next":35415664,"memory_alloc":17979752,"memory_sys":77415424,"memory_total":75690904,"rss":86929408},"runtime":{"goroutines":62}},"filebeat":{"events":{"active":534,"added":543,"done":9},"harvester":{"open_files":8,"running":8,"started":8}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"active":0},"type":"elasticsearch"},"pipeline":{"clients":1,"events":{"active":528,"filtered":15,"published":528,"retry":150,"total":543},"queue":{"max_events":4096}}},"registrar":{"states":{"current":9,"update":9},"writes":{"success":9,"total":9}},"system":{"cpu":{"cores":12},"load":{"1":1.6616,"15":2.3765,"5":2.1978,"norm":{"1":0.1385,"15":0.198,"5":0.1831}}}}, "dataset": 
{"069dc32a-b71f-4344-a600-fa06d5071036":{"last_event_published_time":"2021-05-16T14:19:47.918Z","last_event_timestamp":"2021-05-16T14:19:47.918Z","name":"/Users/akroh/go/src/github.com/elastic/beats/x-pack/filebeat/module/panw/panos/test/pan_inc_traffic_ietf.log","read_offset":51624,"size":51624,"start_time":"2021-05-16T14:19:47.886Z"},"0a414085-71c7-4e62-86c8-d4ef09a2292b":{"last_event_published_time":"2021-05-16T14:19:47.888Z","last_event_timestamp":"2021-05-16T14:19:47.888Z","name":"/Users/akroh/go/src/github.com/elastic/beats/x-pack/filebeat/module/panw/panos/test/pan_inc_other.log","read_offset":6217,"size":6217,"start_time":"2021-05-16T14:19:47.884Z"},"12be14ea-a067-480f-9b3a-7f043a826068":{"last_event_published_time":"2021-05-16T14:19:47.886Z","last_event_timestamp":"2021-05-16T14:19:47.885Z","name":"/Users/akroh/go/src/github.com/elastic/beats/x-pack/filebeat/module/panw/panos/test/global_protect.log","read_offset":1641,"size":1641,"start_time":"2021-05-16T14:19:47.884Z"},"12cf5432-b0ff-4cb6-8db3-098e54b9cc4b":{"last_event_published_time":"2021-05-16T14:19:47.898Z","last_event_timestamp":"2021-05-16T14:19:47.898Z","name":"/Users/akroh/go/src/github.com/elastic/beats/x-pack/filebeat/module/panw/panos/test/userid.log","read_offset":2626,"size":2626,"start_time":"2021-05-16T14:19:47.890Z"},"8c19b7d1-a670-4e91-9950-6a96e0c49457":{"last_event_published_time":"2021-05-16T14:19:47.920Z","last_event_timestamp":"2021-05-16T14:19:47.920Z","name":"/Users/akroh/go/src/github.com/elastic/beats/x-pack/filebeat/module/panw/panos/test/traffic.log","read_offset":46385,"size":46385,"start_time":"2021-05-16T14:19:47.888Z"},"926dd152-78a7-45e8-abea-778e22794590":{"last_event_published_time":"2021-05-16T14:19:47.906Z","last_event_timestamp":"2021-05-16T14:19:47.905Z","name":"/Users/akroh/go/src/github.com/elastic/beats/x-pack/filebeat/module/panw/panos/test/pan_inc_traffic.log","read_offset":36298,"size":36298,"start_time":"2021-05-16T14:19:47.885Z"},"c089fcd5-1822-4017-9739-489
0d65ca098":{"last_event_published_time":"2021-05-16T14:19:47.915Z","last_event_timestamp":"2021-05-16T14:19:47.915Z","name":"/Users/akroh/go/src/github.com/elastic/beats/x-pack/filebeat/module/panw/panos/test/threat.log","read_offset":41586,"size":41586,"start_time":"2021-05-16T14:19:47.887Z"},"ddec21fc-29fb-4ed5-b23c-2cc6f6f09d12":{"last_event_published_time":"2021-05-16T14:19:47.914Z","last_event_timestamp":"2021-05-16T14:19:47.914Z","name":"/Users/akroh/go/src/github.com/elastic/beats/x-pack/filebeat/module/panw/panos/test/pan_inc_threat.log","read_offset":39870,"size":39870,"start_time":"2021-05-16T14:19:47.884Z"}}}}
2021-05-16T14:20:47.883-0400    INFO    [monitoring]    log/log.go:183  Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":66,"time":{"ms":6}},"total":{"ticks":262,"time":{"ms":12},"value":262},"user":{"ticks":196,"time":{"ms":6}}},"info":{"ephemeral_id":"caa6d941-b4ad-44fa-a5b2-6b07e41427d2","uptime":{"ms":60054}},"memstats":{"gc_next":35415664,"memory_alloc":19664576,"memory_total":77375728,"rss":87179264},"runtime":{"goroutines":62}},"filebeat":{"harvester":{"open_files":8,"running":8}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"active":0}},"pipeline":{"clients":1,"events":{"active":528,"retry":50}}},"registrar":{"states":{"current":9}},"system":{"load":{"1":1.6016,"15":2.3442,"5":2.1299,"norm":{"1":0.1335,"15":0.1954,"5":0.1775}}}}}}
2021-05-16T14:23:35.401-0400    INFO    [monitoring]    log/log.go:191  Total metrics   {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":107,"time":{"ms":107}},"total":{"ticks":375,"time":{"ms":375},"value":375},"user":{"ticks":268,"time":{"ms":268}}},"info":{"ephemeral_id":"caa6d941-b4ad-44fa-a5b2-6b07e41427d2","uptime":{"ms":227570}},"memstats":{"gc_next":35912432,"memory_alloc":22875816,"memory_sys":77415424,"memory_total":86230296,"rss":92229632},"runtime":{"goroutines":10}},"filebeat":{"events":{"active":535,"added":544,"done":9},"harvester":{"closed":8,"open_files":0,"running":0,"skipped":0,"started":8},"input":{"log":{"files":{"renamed":0,"truncated":0}},"netflow":{"flows":0,"packets":{"dropped":0,"received":0}}}},"libbeat":{"config":{"module":{"running":0,"starts":0,"stops":0},"reloads":0,"scans":0},"output":{"events":{"acked":0,"active":0,"batches":0,"dropped":0,"duplicates":0,"failed":0,"toomany":0,"total":0},"read":{"bytes":0,"errors":0},"type":"elasticsearch","write":{"bytes":0,"errors":0}},"pipeline":{"clients":0,"events":{"active":529,"dropped":0,"failed":0,"filtered":15,"published":529,"retry":400,"total":544},"queue":{"acked":0,"max_events":4096}}},"registrar":{"states":{"cleanup":0,"current":9,"update":9},"writes":{"fail":0,"success":9,"total":9}},"system":{"cpu":{"cores":12},"load":{"1":3.0503,"15":2.4072,"5":2.4136,"norm":{"1":0.2542,"15":0.2006,"5":0.2011}}}}}}
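
For remote debugging, the "dataset" section of these lines can be pulled out with a small script. The following is a minimal sketch, not part of this PR: it assumes the JSON payload starts at the first `{` on the line, and that "dataset" sits next to "metrics" under "monitoring", as in the first sample above. The names `parseDataset`, `datasetEntry`, and `sampleLine` are hypothetical, and the sample line is abbreviated from the log above (shortened file name, most metrics dropped).

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// sampleLine is an abbreviated version of the first log line above.
const sampleLine = `2021-05-16T14:20:17.885-0400 INFO [monitoring] log/log.go:183 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"runtime":{"goroutines":62}}}, "dataset": {"12cf5432-b0ff-4cb6-8db3-098e54b9cc4b":{"name":"userid.log","read_offset":2626,"size":2626}}}}`

// datasetEntry holds a subset of the per-file fields seen in the sample log.
type datasetEntry struct {
	Name       string `json:"name"`
	ReadOffset int64  `json:"read_offset"`
	Size       int64  `json:"size"`
}

// logPayload models only the part of the payload this sketch needs.
type logPayload struct {
	Monitoring struct {
		Dataset map[string]datasetEntry `json:"dataset"`
	} `json:"monitoring"`
}

// parseDataset decodes the JSON payload of one log line and returns the
// dataset entries keyed by ID. The map is empty when the line carries no
// "dataset" section.
func parseDataset(line string) (map[string]datasetEntry, error) {
	start := strings.Index(line, "{")
	if start < 0 {
		return nil, fmt.Errorf("no JSON payload found in line")
	}
	var p logPayload
	if err := json.Unmarshal([]byte(line[start:]), &p); err != nil {
		return nil, err
	}
	return p.Monitoring.Dataset, nil
}

func main() {
	entries, err := parseDataset(sampleLine)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Sort the IDs so the output order is stable.
	ids := make([]string, 0, len(entries))
	for id := range entries {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	for _, id := range ids {
		e := entries[id]
		fmt.Printf("%s %s read_offset=%d size=%d\n", id, e.Name, e.ReadOffset, e.Size)
	}
}
```

A line without a "dataset" key, such as the second sample above, simply yields an empty map.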

@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels May 16, 2021
@elasticmachine
Collaborator

elasticmachine commented May 16, 2021

💚 Build Succeeded

Build stats

  • Start Time: 2021-08-12T11:33:18.279+0000

  • Duration: 147 min 20 sec

  • Commit: 87b1c07

Test stats 🧪

Test Results
Failed 0
Passed 53000
Skipped 5318
Total 58318

💚 Flaky test report

Tests succeeded.

@andrewkroh andrewkroh force-pushed the feature/libbeat/log-report-dataset branch 3 times, most recently from 0435171 to 1a4956d Compare May 16, 2021 23:44
@ruflin
Contributor

ruflin commented May 17, 2021

@simitt This might affect apm?

@simitt
Contributor

simitt commented May 17, 2021

> This can be disabled by configuring logging.metrics.namespaces.

@andrewkroh I haven't tested this, but does that mean that logging.metrics.enabled: false will no longer have the same effect to disable metrics in logs?

@andrewkroh
Member Author

Hi @simitt, logging.metrics.enabled: false will continue to have the same behavior in that it entirely disables the log reporter component of the monitoring framework.

The logging.metrics.namespaces setting lets you select which metric namespaces are reported. The old implicit behavior was logging.metrics.namespaces: [stats], where stats holds the same metrics as the HTTP endpoint /stats. This PR adds reporting of the dataset namespace to that list. I think only Filebeat and Metricbeat actually make use of this namespace (e.g. monitoring.GetNamespace("dataset").GetRegistry()).
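
The selection behavior described above can be sketched in isolation. This is a hypothetical model, not the libbeat implementation: `registry`, `snapshot`, and `defaultNamespaces` are made-up names, and reporting the stats namespace under the "metrics" key is an assumption read off the sample logs in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// registry is a hypothetical stand-in for a snapshot of one metrics namespace.
type registry map[string]interface{}

// defaultNamespaces models the old implicit behavior: only "stats" is logged.
var defaultNamespaces = []string{"stats"}

// snapshot returns the metrics of the selected namespaces only.
// An empty selection falls back to defaultNamespaces.
func snapshot(all map[string]registry, selected []string) map[string]registry {
	if len(selected) == 0 {
		selected = defaultNamespaces
	}
	out := make(map[string]registry)
	for _, ns := range selected {
		reg, ok := all[ns]
		if !ok || len(reg) == 0 {
			continue // skip unknown or empty namespaces
		}
		key := ns
		if ns == "stats" {
			// Assumption: stats appear under "metrics" in the log payload.
			key = "metrics"
		}
		out[key] = reg
	}
	return out
}

func main() {
	all := map[string]registry{
		"stats":   {"goroutines": 62},
		"dataset": {"input-1": map[string]interface{}{"read_offset": 51624}},
	}
	// First the implicit default, then an explicit selection of both namespaces.
	for _, sel := range [][]string{nil, {"stats", "dataset"}} {
		b, err := json.Marshal(map[string]interface{}{"monitoring": snapshot(all, sel)})
		if err != nil {
			fmt.Println("marshal error:", err)
			return
		}
		fmt.Println(string(b))
	}
}
```

Modeling the selection as a plain list keeps the default case and the opt-in case on the same code path, which is the property the setting relies on.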

@andrewkroh
Member Author

run tests

@mergify
Contributor

mergify bot commented May 25, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feature/libbeat/log-report-dataset upstream/feature/libbeat/log-report-dataset
git merge upstream/master
git push upstream feature/libbeat/log-report-dataset

@andrewkroh andrewkroh requested a review from a team June 9, 2021 23:51
@botelastic

botelastic bot commented Jul 10, 2021

Hi!
We just realized that we haven't looked into this PR in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Jul 10, 2021
@andrewkroh andrewkroh force-pushed the feature/libbeat/log-report-dataset branch from 0a94714 to 0cb9de3 Compare July 12, 2021 20:13
@botelastic botelastic bot removed the Stalled label Jul 12, 2021
@urso urso requested a review from kvch July 15, 2021 14:19
@mergify
Contributor

mergify bot commented Jul 19, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feature/libbeat/log-report-dataset upstream/feature/libbeat/log-report-dataset
git merge upstream/master
git push upstream feature/libbeat/log-report-dataset

Contributor

@kvch kvch left a comment

The "dataset" namespace contains many metrics about every open file in Filebeat. Those metrics were later moved to the dataset endpoint on purpose, because users did not want to see that much information about their files. I am all for this new option, but please do not add "dataset" to the default namespaces.

This adds periodic reporting of "dataset" metrics to the logs. This can be disabled by configuring `logging.metrics.namespaces`. These are the same metrics available from the HTTP monitoring endpoint under the /dataset path. This primarily improves visibility of Filebeat and Metricbeat because they make use of the "dataset" namespace for input and metricset instance metrics.

Prior to 7.10 the Filebeat input metrics were reported in the logged metrics, but this was lost when the metrics moved to the "dataset" namespace.
@andrewkroh andrewkroh force-pushed the feature/libbeat/log-report-dataset branch from 0cb9de3 to 51fc35b Compare August 12, 2021 03:44
@andrewkroh andrewkroh force-pushed the feature/libbeat/log-report-dataset branch from f44af5d to d354f47 Compare August 12, 2021 03:49
@andrewkroh andrewkroh force-pushed the feature/libbeat/log-report-dataset branch from d354f47 to ae08c7a Compare August 12, 2021 03:49
@andrewkroh andrewkroh requested a review from kvch August 12, 2021 03:53
@andrewkroh andrewkroh added the backport-v7.15.0 Automated backport with mergify label Aug 12, 2021
@andrewkroh andrewkroh merged commit 4f7eb22 into elastic:master Aug 13, 2021
mergify bot pushed a commit that referenced this pull request Aug 13, 2021
This adds periodic reporting of "dataset" metrics to the logs (off by default). This can be enabled by setting `logging.metrics.namespaces: [stats, dataset]`. These are the same metrics available from the HTTP monitoring endpoint under the /dataset path. This primarily improves visibility of Filebeat and Metricbeat because they make use of the "dataset" namespace for input and metricset instance metrics.

Prior to 7.10 the Filebeat input metrics were reported in the logged metrics, but this was lost when the metrics moved to the "dataset" namespace.

(cherry picked from commit 4f7eb22)
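
For reference, the opt-in described in this commit message would look roughly like the fragment below in a Beat's configuration file. This is a sketch rather than a tested configuration: `logging.metrics.namespaces` and its value come from the commit message, `logging.metrics.enabled` is the setting discussed earlier in this conversation, and the `period` value is an assumption chosen to match the 30s interval seen in the sample logs.

```yaml
logging.metrics:
  # false disables the log reporter entirely, regardless of namespaces.
  enabled: true
  # Assumed reporting interval, matching "in the last 30s" in the logs above.
  period: 30s
  # Opt in to dataset metrics; leaving this unset keeps the stats-only default.
  namespaces: [stats, dataset]
```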
andrewkroh added a commit that referenced this pull request Aug 13, 2021
This adds periodic reporting of "dataset" metrics to the logs (off by default). This can be enabled by setting `logging.metrics.namespaces: [stats, dataset]`. These are the same metrics available from the HTTP monitoring endpoint under the /dataset path. This primarily improves visibility of Filebeat and Metricbeat because they make use of the "dataset" namespace for input and metricset instance metrics.

Prior to 7.10 the Filebeat input metrics were reported in the logged metrics, but this was lost when the metrics moved to the "dataset" namespace.

(cherry picked from commit 4f7eb22)

Co-authored-by: Andrew Kroh <[email protected]>
Labels
  • backport-v7.15.0 (Automated backport with mergify)
  • libbeat
  • review
  • Team:Elastic-Agent (Label for the Agent team)
5 participants