Enable pprof for elastic-agent and beats #28983

Merged
merged 2 commits into from
Nov 22, 2021

Contributor

@michel-laterman michel-laterman commented Nov 16, 2021

What does this PR do?

Enable the /debug/pprof/ endpoints for all beats that the elastic-agent
starts. Enable the pprof endpoints on the elastic-agent itself if
agent.monitoring.pprof is true (the default). The agent endpoint can be
toggled off in case it is exposed on a network rather than bound to
localhost, a unix socket, or a Windows named pipe.
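
As a rough illustration of the toggle described above, the sketch below registers the standard net/http/pprof handlers on a mux only when a flag is set. It is not the code from this PR: newMonitoringMux, statusFor, and the /stats placeholder route are hypothetical names chosen for the example, and pprofEnabled merely stands in for the agent.monitoring.pprof setting.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newMonitoringMux builds the handler set for a monitoring endpoint.
// pprofEnabled stands in for agent.monitoring.pprof (hypothetical wiring).
func newMonitoringMux(pprofEnabled bool) *http.ServeMux {
	mux := http.NewServeMux()
	// Placeholder for the regular monitoring output; the route is an assumption.
	mux.HandleFunc("/stats", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "{}")
	})
	if pprofEnabled {
		// Handlers exported by the standard library's net/http/pprof package.
		mux.HandleFunc("/debug/pprof/", pprof.Index)
		mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
		mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
		mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
		mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	}
	return mux
}

// statusFor issues an in-memory GET against the mux and returns the status code.
func statusFor(mux *http.ServeMux, path string) int {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	fmt.Println("pprof on: ", statusFor(newMonitoringMux(true), "/debug/pprof/"))
	fmt.Println("pprof off:", statusFor(newMonitoringMux(false), "/debug/pprof/"))
}
```

With the flag off, the pprof paths are simply never registered, so the other routes on the same listener keep working and the profiling data is unreachable.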

Why is it important?

pprof endpoint data will be added to the diagnostics bundle

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation PR HERE
  • I have made corresponding change to the default configuration files
  • [] I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Start the agent with an unaltered config, then query /debug/pprof/ with curl.
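
For anyone who prefers to script this check in Go, here is a minimal sketch of the same request made over a unix socket. Everything in it is an assumption for illustration: the socket is a temporary one created by the example, and the server behind it is a stand-in that registers pprof.Index, not the agent. Against a real agent you would pass its actual socket path to fetchStatus. The sketch assumes a Unix-like system.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/pprof"
	"os"
	"path/filepath"
)

// newUnixClient returns an HTTP client that always dials the given unix
// socket, whatever host appears in the request URL.
func newUnixClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

// fetchStatus requests path over the socket and returns the HTTP status code.
func fetchStatus(socketPath, path string) (int, error) {
	resp, err := newUnixClient(socketPath).Get("http://unix" + path)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	_, _ = io.Copy(io.Discard, resp.Body) // drain the body; only the status matters here
	return resp.StatusCode, nil
}

// demo starts a stand-in server (not the agent) on a temporary socket and
// queries its pprof index.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "pprof-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "agent.sock") // hypothetical socket name

	ln, err := net.Listen("unix", sock)
	if err != nil {
		return 0, err
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	srv := &http.Server{Handler: mux}
	go srv.Serve(ln)
	defer srv.Close()

	return fetchStatus(sock, "/debug/pprof/")
}

func main() {
	code, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("GET /debug/pprof/ ->", code)
}
```

The host part of the URL ("unix") is ignored by the custom dialer; it only has to be syntactically valid.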

Related issues

@michel-laterman michel-laterman added the enhancement, backport-v8.0.0, backport-v7.16.0, and Team:Elastic-Agent-Control-Plane labels Nov 16, 2021
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Nov 16, 2021
@elasticmachine
Collaborator

elasticmachine commented Nov 16, 2021

💚 Build Succeeded

Build stats

  • Start Time: 2021-11-18T17:24:38.366+0000

  • Duration: 88 min 45 sec

  • Commit: 6591a82

Test stats 🧪

Test Results

  • Failed: 0

  • Passed: 7128

  • Skipped: 16

  • Total: 7144

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@michel-laterman
Contributor Author

@ruflin this is part 2 of splitting up #28798; I'll follow up with a third PR that adds the gRPC call and diagnostics commands.

@michel-laterman
Contributor Author

/test

@@ -122,6 +122,7 @@ func (b *Monitor) EnrichArgs(spec program.Spec, pipelineID string, args []string
 		appendix = append(appendix,
 			"-E", "http.enabled=true",
 			"-E", "http.host="+endpoint,
+			"-E", "http.pprof.enabled=true",
Contributor

If monitoring is enabled, do we always enable pprof as well?

Contributor Author

Yes.
Also, as far as my testing shows, even when the agent is not monitoring its beats, the beats still bind to a socket (and pprof will be available).

Contributor

I wonder if this might be unexpected from a user perspective. pprof exposes quite a bit more information than monitoring as it exposes some of the internals of the process. At the same time, whoever gets the monitoring data likely already knows about the internals.

@simitt @scunningham Any concerns with this?

Contributor

Can you help me understand how the current config options play together:

agent.monitoring:
  enabled: false
  logs: false
  metrics: false
  http:
    enabled: true 

Will metrics be exposed via the http endpoint?

We should not enable pprof unless the customer explicitly turns it on. The HTTP interface is not protected, and an attacker could potentially steal secrets from a heap dump.

Contributor Author

http.pprof.enabled is a libbeat setting that we pass to the beats to expose the pprof endpoints over the HTTP endpoint.
When the agent passes it as shown above, the HTTP interface is bound to a local unix socket (or Windows npipe).

@scunningham, when the elastic-agent starts a beat, the beat binds the interface to a socket even if agent.monitoring.enabled is false. Is your request that the pprof endpoints should not be enabled automatically for these beats in that case either, unless explicitly turned on?

Are you saying we automatically enable pprof even if not explicitly enabled? That's not great.
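
To make the opt-in behaviour requested in this thread concrete, here is a small sketch of how the flag list could be built so that http.pprof.enabled is only passed when explicitly requested. This is not what the PR merges (the hunk above appends the flag unconditionally); enrichArgs, contains, and the pprofEnabled parameter are hypothetical, and the endpoint value in main is made up.

```go
package main

import "fmt"

// enrichArgs is a hypothetical, simplified stand-in for the EnrichArgs hunk
// under review. It appends the libbeat HTTP settings and adds the pprof flag
// only when pprofEnabled is true.
func enrichArgs(args []string, endpoint string, pprofEnabled bool) []string {
	out := append([]string{}, args...) // copy so the caller's slice is untouched
	out = append(out,
		"-E", "http.enabled=true",
		"-E", "http.host="+endpoint,
	)
	if pprofEnabled {
		out = append(out, "-E", "http.pprof.enabled=true")
	}
	return out
}

// contains reports whether want appears in args.
func contains(args []string, want string) bool {
	for _, a := range args {
		if a == want {
			return true
		}
	}
	return false
}

func main() {
	// The socket path below is an invented example value.
	fmt.Println(enrichArgs(nil, "unix:///tmp/beat.sock", false))
	fmt.Println(enrichArgs(nil, "unix:///tmp/beat.sock", true))
}
```

Whether the default for such a switch should be on or off is exactly the open question in this thread; the sketch only shows that gating the flag is a one-line condition.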

Contributor

@lykkin lykkin left a comment

lgtm!

@@ -107,6 +107,9 @@ inputs:
 #   logs: false
 #   # enables metrics monitoring
 #   metrics: false
+#   # exposes /debug/pprof/ endpoints
+#   # recommended that these endpoints are only enabled if the monitoring endpoint is set to localhost
+#   pprof: true
Contributor

Why is pprof: true while logs and metrics are false by default?

Contributor Author

I think the reference file here may be incorrect; the default MonitoringConfig has them set to true.
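
The point about defaults can be sketched as follows. The struct below is an illustrative stand-in, not the agent's real MonitoringConfig: the field set, DefaultMonitoringConfig, and applyOverrides are assumptions made for the example. It models logs, metrics, and pprof defaulting to true as stated in this thread, with only explicitly set keys changing anything.

```go
package main

import "fmt"

// MonitoringConfig is an illustrative stand-in for the agent's monitoring
// settings; the field set is an assumption, not the real definition.
type MonitoringConfig struct {
	Logs    bool // agent.monitoring.logs
	Metrics bool // agent.monitoring.metrics
	Pprof   bool // agent.monitoring.pprof
}

// DefaultMonitoringConfig returns the defaults as described in the thread:
// logs, metrics, and pprof are all on unless overridden.
func DefaultMonitoringConfig() MonitoringConfig {
	return MonitoringConfig{Logs: true, Metrics: true, Pprof: true}
}

// applyOverrides copies only the keys a user actually set. Commented-out
// lines in a reference file never reach this map, so they change nothing.
func applyOverrides(cfg MonitoringConfig, set map[string]bool) MonitoringConfig {
	for key, val := range set {
		switch key {
		case "logs":
			cfg.Logs = val
		case "metrics":
			cfg.Metrics = val
		case "pprof":
			cfg.Pprof = val
		}
	}
	return cfg
}

func main() {
	def := DefaultMonitoringConfig()
	fmt.Printf("defaults:       %+v\n", def)
	fmt.Printf("pprof disabled: %+v\n", applyOverrides(def, map[string]bool{"pprof": false}))
}
```

Under this model, a commented-out `logs: false` line in the reference file documents a possible value rather than the effective default, which is why it can disagree with the defaults in code.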

@scunningham scunningham left a comment

Spoke with Michel over Slack. He explained that this control will enable/disable pprof for the agent. The existing beats are a different issue.

@michel-laterman michel-laterman merged commit 6ad6bee into elastic:master Nov 22, 2021
@michel-laterman michel-laterman deleted the agent-enable-pprof branch November 22, 2021 20:59
mergify bot pushed a commit that referenced this pull request Nov 22, 2021
(cherry picked from commit 6ad6bee)
mergify bot pushed a commit that referenced this pull request Nov 22, 2021
(cherry picked from commit 6ad6bee)
michel-laterman added a commit that referenced this pull request Nov 23, 2021
(cherry picked from commit 6ad6bee)

Co-authored-by: Michel Laterman <[email protected]>
michel-laterman added a commit that referenced this pull request Nov 23, 2021
(cherry picked from commit 6ad6bee)

Co-authored-by: Michel Laterman <[email protected]>
@ruflin
Contributor

ruflin commented Nov 23, 2021

@michel-laterman Regarding pprof being enabled in Beats by default: we should likely follow up in 8.0 so that it is not enabled by default when monitoring is enabled, or at least have a discussion. Could you file a GitHub issue?

@simitt
Contributor

simitt commented Nov 23, 2021

On cloud we set elastic-agent.monitoring.http.enabled: true. Does this mean that pprof will always be enabled there?

@michel-laterman
Contributor Author

@simitt, we should disable pprof for cloud

v1v added a commit to v1v/beats that referenced this pull request Nov 24, 2021
…ws-on-file-changes

* upstream/master:
  override host on statsd metricset (elastic#29103)
  Skip config check in autodiscover for duplicated configurations (elastic#29048)
  Change "filebeat.config.modules.enabled" to "true" (elastic#28769)
  Remove deprecated spool queue from Beats (elastic#28869)
  Add `beat` field back to beat.stats (elastic#29094)
  Revert "Move labels and annotations under kubernetes.namespace. (elastic#27917)" (elastic#29069)
  heartbeat: remove w2008 in the CI (elastic#29093)
  Remove deprecated `--template` and `--index-policy` flags (elastic#28870)
  Fix parsing of apache trace log levels (elastic#28717)
  [Elastic-Agent] IUse itnernal port for local fleet server (elastic#28993)
  [Heartbeat] Log error on dupe monitor ID instead of strict req (elastic#29041)
  Enable pprof for elastic-agent and beats (elastic#28983)