
[Elastic Agent] Refactor general Beats settings #20679

Closed
mostlyjason opened this issue Aug 18, 2020 · 8 comments

mostlyjason commented Aug 18, 2020

Today Beats offers several global and general configuration options (see https://www.elastic.co/guide/en/beats/metricbeat/current/configuration-general-options.html). As we move to a single Elastic Agent policy, we want to refactor these settings and decide what makes sense to have at the Agent level, the process level, and the input level. I've combined input from @urso and @ruflin in the description below:

Let's take seccomp as an example: will seccomp be per process, or do we have a global seccomp? seccomp rules must be enforced at the process level. If we create input configurations with different seccomp settings, then the Agent would need to create a process per 'profile'.
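
To make that concrete, here is a hypothetical policy fragment. The per-input placement of seccomp is invented for illustration (it is not an existing schema); only the default_action/syscalls structure mirrors the existing Beats seccomp policy format:

```yaml
# Hypothetical Agent policy fragment -- per-input seccomp is not an existing schema.
# Two inputs requesting different seccomp profiles cannot share a process,
# because seccomp rules are enforced process-wide.
inputs:
  - type: system/metrics
    seccomp:
      default_action: allow      # profile A: permissive
      syscalls:
        - action: errno
          names: [ptrace]
  - type: logfile
    seccomp:
      default_action: errno      # profile B: deny by default
      syscalls:
        - action: allow
          names: [read, write, openat, close, fstat]
```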

Another example for Metricbeat is timeseries.enabled. This seems to be a very input-specific config, but if one metrics input has it on and another has it off, we need to run two processes. Why is this setting enforced at a global level already? The setting just changes how events are reported. Do we really need this, or will an add_fields processor do the trick?

Overall, I think we need a general approach for how we handle these. I think a few, like seccomp, must become a global feature in the Agent. For the others, which are process-specific, we probably need an expert mode where someone can specify these in the config and name the actual process. Something like:

process.metric.timeseries.enabled: true, where process.metric can be mapped to either metricbeat or the collector in the future. This will allow us to migrate the backend to the collector without breaking changes.
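
A minimal sketch of how such an expert-mode setting could look in a policy, assuming a process.<name> namespace that the Agent resolves to the concrete binary; the keys below are illustrative only and do not exist today:

```yaml
# Illustrative only -- a process.* expert-mode namespace does not exist yet.
# "metric" is a logical process name the Agent would map to metricbeat today
# and to the collector in the future.
process:
  metric:
    timeseries.enabled: true
```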

Besides global settings we have 'internal' behavior of Beats that we might want to expose as settings. E.g. Filebeat sets up infinite retry for its inputs by default. Inputs can actually override this, but we don't expose any setting to do so. Creating a PR to introduce a per-input setting like send_mode: guaranteed/retry would be quite straightforward for Filebeat. Are we interested in a setting like this for Metricbeat as well?
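
For illustration, such a per-input setting could look like this in a Filebeat config; send_mode and its values are only the suggestion above, not an implemented option, and the exact semantics of each value would still need to be defined:

```yaml
# Sketch of the suggested per-input setting -- send_mode is not implemented.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log
    send_mode: guaranteed   # keep retrying until every event is acknowledged (today's default behavior)
  - type: log
    paths:
      - /var/log/noisy/*.log
    send_mode: retry        # hypothetical bounded-retry mode; events may eventually be dropped
```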

Suggestions for global settings we have for all Beats:

  • tags/fields/fields_under_root: Let's remove them. Same goes for the per-input fields setting. We already have the add_fields processor, which, when used directly, also makes it obvious where the custom fields are added (see the example after this list).
  • name: I'd prefer to remove it. On the other hand it is used to configure 'host.name'. But in general it would be better to use a generated ID (generated by Fleet) to uniquely identify the agent the event has been collected from.
  • processors: I'm in favor of removing global processor settings. The drawback would be that we'd need to add the add_x_metadata processors per input then (which can be an advantage, as not all inputs/integrations want these to be enabled by default). Currently one reason to have them global is also efficiency (multiple instances would mean multiple lookups). But this is more or less pointing out a problem in our architecture and should not be solved at the configuration-file level.
  • max_procs: This we should keep. Maybe even set it to 1 or 2 by default. For the collector I'm thinking of introducing a 'limits' namespace for global settings that configure resource limitations (a sketch follows this list). Some possible settings we would add to the 'limits' namespace are (better names are welcome):
    • "memory":Configure limit of MemoryCircuiteBreaker, that blocks inputs once enqueued events have reached an estimated memory usage
    • "harvesters" (not sure about a good name or how broad we want to apply the setting): Log file collection is "dynamic" and itself discovers files to collect. In order to protect resource usage we need a global limit that is applied to all active log inputs (especially regarding auto-discovery for docker/k8s)
    • scheduler: Heartbeat has a task scheduler that can limit the number of concurrent active tasks. Long term it would be nice to limit concurrent Metricbeat tasks as well. This is also where the start_delay setting might be moved to.
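
To make the idea concrete, here is a sketch of what such a 'limits' namespace could look like; every key name and value is a placeholder, not an agreed-on schema:

```yaml
# Placeholder names only -- the `limits` namespace is a proposal, not an existing schema.
limits:
  max_procs: 2                 # GOMAXPROCS-style cap per process
  memory: 200MiB               # memory circuit breaker: block inputs above this estimated usage
  harvesters: 100              # global cap on concurrently active log inputs
  scheduler:
    max_concurrent_tasks: 16   # cap on concurrently running tasks (Heartbeat/Metricbeat)
    start_delay: 5s            # where the existing start_delay setting might move
```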

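And, as referenced in the first bullet, replacing the global tags/fields/fields_under_root options with the existing add_fields processor on the input that needs the extra fields could look like this (standard Beats processor syntax; the field values are made up):

```yaml
# Instead of global tags/fields/fields_under_root, attach fields where needed
# using the existing add_fields processor.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log
    processors:
      - add_fields:
          target: ""           # empty target puts the fields at the event root
          fields:
            env: production
            team: payments
```
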
For the others, which are process-specific, we probably need an expert mode where someone can specify these in the config and name the actual process, something like process.metricbeat.timeseries.enabled: true. I'd prefer to introduce an expert mode like this. On the other hand, I'm not sure where/how to apply resource limit settings. Interestingly, our output settings define/translate to named 'pipelines' internally. Although the namespace is output, to me it feels quite natural to apply the limit settings per pipeline.
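
If limits were attached to those internal pipelines, the per-output configuration might look roughly like this; the pipeline.limits block is purely illustrative:

```yaml
# Purely illustrative -- per-pipeline limits are only an idea at this point.
outputs:
  default:
    type: elasticsearch
    hosts: ["https://localhost:9200"]
    pipeline:
      limits:
        memory: 100MiB
        max_in_flight_events: 2048
```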

@mostlyjason added the Team:Services (Deprecated) and Team:Ingest Management labels Aug 18, 2020
@elasticmachine

Pinging @elastic/ingest-management (Team:Ingest Management)

@elasticmachine

Pinging @elastic/integrations-services (Team:Services)

@mostlyjason changed the title from "[Elastic Agent] Refactor global and general Beats settings" to "[Elastic Agent] Refactor general Beats settings" Aug 18, 2020
@ruflin unassigned urso and ruflin Aug 19, 2020
@urso

urso commented Aug 19, 2020

process.metricbeat.timeseries.enabled: true

Why do we need this to be a global setting? Why is it a global setting in metricbeat? Is the future to have it always enabled?

@ruflin self-assigned this Sep 18, 2020
@ruflin

ruflin commented Sep 22, 2020

I've been thinking about this again and I'm hesitant to implement a generic way to support the global configs. I think we should tackle them one by one when we have a need/request for a certain feature, and if we see a pattern after 2-3 changes, we should find a generic solution instead of doing it upfront.

I went through all the existing config options again and I think there are cases where we will even come up with new solutions. Variables in the config and the keystore are two good examples here. With the "dynamic input" change, both of these are likely to be supported in the future.

So my proposal is to close this issue and open specific issues for configs that are not supported but need to be, so we can discuss them one by one.

@ph

ph commented Sep 22, 2020

@ruflin I agree with that.

So +1 to close it and let it be driven by user requests/needs. This also reduces our cognitive load on the way to MVP/GA.

@ruflin

ruflin commented Sep 23, 2020

@exekias @andresrc WDYT? Closing ok for you?

@exekias

exekias commented Sep 23, 2020

SGTM

@andresrc

+1
