[Elastic Agent] Shared Packetbeat Sniffer Configuration #22227

Closed
andrewstucki opened this issue Oct 28, 2020 · 14 comments

@andrewstucki

While migrating Packetbeat to be managed by Agent, we came across a bit of a snag in our current Packetbeat architecture. Packetbeat has a series of protocols/flows that can be modeled as inputs; however, it also has a shared sniffer implementation. The single shared sniffer dispatches packets to any registered protocol/flow handlers, which then process and publish them. All of that configuration currently lives under packetbeat.interfaces (e.g. packetbeat.interfaces.device).
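
To make the shape of the problem concrete, here is a rough sketch of a single shared sniffer fanning packets out to registered handlers. It is not Packetbeat's actual code: `SharedSniffer`, `register`, `dispatch`, and the dict-based packet are hypothetical names used only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Packet = dict  # hypothetical stand-in for a decoded packet


class SharedSniffer:
    """One capture source that fans packets out to registered protocol handlers."""

    def __init__(self, device: str) -> None:
        # The device is singleton configuration shared by every handler.
        self.device = device
        self._handlers: Dict[str, List[Callable[[Packet], None]]] = defaultdict(list)

    def register(self, protocol: str, handler: Callable[[Packet], None]) -> None:
        self._handlers[protocol].append(handler)

    def dispatch(self, packet: Packet) -> int:
        """Hand one packet to the handlers for its protocol; return how many ran."""
        handlers = self._handlers.get(packet["protocol"], [])
        for handler in handlers:
            handler(packet)
        return len(handlers)


published = []
sniffer = SharedSniffer(device="any")
sniffer.register("icmp", lambda p: published.append(("icmp", p["src"])))
sniffer.register("http", lambda p: published.append(("http", p["src"])))

sniffer.dispatch({"protocol": "icmp", "src": "10.0.0.1"})
sniffer.dispatch({"protocol": "dns", "src": "10.0.0.2"})  # no handler registered
```

The point is that `device` belongs to the sniffer, not to any one handler, which is why it does not map cleanly onto per-protocol inputs.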

Ideally we'd have a mechanism to pass this sort of singleton configuration to packetbeat, but right now we're pretty much limited to whatever comes in inputs.

This is more or less an extension of some of the conversation in #20679 and arguably falls under exactly what @ruflin advocated:

So my proposal is to close this issue and open specific issues for configs that are not supported and need to support so we can discuss one by one.

@botelastic botelastic bot added the needs_team label Oct 28, 2020
@andrewstucki andrewstucki added the Team:Ingest Management label and removed the needs_team label Oct 28, 2020
@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@elasticmachine
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@ruflin
Contributor

ruflin commented Oct 28, 2020

@blakerouse @michalpristas Would be great to get your thoughts on the above on how we could do this. I wonder how we should think of it: Config for a specific process? Config for a list of inputs? Global configs we just ship everywhere? Other?

@andrewstucki
Author

I was thinking of an alternative approach that might get us what we need, and would like to hear thoughts about it. If, instead of approaching each protocol as an input, we approach it as a stream, we could potentially make this work without global configuration. Here's the rough idea. The policy handed from Kibana would look like:

inputs:
  - type: packet
    # other sniffer configuration here
    streams:
      - type: flow
        timeout: 10s
        period: 10s
        keep_null: false
        data_stream:
          dataset: packet.flow
          type: logs
      - type: icmp
        data_stream:
          dataset: packet.icmp
          type: logs

The above would allow the packetbeat spec configuration to pretty much just look like this:

# artifact stuff up here
rules:
  - filter_values:
      selector: inputs
      key: type
      values:
        - packet

  - inject_agent_info: {}

  - filter:
      selectors:
        - inputs
        - output
# when clause down here

Packetbeat would then either have to enforce inputs only ever being a single element long (to use the current behavior of having a single global sniffer) or get extended to potentially run multiple sniffers under different configurations.

The one downside to the single element inputs enforcement is that we'd tie the Packetbeat integration to a single package (maybe network packet package?).

The downside to the multiple sniffers is that it'll take a bit longer to implement and require some research (not sure how it'll affect system resources or what sort of multi-sniffer setups are actually supported by libpcap).

If we want to go the multiple sniffer route, but are ok punting on it for now, we could always do the single element enforcement for now and extend Packetbeat in the future.
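
As a rough illustration of the single-element enforcement, the check could look something like the sketch below. The policy is modeled as a plain dict mirroring the YAML above, and `require_single_packet_input` is a hypothetical helper name, not an existing Agent or Packetbeat function.

```python
def require_single_packet_input(policy: dict) -> dict:
    """Return the only packet input, or raise if there is not exactly one."""
    packet_inputs = [i for i in policy.get("inputs", []) if i.get("type") == "packet"]
    if len(packet_inputs) != 1:
        raise ValueError(
            f"expected exactly one packet input, got {len(packet_inputs)}"
        )
    return packet_inputs[0]


policy = {
    "inputs": [
        {
            "type": "packet",
            "streams": [
                {"type": "flow", "timeout": "10s", "period": "10s"},
                {"type": "icmp"},
            ],
        }
    ]
}

single = require_single_packet_input(policy)
enabled = [stream["type"] for stream in single["streams"]]
```

Extending Packetbeat to multiple sniffers later would mean relaxing this check rather than changing the policy shape.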

Thoughts? CC: @andrewkroh

@blakerouse
Contributor

@ruflin I prefer @andrewstucki's latest approach, but I don't know whether being limited on the integration packages side is an issue.

Would it maybe be possible for Agent to just pass multiple inputs down to Packetbeat and allow Packetbeat to build the shared sniffer from those values? This would basically be Packetbeat understanding a top-level inputs from Agent, just like we did with Heartbeat.

@andrewstucki
Author

@blakerouse so I don't think we're actually limited anymore, I rewrote the packet sniffer management code so that we can run up to 100 sniffers in a single packetbeat instance. What that means essentially is that you can have up to 100 integrations enabled in a policy that each use the packet type input (not that you'd want to run 100 sniffers on a single system). Each package can just then specify the enabled dissectors as a stream.
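
Roughly, the mapping is one sniffer per packet input, with the streams naming the dissectors to enable. The sketch below only illustrates that mapping on a dict-shaped policy; `plan_sniffers` and `MAX_SNIFFERS` are hypothetical names, and only the limit of 100 comes from the rewrite described above.

```python
MAX_SNIFFERS = 100  # upper bound mentioned above; the constant name is made up


def plan_sniffers(policy: dict) -> list:
    """Return one list of enabled dissector types per packet input."""
    packet_inputs = [i for i in policy.get("inputs", []) if i.get("type") == "packet"]
    if len(packet_inputs) > MAX_SNIFFERS:
        raise ValueError(f"at most {MAX_SNIFFERS} packet inputs are supported")
    return [[s["type"] for s in inp.get("streams", [])] for inp in packet_inputs]


policy = {
    "inputs": [
        {"type": "packet", "streams": [{"type": "flow"}, {"type": "icmp"}]},
        {"type": "packet", "streams": [{"type": "http"}]},
        {"type": "logfile", "streams": []},  # ignored: not a packet input
    ]
}

plan = plan_sniffers(policy)
```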

That said, I'm still looking for a way of passing in input-level configuration i.e. I don't believe our current spec handles the

# other sniffer configuration here

of the above. So it would still be good to carry on this conversation.

@ruflin
Contributor

ruflin commented Oct 30, 2020

There is another ongoing discussion with @urso on how inputs should look, and it goes more in the direction of "everything should be an input".

I'm personally leaning in the direction of having 1 input per protocol, as I think it is also more intuitive to configure. It seems, based on your comment above, that Packetbeat would be able to consolidate the following config to run only 2 sniffers (see the different sniffer configs)? I heavily simplified the config and it might not be correct; it is just for example purposes:

inputs:
  - type: network/tcp
    port: 666
    sniffer_config:
      foo: bar
  - type: network/http
    port: 80
    sniffer_config:
      foo: bar
  - type: network/icmp
    sniffer_config:
      foo: no_bar

The above also potentially allows an nginx package to offer an option to sniff traffic on port 80 and then just ship it down as an additional input. Packetbeat will figure out whether additional sniffers are needed or not.

For limitations on the packages side: let's ignore them for now, we can fix that.
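
One way Packetbeat could collapse such inputs is to group them by identical sniffer settings. The following is only a sketch of that idea on the simplified example config; `group_by_sniffer_config` is a hypothetical helper and `sniffer_config` is the placeholder key from the example, not a real Packetbeat setting.

```python
import json


def group_by_sniffer_config(inputs: list) -> list:
    """Group input types by identical sniffer_config; one group means one sniffer."""
    groups = {}
    for inp in inputs:
        # Serialize with sorted keys so equal configs produce equal dict keys.
        key = json.dumps(inp.get("sniffer_config", {}), sort_keys=True)
        groups.setdefault(key, []).append(inp["type"])
    return [(json.loads(key), types) for key, types in groups.items()]


inputs = [
    {"type": "network/tcp", "port": 666, "sniffer_config": {"foo": "bar"}},
    {"type": "network/http", "port": 80, "sniffer_config": {"foo": "bar"}},
    {"type": "network/icmp", "sniffer_config": {"foo": "no_bar"}},
]

sniffers = group_by_sniffer_config(inputs)
```

With the three inputs above this yields two groups, so only two sniffers would be needed.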

@urso

urso commented Nov 9, 2020

The discussion mentioned is here: https://docs.google.com/document/d/1g1yBQ4W0nLNmc7rdU_oApeqk8-O-7OJMIibwaijM-HY/edit#heading=h.969mszu50xv1

According to "Model2" in the discussion the configuration would become:

inputs:
- id: ...
  defaults:
    interface.devices: [...]
    processors:
    - add_fields: ...
  streams:
  - data_stream.dataset: packet.flow
  - data_stream.dataset: packet.http
  - ...

The defaults are a shared set of settings, but streams would be allowed to override them. Packetbeat would need to create a "sniffer" for each device found. Individual analyzers/streams are configured within the sniffer per stream that matches.
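
A minimal sketch of the defaults-plus-override behaviour, assuming a shallow merge where stream keys win over input-level defaults. The merge semantics, the `effective_streams` helper, and the device names `eth0` and `lo` are assumptions for illustration; the design document may specify something different.

```python
def effective_streams(inp: dict) -> list:
    """Apply input-level defaults to every stream; stream settings take precedence."""
    defaults = inp.get("defaults", {})
    # Shallow merge: later keys (from the stream) overwrite earlier ones (defaults).
    return [{**defaults, **stream} for stream in inp.get("streams", [])]


inp = {
    "id": "example",
    "defaults": {"interface.devices": ["eth0"]},
    "streams": [
        {"data_stream.dataset": "packet.flow"},
        {"data_stream.dataset": "packet.http", "interface.devices": ["lo"]},
    ],
}

streams = effective_streams(inp)
# Devices that need a sniffer once overrides are applied.
devices = sorted({d for s in streams for d in s["interface.devices"]})
```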

@andrewstucki
Author

@urso that would make sense. It is pretty similar to what's already here, with the addition of overriding individual device-level settings per stream. I like the added flexibility as well.

It's a bit funky in the sense that the configuration presented to packetbeat is grouped by package/integration and in the configuration munging code we'd need to regroup inputs logically based off of device-level configuration, but I agree that it keeps the general semantics of inputs = "sniffer"/"device", streams = "protocol" fairly in place.
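
To illustrate the regrouping, here is a rough sketch that takes inputs grouped by package and re-keys their streams by device. The input ids `pkg-a` and `pkg-b`, the device names, and the `regroup_by_device` helper are all made up for the example; the flattened `interface.devices` and `data_stream.dataset` keys follow the config sketched above.

```python
from collections import defaultdict


def regroup_by_device(inputs: list) -> dict:
    """Map each device to the datasets of all streams that should run on it."""
    by_device = defaultdict(list)
    for inp in inputs:
        default_devices = inp.get("defaults", {}).get("interface.devices", [])
        for stream in inp.get("streams", []):
            # A stream-level device list replaces the input-level default.
            for device in stream.get("interface.devices", default_devices):
                by_device[device].append(stream["data_stream.dataset"])
    return dict(by_device)


inputs = [
    {
        "id": "pkg-a",
        "defaults": {"interface.devices": ["eth0"]},
        "streams": [{"data_stream.dataset": "packet.flow"}],
    },
    {
        "id": "pkg-b",
        "defaults": {"interface.devices": ["eth0"]},
        "streams": [
            {"data_stream.dataset": "packet.http"},
            {"data_stream.dataset": "packet.icmp", "interface.devices": ["lo"]},
        ],
    },
]

sniffers = regroup_by_device(inputs)
```

Here two packages end up sharing one sniffer on `eth0`, while the overridden stream gets its own on `lo`.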

@andrewstucki
Author

Met with @andrewkroh and here's a summary of how we're going to approach this:

  1. We're going to keep the structure of:
inputs:
  - type: packet
    streams:
      - type: flow
  2. On the Packetbeat side, we'll allow streams to specify their own sniffer settings. Initially we'll just couple these together so that we maintain a 1-1 between inputs and sniffer instances.
  3. Eventually we'll decouple the streams and allow each input to correspond to n sniffers based off of the sniffer configurations.
  4. Ideally we'd still go ahead and do what @urso was suggesting and have the ability to have defaults at the input level.

This will essentially allow us to move forward with the spec from #22145 and then add the sniffer configuration code to Packetbeat.

For additional details about indexing strategy/package implementation, I think maybe we can have that conversation in #21356?

@ruflin
Contributor

ruflin commented Nov 16, 2020

@andrewstucki Could you share a bit of background on the pros / cons and how you reached the conclusion to go with 1? This is so we can get back to this thread and have a place to look up not just the decision.

@andrewstucki
Author

PROS

  1. It fits the Packetbeat "1 sniffer --> multiple dissectors" architecture much better by logically grouping the dependent configuration. It also makes sense from a performance standpoint when enabling multiple protocol dissectors: with a sniffer per dissector, every packet would otherwise have to be processed n times on a system
  2. It makes sense from a package-implementation context because each package can still specify its own packet-type input and spin up a new sniffer
  3. It keeps the bulk of the configuration (the protocol/flow enablement) compatible with where it resides in packages -- at the stream level
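
The performance point in item 1 is simple arithmetic, sketched below. This is a back-of-the-envelope model, not a measurement: `capture_side_reads` is a hypothetical helper, and it assumes every sniffer sees every packet on the device.

```python
def capture_side_reads(num_packets: int, num_dissectors: int, shared_sniffer: bool) -> int:
    """Count how many times packets are pulled off the wire in total."""
    # A shared sniffer reads each packet once; one sniffer per dissector reads it n times.
    sniffers = 1 if shared_sniffer else num_dissectors
    return num_packets * sniffers


shared = capture_side_reads(1000, 3, shared_sniffer=True)
separate = capture_side_reads(1000, 3, shared_sniffer=False)
```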

CONS

  1. It makes the implementation of independent sniffer configurations within the same package a bit more difficult due to moving towards point number 3 in the above comment (decoupling the streams)
  2. Specifically, it precludes you from specifying a bunch of different packet/* input types to intentionally segment multiple packetbeat sniffers per package. While I'm not sure why you'd ever need this, having a single packet input means we're leaving a lot more logic to packetbeat to just "do the right thing" rather than having a package maintainer say -- "create 3 sniffers for my 3 different streams"

@ruflin
Contributor

ruflin commented Nov 18, 2020

Thanks for the details. In summary from an Elastic Agent perspective, packetbeat is just a single input.

@botelastic

botelastic bot commented Oct 27, 2022

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Oct 27, 2022
@botelastic botelastic bot closed this as completed Apr 25, 2023