
Possible Enhancements for rate_limit processor #23276

Closed · 3 of 4 tasks
andresrc opened this issue Dec 24, 2020 · 8 comments

Labels: Stalled, Team:Platforms (label for the Integrations - Platforms team)
@andresrc (Contributor) commented Dec 24, 2020

Describe the enhancement:

Some possible future improvements:

  • Throttle if rate > x.
  • Drop if rate > y.
  • Some way of knowing that messages have been dropped, e.g. a message in the logs saying the rate limit has been reached. I can see us debugging why some logs are missing, not realizing they hit the rate limit. (#22883)
  • A metric of how many events have been dropped due to the rate limit that can be seen in stack monitoring, e.g. to find unexpected drops or to know when to adjust limits. (#23330)

Describe a specific use case for the enhancement or feature:

More visibility into what the rate_limit processor is doing.

@andresrc added the Team:Platforms label on Dec 24, 2020
@elasticmachine (Collaborator)

Pinging @elastic/integrations-platforms (Team:Platforms)

@ycombinator (Contributor) commented Dec 28, 2020

> Some way of knowing that messages have been dropped. E.g. a message in the logs saying rate limit has been reached. I can see us debugging why some logs are missing not realizing they hit the rate limit.

The processor will log a debug-level message in the Beat's log whenever an event is dropped.

p.logger.Debugf("event [%v] dropped by rate_limit processor", event)
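For illustration, here is a minimal sketch of where that log line sits, assuming the standard libbeat processor contract in which returning a nil event drops it from the pipeline. The struct, field names, and the use of golang.org/x/time/rate here are illustrative stand-ins, not the actual rate_limit implementation:

```go
package ratelimit

import (
	"github.com/elastic/beats/v7/libbeat/beat"
	"github.com/elastic/beats/v7/libbeat/logp"
	"golang.org/x/time/rate"
)

// rateLimit is an illustrative stand-in for the real processor struct;
// the shipped processor has its own configuration and algorithm.
type rateLimit struct {
	limiter *rate.Limiter
	logger  *logp.Logger
}

// Run follows the libbeat processor contract: returning (nil, nil)
// silently discards the event, so the debug message below is the only
// trace left in the Beat's own log.
func (p *rateLimit) Run(event *beat.Event) (*beat.Event, error) {
	if !p.limiter.Allow() {
		p.logger.Debugf("event [%v] dropped by rate_limit processor", event)
		return nil, nil
	}
	return event, nil
}
```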

@matschaffer (Contributor) commented Jan 12, 2021

So we'd probably have to have something ingesting the filebeat log itself then (and be ingesting debug events).

Is there a plan to emit some sort of stack monitoring counter? Seems like that'd be easier to spot when it happens (via alert on monitoring cluster).

For context, I'd still treat throttling as a critical event to take care of (figure out what's wrong with the service that's causing the log spam, fix it in an upcoming release).

But throttling in place would help avoid killing elasticsearch and having to clean up that problem in addition to the spammy logger.

@ycombinator (Contributor)

> Is there a plan to emit some sort of stack monitoring counter?

Yes, this is exactly the plan: #23330 (comment). I have a PR open that I'm hoping to resume work on tomorrow. Don't pay attention to the PR title and description as it stands now; I will update those once I update the PR.
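For background, libbeat exposes a monitoring registry that such a counter can hang off, making it visible alongside the Beat's other internal metrics. A hedged sketch of the general pattern; the metric name, registry placement, and helper are assumptions for illustration, not necessarily what #23330 shipped:

```go
package ratelimit

import "github.com/elastic/beats/v7/libbeat/monitoring"

// dropped is a hypothetical counter; the metric name and use of the
// default registry are assumptions, not the merged implementation.
var dropped = monitoring.NewInt(monitoring.Default, "processor.rate_limit.dropped")

// recordDrop would be called whenever the processor discards an event,
// making the count visible to stack monitoring (e.g. for alerting on
// unexpected drops from a monitoring cluster).
func recordDrop() {
	dropped.Inc()
}
```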

@matschaffer (Contributor)

Awesome, thanks!

@masci commented Jan 26, 2021

@ycombinator I see the PR got merged; what's the status of the issue?

@ycombinator (Contributor)

I think this issue is still okay to keep open. There is one improvement mentioned in the description that is not yet implemented, viz.:

> Throttle if rate > x, drop if rate > y.

We currently implement the latter (drop) but not yet the former (throttle). Throttling might require larger changes to the event processing pipeline in libbeat; the sketch below illustrates the conceptual difference.
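To make the distinction concrete, here is a minimal, self-contained sketch using Go's golang.org/x/time/rate package: Allow discards an event when no capacity is left (what the processor does today), while Wait blocks the producer until capacity frees up (what throttling would mean). This is purely conceptual and not how libbeat's pipeline is structured:

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/time/rate"
)

func main() {
	// Allow 5 events per second with a burst of 5.
	limiter := rate.NewLimiter(rate.Limit(5), 5)

	// Drop semantics: discard the event if no token is available.
	for i := 0; i < 20; i++ {
		if !limiter.Allow() {
			fmt.Printf("event %d dropped\n", i)
			continue
		}
		fmt.Printf("event %d published\n", i)
	}

	// Throttle semantics: block until a token is available, pacing
	// the producer instead of losing data.
	for i := 20; i < 25; i++ {
		if err := limiter.Wait(context.Background()); err != nil {
			return
		}
		fmt.Printf("event %d published after waiting\n", i)
	}
}
```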

The other two improvements mentioned in the description have been implemented:

> Some way of knowing that messages have been dropped. E.g. a message in the logs saying rate limit has been reached. I can see us debugging why some logs are missing not realizing they hit the rate limit.

This was done as part of the initial PR implementing the rate_limit processor: #22883

> Metric of how many events have been dropped due to rate limit that can be seen in stack monitoring (e.g. to find unexpected drops, or know when to adjust limits)

This was done in a follow-up PR: #23330

botelastic bot commented Jan 27, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

botelastic bot added the Stalled label on Jan 27, 2022
@masci closed this as completed on Jan 28, 2022