Filebeat: Cloud Foundry input dropping logs #18202
Pinging @elastic/integrations (Team:Integrations)
Pinging @elastic/integrations-platforms (Team:Platforms)
Switched to running it locally without any cloudfoundry processor. Ran Filebeat for 1 hour, which should have resulted in 1440 events (as it is 2 log messages every 5 seconds), but it only produced 1417 events, 23 events short. Looking at the metrics output from Filebeat, on average 12 events are published per 30-second window, which is the correct amount, but in some 30-second windows only 10 or 11 were published. This is not a windowing artifact, as neither the previous nor the next window was greater than 12. Adding the missing events back into those windows equals the exact number of events that should have been captured.
Example windows observed: a 10-event window (2 missing) and an 11-event window (1 missing).
Below is the window metric data:
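The window analysis described above can be expressed as a small check. This is an illustrative helper, not part of Filebeat: given per-window event counts and the expected steady rate (12 events per 30 s window for this app), it totals the deficit and looks for a surplus in a neighbouring window, which would indicate events merely shifted across a window boundary rather than being dropped:

```go
package main

import "fmt"

// missingPerWindow compares observed per-window counts against the expected
// steady rate. It returns the total deficit and whether any deficient window
// has a neighbour with a surplus (which would suggest boundary shifting,
// not dropped events).
func missingPerWindow(counts []int, expected int) (missing int, shifted bool) {
	for i, c := range counts {
		if c < expected {
			missing += expected - c
			if (i > 0 && counts[i-1] > expected) ||
				(i+1 < len(counts) && counts[i+1] > expected) {
				shifted = true
			}
		}
	}
	return missing, shifted
}

func main() {
	// Illustrative counts, assuming 2 events every 5 s => 12 per 30 s window.
	counts := []int{12, 12, 10, 12, 11, 12}
	missing, shifted := missingPerWindow(counts, 12)
	fmt.Printf("missing=%d shifted=%v\n", missing, shifted)
}
```

In the report above, no neighbouring window ever exceeded 12, which is why the deficit was attributed to drops rather than window alignment.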
I was able to reproduce this issue from PCF without using Beats at all. I created https://github.com/blakerouse/pcf-log-to-file and determined that just using go-loggregator to write to a file shows the same missing events. So the issue is not caused by our implementation or by anything inside libbeat; it is coming from the source when using go-loggregator. The bug report mentions that firehose-to-syslog doesn't have this issue; that is because the released version of that firehose uses the v1 API and not the v2 API. If you compile firehose-to-syslog from master, it shows the same behavior that Filebeat shows. There is currently a bug filed with PCF to see if they can determine the issue on their side.
Description: The Cloud Foundry input is dropping logs in a repeatable, reproducible manner.
Comparing systems over 24 hours yields these results for a sample "ticker" app that produces a controlled cadence of log events:
syslog: 34,558 of 34,560 = 99.99421%
firehose-to-syslog: 34,557 of 34,560 = 99.99131%
Filebeat: 34,319 of 34,560 = 99.30266%
Filebeat needs to be as good as or better than firehose-to-syslog, i.e. at least four "9s".
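The delivery figures above follow from simple arithmetic. A minimal sketch with a hypothetical `deliveryNines` helper (not from any of the projects mentioned) that computes the delivery ratio and the number of leading nines in it:

```go
package main

import (
	"fmt"
	"math"
)

// deliveryNines returns the delivery ratio and how many leading "9"s that
// ratio has, e.g. 0.99994... -> 4 nines, 0.9930... -> 2 nines.
func deliveryNines(delivered, expected int) (ratio float64, nines int) {
	ratio = float64(delivered) / float64(expected)
	if delivered < expected {
		nines = int(math.Floor(-math.Log10(1 - ratio)))
	}
	return ratio, nines
}

func main() {
	for _, tc := range []struct {
		name      string
		delivered int
	}{
		{"syslog", 34558},
		{"firehose-to-syslog", 34557},
		{"filebeat", 34319},
	} {
		r, n := deliveryNines(tc.delivered, 34560)
		fmt.Printf("%s: %.5f%% (%d nines)\n", tc.name, r*100, n)
	}
}
```

By this measure syslog and firehose-to-syslog both reach four nines, while Filebeat only reaches two, which is the gap the issue asks to close.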
Steps to reproduce:
Deploy app that emits a regular cadence of logs such as
https://github.com/bvader/scheduler
Deploy filebeat
Deploy firehose-to-syslog from:
https://github.com/cloudfoundry-community/firehose-to-syslog
Run and observe:
The black line is the correct/accurate number of logs via firehose-to-syslog; the green line shows the dropped logs via Filebeat.
Example in detail: scheduler-5s creates 2 logs every 5 s = 24 events/min.
6 hours = 8,640 events, which can be observed below.
However, with Filebeat only 8,582 events are captured during the exact same time frame.
NOTE: This same defect likely applies to Metricbeat.