Describe the bug
We run multiple Kubernetes pods on the same node. Under low traffic there is generally no problem, but under high load, when we generate a 128 MB log file every 2 minutes, fluentd starts to skip logs and there are gaps in ingestion. I tracked the events around a gap: the last event before the gap and the first event after it correspond to the last line of one log file and the first line of the next log file that gets picked up, and all the log files in between are skipped. We have plenty of CPU available; the ruby process is only at about 50%.
We've tried almost all the settings described in other related threads, such as the watcher settings and rotate wait, but nothing seems to help; a sketch of the kind of settings we tried is below.
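Roughly, the in_tail settings we experimented with looked like the sketch below. This is not our actual configuration; the path, pos_file, and tag are illustrative placeholders, and the values are just examples of what we tuned.

```
<source>
  @type tail
  # path, pos_file and tag are placeholders, not our real values
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  # settings we tuned while chasing the rotation gaps:
  # keep rotated files watched longer
  rotate_wait 30
  # discover newly rotated files sooner
  refresh_interval 5
  # fall back to the timer-based watcher instead of inotify
  enable_stat_watcher false
  <parse>
    @type json
  </parse>
</source>
```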
To Reproduce
Run 2 similar pods on an 8-core node and drive enough load to generate a 128 MB log file every 2 minutes. You will see POD A's logs being ingested but no logs from POD B, then after a few minutes POD B's logs are ingested but no logs from POD A, and this behavior continues.
***** The only thing that worked was hard-coding individual containers in multiple tail input sections using multiple workers (see the sketch below). This is not a viable solution because container ids are not known ahead of time.
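For illustration, the workaround looked roughly like the sketch below. CONTAINER_ID_A / CONTAINER_ID_B and the pod names are placeholders; the real ids change every time a container is recreated, which is why this does not scale.

```
<system>
  workers 2
</system>

<worker 0>
  <source>
    @type tail
    # hard-coded container id (placeholder)
    path /var/log/containers/pod-a_*_CONTAINER_ID_A*.log
    pos_file /var/log/fluentd-pod-a.log.pos
    tag kubernetes.pod-a
    read_from_head true
    <parse>
      @type json
    </parse>
  </source>
</worker>

<worker 1>
  <source>
    @type tail
    # hard-coded container id (placeholder)
    path /var/log/containers/pod-b_*_CONTAINER_ID_B*.log
    pos_file /var/log/fluentd-pod-b.log.pos
    tag kubernetes.pod-b
    read_from_head true
    <parse>
      @type json
    </parse>
  </source>
</worker>
```

Pinning each container to its own worker keeps each tail watching a small, fixed set of files, but it only works if the container ids are known up front, which they are not in our case.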
Expected behavior
All the logs are forwarded without skipping them.
Your Environment
Fluentd or td-agent version: fluentd 1.4
Operating system: Amazon Linux 2
Your Configuration
Your Error Log
No errors are noted, and no warnings either.
Additional context