td-agent memory usage gradually creeping upwards #1414
Does it mean out_secure_forward can't flush its buffer to the destination, and that this causes the growing memory usage? And how about the dentry cache?
I have confirmed that force flushing the buffer (sending the flush signal to the process) does not reduce the memory usage.
Can you clarify what you mean by this? This is precisely the problem: memory usage increases gradually over the course of several days, to the point of consuming most of the memory on the server.
How do I check this? Thanks for your quick response!
When did you flush the buffer?
Your secure_forward setting uses a memory buffer. If secure_forward can't flush its buffer chunks, memory usage keeps growing, unlike with a file buffer.
Googling it is faster than my comment ;)
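For illustration, a minimal sketch of what the suggestion above could look like for a secure_forward match with v0.12-style buffer parameters; hostnames, keys, and paths are placeholders, not the configuration from this issue:

```
<match app.**>
  @type secure_forward
  self_hostname client.example.com
  shared_key some_shared_secret
  # Depending on the plugin version, `secure` and certificate options
  # are also required here.

  # Use a file buffer instead of the default memory buffer, so that
  # chunks which cannot be shipped accumulate on disk rather than in
  # the Ruby heap.
  buffer_type file
  buffer_path /var/log/td-agent/buffer/secure_forward

  <server>
    host aggregator.example.com
    port 24284
  </server>
</match>
```

With a file buffer, an unreachable destination (or a stuck SSL session) fills `buffer_path` on disk instead of growing the process's resident memory.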
I have just force flushed the buffer: no error was thrown. Memory usage is at ~2.17 GB.
The error is still appearing in the logs, so that then prompts another question: what is usually the cause of it? I guess this would probably be more of an issue for https://github.com/tagomoris/fluent-plugin-secure-forward. I just want to confirm that the root cause of the issue is secure_forward before taking it there. Thanks again!
That is a very hard question. Checking the secure-forward plugin issues would be better.
Ok great, thanks again.
BTW, if you want to reduce Ruby's memory usage itself, you can set the Ruby GC tuning environment variables.
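As one example of such a setting (an assumption here, not necessarily the variable that was suggested), Ruby's generational GC can be tuned through environment variables passed to the td-agent process. A sketch, assuming an init script that sources /etc/default/td-agent:

```
# /etc/default/td-agent (assumed location; adjust for however td-agent is
# launched on your system; a systemd unit would need an Environment= line).
#
# Grow the old generation more slowly so the resident heap stays smaller,
# at the cost of more frequent major GC runs. The variable below is a
# commonly cited example, not necessarily the exact setting meant above.
export RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=0.9
```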
Thanks!
This is recurring, and I have disabled secure-forward (we now use the elasticsearch output plugin with no SSL). Furthermore, the logs no longer contain any of the errors mentioned above. I am restarting with verbose logging; the issue is reproducible and will almost certainly reoccur, so please let me know how to proceed and which logs and diagnostic info to provide. If possible, can we move this to a more private channel, or can I email you the full logs? Some may potentially contain sensitive information, and I would like to provide you with a full set of logs instead of the excerpts I was able to post above. Thanks again.
The new config, and from the logs:
Also from the logs, the only interesting messages are as follows, etc., and then (notice the time of the message) more of the above sorts of messages (no errors or anything).
Memory usage is still growing. Here is an error that came up in the logs:
@tagomoris, could this be related to #1434?
#1467 may fix this problem.
I can and will try it soon. Sorry for the delay; I have been very busy lately. Does the same issue exist in 0.12?
I'm not sure, but I haven't received such a report from users.
Closed. If you have the same problem with the latest fluentd v0.14, please reopen it.
See also #1384 (seems similar):
- fluentd or td-agent version.
fluentd-0.14.10
- Environment information, e.g. OS.
Debian GNU/Linux 8.6 (jessie)
3.16.0-4-amd64
- Your configuration
(please excuse the possibly poor or nonstandard config – I am inheriting this from another developer and am new to fluentd)
- Your problem explanation. If you have error logs, include them as well.
We have td-agent running on several Mesos agents, tailing various log files, etc. (from the conf you will observe that we use the `*` wildcard format in `path`). It is usually tailing a large number of files that often get created and then subsequently deleted (but never rotated). A rough sketch of this kind of source configuration is included at the end of this report.
On some agents, td-agent memory usage mysteriously begins growing over the course of several days, to the point where it is using a ridiculous amount of memory and needs to be killed. Sending a `SIGTERM` via `service restart` usually works but takes some time (~10 mins), and memory usage returns to normal upon restart. I can recreate the problem (and it is currently occurring), so let me know what further diagnostic information to provide and I will be happy to help. Also, once again, please forgive the poor configuration; as I said earlier, I am inheriting this from someone else.
I am including logs and diagnostic information for both a low-memory-usage (normal operation) td-agent and a high-memory-usage (faulty operation) td-agent for ease of comparison:
- low mem td-agent.log: https://gist.github.com/hjet/329ff5abe38efbf5b68c55328a6925a1
- high mem td-agent.log: https://gist.github.com/hjet/cf39a32146edffbf61b651dff482e6e5
- monitor_agent (for both): https://gist.github.com/hjet/769e581918b69a3031657a7bfbf7dfb3
- sigdumps (low mem usage): https://gist.github.com/hjet/6b46311466d2487592db3f9feb1e0279
- sigdumps (high mem usage): https://gist.github.com/hjet/1fbead9a120fc5efb05fdc66cefee8c6
- strace (low mem usage): https://gist.github.com/hjet/b6c969196ea1a6488559c84aeb442175
- strace (high mem usage): https://gist.github.com/hjet/866c17ab9e8891dcb00c446f60ca3c28
- perf (low mem usage):
- perf (high mem usage):
- pid2line.rb (low mem usage):
- pid2line.rb (high mem usage):
Please let me know what other information I can provide to help!
Also, I am not sure if the `error="no one nodes with valid ssl session"` is part of the problem (flushing the buffer did not alleviate memory pressure), so some insight there would be appreciated as well (I am planning on fixing that issue at the same time).
Thank you!
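For reference, here is a minimal sketch of the kind of in_tail source described above (a wildcard `path` over many short-lived files). The paths, tags, and the trivial stdout output are placeholders, not the actual configuration from this report:

```
<source>
  @type tail
  # Placeholder wildcard path: many task log files are created and later
  # deleted (never rotated), so the set of watched files churns constantly.
  path /var/log/mesos-tasks/*/*.log
  pos_file /var/log/td-agent/mesos-tasks.pos
  tag mesos.task
  format none
  read_from_head true
</source>

# Placeholder output so the sketch is self-contained; the real setup
# forwarded to secure_forward / elasticsearch as discussed above.
<match mesos.task>
  @type stdout
</match>
```

The wildcard `path` is the detail worth noting: in_tail periodically rescans the glob and tracks every matching file, which is why the reporter calls out the large, constantly churning set of files.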