feat: group memory.stats sock metric #3642
Open
This adds the cgroup stat `sock` from the `memory.stats` metric to cAdvisor.
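
For readers unfamiliar with where this value comes from: on cgroup v2 the kernel exposes a flat `key value` file named `memory.stat` per cgroup, and its `sock` entry is the amount of memory used in network transmission buffers. The following is only a minimal sketch of reading that entry, not the code in this PR. The helper name `parseMemoryStat` and the sample values are hypothetical, and cAdvisor itself obtains these stats through its cgroup libraries rather than by parsing the file this way.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMemoryStat is a hypothetical helper (not part of cAdvisor) that parses
// the flat "key value" lines of a cgroup v2 memory.stat file into a map.
func parseMemoryStat(content string) (map[string]uint64, error) {
	stats := make(map[string]uint64)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		value, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", fields[0], err)
		}
		stats[fields[0]] = value
	}
	return stats, scanner.Err()
}

func main() {
	// Made-up sample content; a real file has many more keys.
	sample := "anon 1048576\nfile 2097152\nsock 4096\n"
	stats, err := parseMemoryStat(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("socket memory: %d bytes\n", stats["sock"])
}
```

In practice the content would be read from the cgroup's `memory.stat` file instead of a string literal; the exact path depends on the cgroup hierarchy of the container in question.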
The motivation is that we've seen numerous examples at DBC Digital of
application developers creating applications that exhaust socket memory,
e.g. by accidentally creating too many TCP connections and not closing
them, or keeping around a few large allocations, or many other such
issues.
Because cAdvisor currently doesn't report socket memory usage, this has
been hard to monitor, and is typically only noticed once the OOM killer
is triggered.
By adding this metric, it will be possible to proactively handle socket
memory exhaustion (which is really kernel memory exhaustion), before it
becomes a potential incident, and to create alerting and enhance
Signed-off-by: Christina Sørensen [email protected]
Notice: I've been unable to figure out how to regenerate the snapshot tests.
I've opened issue #3632 for this, but have yet to receive any replies.
I'm hoping that opening this PR will bring more attention to this change, so it can
receive feedback.