-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DO NOT MERGE] cgroup v1: optimize code to avoid parsing mountinfo if possible #2438
Conversation
f6fe9ba
to
102659c
Compare
I haven't done any measurements yet but this should be a relatively big win for cgroup v1 container start performance. |
102659c
to
407e631
Compare
Before this patch, runc run parses mountinfo 116 times. With this commit, the number goes down to 62. Which is still a lot. |
b5cc7a3
to
3106475
Compare
I'm still working on that, so let's make this a draft. |
f810dac
to
286035a
Compare
3f555f0
to
a455619
Compare
I think this one is ready but it depends on #2411 which is still not merged. |
In case cgroupPath is under the default cgroup prefix, let's try to guess the mount point by adding the subsystem name to the default prefix, and resolving the resulting path in case it's a symlink. In most cases, given the default cgroup setup, this trick should result in returning the same result faster, and avoiding /proc/self/mountinfo parsing which is relatively slow and problematic. Be very careful with the default path, checking it is - a directory; - a mount point; - has cgroup fstype. If something is not right, fall back to parsing mountinfo. While at it, remove the obsoleted comment about mountinfo parsing. The comment belongs to findCgroupMountpointAndRootFromReader(), but rather than moving it there, let's just remove it, since it does not add any value in understanding the current code. Signed-off-by: Kir Kolyshkin <[email protected]>
When paths are already set, we only need to place the PID into proper cgroups, and we already know all the cgroups path. The fs function d.path(), and the systemd v1 function getSubsystemPath() are both parsing /proc/self/mountinfo, and the only reason they are used here is to check whether the subsystem is available. Use a much simpler check instead. Frankly, I am not sure why the check is needed at all. Maybe we just need to drop it altogether. Also, for fs, since d is no longer used in this code path, moves its initialization to after it. Signed-off-by: Kir Kolyshkin <[email protected]>
We call joinCgroups() from Apply, and in there we iterate through the list of subsystems, calling getSubsystemPath() for each. This is expensive, since every getSubsystemPath() involves parsing mountinfo. At the end of Apply(), we iterate through the list of subsystems to fill the m.paths, again calling getSubsystemPath() for every subsystem. As a result, we parse mountinfo about 20 times here. Let's find the paths first and reuse m.paths in joinCgroups(). Signed-off-by: Kir Kolyshkin <[email protected]>
In all these cases, getSubsystemPath() was already called, and its result stored in m.paths map. It makes no sense to not reuse it. Signed-off-by: Kir Kolyshkin <[email protected]>
It does not add any value. Signed-off-by: Kir Kolyshkin <[email protected]>
In here, defer looks like an overkill, since the code is very simple and we already have an error path. Signed-off-by: Kir Kolyshkin <[email protected]>
Instead of iterating over m.paths, iterate over subsystems and look up the path for each. This is faster since a map lookup is faster than iterating over the names in Get. A quick benchmark shows that the new way is 2.5x faster than the old one. Note though that this is not done to make things faster, as savings are negligible, but to make things simpler by removing some code. Signed-off-by: Kir Kolyshkin <[email protected]>
All the test cases are doing the same checks, only input differs, so we can unify those using a test data table. While at it: - use t.Fatalf where it makes sense (no further checks are possible); - remove the "XXX" comments as we won't get rid of cgroup Name/Parent. PS I tried using t.Parallel() as well but it did not result in any noticeable speedup, so I dropped it for simplicity. Signed-off-by: Kir Kolyshkin <[email protected]>
I think this was added during the dark ages where cgroupfs can be mounted to /cgroup, /dev/cgroup, or some other path. Now it have finally settled on /sys/fs/cgroup, so let's remove the guessing and use the standard path. Signed-off-by: Kir Kolyshkin <[email protected]>
a455619
to
5aef675
Compare
m.paths = make(map[string]string) | ||
if c.Paths != nil { | ||
for name, path := range c.Paths { | ||
_, err := d.path(name) | ||
cgMap, err := cgroups.ParseCgroupFile("/proc/self/cgroup") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a check to see whether the subsystem is available. I am not quite sure in which scenario such a check is required. Maybe we need to just drop it.
} | ||
return err | ||
if _, ok := cgMap[name]; ok { | ||
paths[name] = path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
// Even if it's `not found` error, we'll return err | ||
// because devices cgroup is hard requirement for | ||
// container security. | ||
if s.Name() == "devices" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was moved here from joinCgroups()
(see below) so we bail out as early as in the old code in case of error.
@@ -233,48 +240,25 @@ func (m *legacyManager) Path(subsys string) string { | |||
return m.paths[subsys] | |||
} | |||
|
|||
func join(c *configs.Cgroup, subsystem string, pid int) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the body of this function (modulo getSubsystemPath
) was inlined into joinCgroups
(see below)
OK, apparently this one is way too big to review comfortably. Let me split it in a few smaller and (hopefully) more digestible PRs. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Split into a few smaller PRs; the description of this one is updated to list them all. |
Seems that only #2494 is left (and therefore now equivalent to this PR, but more up-to-date?) |
@h-vetinari this PR was split into 7 different ones, of which 5 are now merged, 6th got rewritten and merged, and yes you're right, the only one left is #2494. Once merged I'll close this one (which I ended up using to track the merge progress). |
Everything's merged now, closing |
This set of patches brings down a number of times we parse /proc/self/mountinfo during container start+stop:
(there is some improvement for other operations as well but I haven't measured those)
Ideally I'd like to bring it down to less than 5 per runc run, but this is already a super good improvement.
Update: this is being splitted into multiple smaller PRs to facilitate easier reviewing. The following list is just for my convenience to make sure I haven't forgot anything.
And here is the list of PRs this one was split into: