-
Notifications
You must be signed in to change notification settings - Fork 324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix cleanup of /run/crun state dir when cgroup failed to mount #1456
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
src/libcrun/status.c
Outdated
@@ -485,7 +486,20 @@ rmdirfd (const char *namedir, int fd, libcrun_error_t *err) | |||
ret = unlinkat (dirfd (d), de->d_name, 0); | |||
if (ret < 0) | |||
{ | |||
retry_unlink: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure but indentation looks off here, could you do clang-format
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the CI fails for clang-format
as well, could you please run clang-format
and repush it. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, not used/didn't notice this style for goto labels, fixed.
Note make clang-format
modifies quite a few other files, might want to do a "fix formatting" commit at some point
749bf3f
to
39ab155
Compare
I think we should rebase on top of #1457 once it is merged |
When running a container with /sys/fs/cgroup already mounted (e.g. --volume=/sys:/sys) and another shared volume somewhere (e.g. --volume=/mnt:/mnt:shared), the move mount fails with EINVAL: ``` mount("/run/crun/6d75ef05e540486f21ef54743f8af97f4dff0d0d7c54d9b7852cd5302e884db8/tmpmount", "/proc/self/fd/8", NULL, MS_MOVE, NULL) = -1 EINVAL (Invalid argument) ``` This leaves tmpmount as a mountpoint behind, which then fails to cleanup: ``` openat(AT_FDCWD, "/run/crun", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = 3 openat(3, "0d1342f7a7375593813287c743170bb71f30f9a64fa8bc141d3cf9b8e9aa5a89", O_RDONLY|O_DIRECTORY) = 4 unlinkat(4, "tmpmount", 0) = -1 EISDIR (Is a directory) unlinkat(4, "tmpmount", AT_REMOVEDIR) = -1 EBUSY (Device or resource busy) close(4) = 0 unlinkat(3, "0d1342f7a7375593813287c743170bb71f30f9a64fa8bc141d3cf9b8e9aa5a89", AT_REMOVEDIR) = -1 ENOTEMPTY (Directory not empty) ``` This in turn ultimately fails libcrun_container_delete_status() and friends, leading to the non-execution of post hooks in this reproducer: ``` $ mkdir hooks.d $ cat > hooks.d/hook.json <<'EOF' { "version": "1.0.0", "when": {"always": true}, "hook": { "path": "/bin/sh", "args": ["/bin/sh", "-c", "date >> /tmp/hookme"] }, "stages": ["poststop"] } EOF $ podman --runtime=path/crun run \ --net=none --name test -d --replace --hooks-dir=$PWD/hooks.d \ --volume=/sys:/sys --volume=/mnt:/mnt:shared \ docker.io/alpine true ``` (the hook is eventually run on podman rm, but not by crun immediately on container stop) Just cleaning up the mount point makes the problem go away. Fixes: 523eed3 ("linux: add new fallback when mount fails with EBUSY") Signed-off-by: Dominique Martinet <[email protected]>
The previous patch left a mount point in /run/crun which could not be removed, we should try harder to remove directories failing with EBUSY as we do in create_missing_devs(). This patch also fixes the issue described in the previous commit ("status: rmdirfd: try harder to remove mount points"). Fixes: 523eed3 ("linux: add new fallback when mount fails with EBUSY") Signed-off-by: Dominique Martinet <[email protected]>
39ab155
to
4618d50
Compare
Done, tests now pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This fixes a regression introduced in crun 1.9.1 with 523eed3
When running a container with /sys/fs/cgroup already mounted (e.g. --volume=/sys:/sys) and another shared volume somewhere (e.g. --volume=/mnt:/mnt:shared), the move mount fails with EINVAL:
This leaves tmpmount as a mountpoint behind, which then fails to cleanup:
This in turn ultimately fails libcrun_container_delete_status() and friends, leading to the non-execution of post hooks in this reproducer:
(the hook is eventually run on podman rm, but not by crun immediately on container stop)
This PR has two commits, either of which fix this issue:
happy to drop either commit, but I think it's worth keeping the safety as such mount points are easy to forget.