-
Notifications
You must be signed in to change notification settings - Fork 355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thaw a paused container in cgroup v1 when it is forcely deleted. #1204
Conversation
If force is not given, and cgroups is v1 and container is frozen, display error saying could not be killed as it was forzen. If force option is given, and cgroups is v1 and container is frozen, thaw it and send the kill signal. If cgroups is v2 , nothing special needs to be done. Fix: youki-dev#1129 Signed-off-by: Chen Yiyang <[email protected]>
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #1204 +/- ##
==========================================
- Coverage 69.29% 68.77% -0.53%
==========================================
Files 118 119 +1
Lines 12474 12595 +121
==========================================
+ Hits 8644 8662 +18
- Misses 3830 3933 +103 |
Hey @cyyzero , I was going through implementation of runc of this, and one thing I noticed is that for single or multiple (https://github.com/opencontainers/runc/blob/main/libcontainer/container_linux.go#L370) , they first signal the processes, in the frozen state ( in fact they seem to freeze the cgroup before sending kill signal in case of all https://github.com/opencontainers/runc/blob/1c3b8dbaf440d16653d834b612258bbb28268730/libcontainer/init_linux.go#L530) then send kill signal, and then thaw the cgroup In the PR you seem to thaw the cgroup before sending the kill signal. One of the issues with this is that between the time you get all processes from the cgroup and the time you send signal, there might be additional processes spawned as cgroup is not frozen. The runc's way seems to avoid it. Also once you are done with the initial implementation that is in the current state, can I ask you to add Thank you! |
I might know why integration test TestContainerKillInitPidHost fails. I output command lines of each execution of youki:
TestContainerKillInitPidHost runs the container without creating pid namespace, so the child process will not exit automatically after killing the init process, it is necessary to use I haven't read the containerd or shim-v2 source code , so I'm not quite clear why the kill command is called multiple times. It seems the first In fact, in the runc source code, it doesn't do status check for |
The latest commit seems to be able to pass cc @YJDoc2 |
Hey @cyyzero Thanks a lot for taking time and updating the PR! It is great that more integration tests are passing now! I'll take a look at the code around weekend, and get back here. Thanks :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, great work! I have left some comments, please take a look.
} | ||
} | ||
self.set_status(ContainerStatus::Stopped).save()?; | ||
std::process::exit(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we had exit here in the original implementation... libcontainers is a library and should not call exit unless extreme conditions. Ideally we should let users of the library decide what to do after calling this function. It would be a better idea to return Ok(())
from here. @Furisto if you get time, can you check and let know if there was any particular reasoning behind exiting from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. return Ok(())
is better than exiting here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make that change, where we return the result instead of exiting? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@YJDoc2 Sure😄
@cyyzero Apologies for the delay in my response. I have updated the comments, and if you can change the process exit to return result and add the @utam0k , as this does change the behavior of kill slightly, and adds an Again, apologies to both for the delay on my side 😅 |
I'm planning to release v0.0.4 soon if the contained test is passed with youki after merging this PR, so there is no problem. Thanks for your concern ;) |
To pass containerd integration test TestContainerKillInitPidHost, we need to allow a container to be killed --all again when its status is stopped. Fix: youki-dev#1225 Signed-off-by: Chen Yiyang <[email protected]>
https://github.com/containers/youki/blob/6d05dd2f60198aaecb7c93d08c08d7db4fbc6600/crates/liboci-cli/src/kill.rs#L5-L11 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm 👍
Hey @cyyzero Thanks for your contribution and efforts ! :) |
If force is not given, and cgroups is v1 and container is frozen, display error saying could not be killed as it was forzen. If force option is given, and cgroups is v1 and container is frozen, thaw it and send the kill signal. If cgroups is v2 , nothing special needs to be done.
Fix: #1129
Signed-off-by: Chen Yiyang [email protected]