Container Timeout #6412
Comments
Would the container get the stop signal (the --stop-signal value, SIGTERM by default) or SIGKILL? If I restart the container, does it run for another 30 seconds? |
@mheon @vrothberg @baude WDYT? |
We would need to wire this into conmon, since it is the only thing left running when you run with --detach. |
This sounds completely reasonable to me. |
My concern would be the complexity of stopping the container exclusively from within Conmon. I don't really want to implement the full logic of |
theoretically, we could also pass conmon a list of args for a podman call, like we do for exit command |
another thing to note is that conmon doesn't technically know when a container starts. It knows when the container starts logging things, and starts behaving like it's started, but this is not precise. We'd have to have podman send data down a pipe to tell conmon "hey, I started the container", and then we'd start the timeout. |
I don't see this as an --rm-timeout. I don't think --rm is required. |
Doesn't conmon get the contents of the --stop-signal? |
Podman forwards the It is therefore my understanding that
This is a singleton container running untrusted code. We never want to restart it. That would allow the untrusted code to persist across the timeout. |
@vrothberg Could you take this one on? |
This looks like a larger chunk of work as it spans across libpod and conmon. I think that I should rather work on the parallel-copy detection over in c/image. WDYT? |
Sounds good, we can give this to @ashley-cui @QiWang19 @ParkerVR or @ryanchpowell Or anyone else who wants to grab it. |
A friendly reminder that this issue had no activity for 30 days. |
@QiWang19 PTAL |
A friendly reminder that this issue had no activity for 30 days. |
@QiWang19 Did you get a chance to look at this? |
I haven't started working on this yet, but I can add it to my list. |
A friendly reminder that this issue had no activity for 30 days. |
Hi folks, I'd also love to have this feature for a system running long-running number-crunchy jobs. |
Interested in opening a PR for this feature? |
I'm not sure I understand the architecture well enough to know where to start. Does this go into podman? Conmon? |
Both. You would need a way to trigger the command within podman. Basically, add a --timeout flag (and maybe --timeout-signal) that tells conmon when to kill the container. For example, run with --timeout 20m: after 20 minutes, conmon would send a stop signal to pid 1, and 10 seconds later send the kill signal. |
After staring at the code for a couple of hours, I'm still not quite sure where I'd pass a new flags.UintVar(&runOpts.Timeout, "timeout", 0, "Stop the container after [timeout] seconds, or 0 to not time out the container") ends up generating a I also still have no idea where in conmon I'd send the stop signal to the container's pid 1. That's about all the time I had to spend on something I can script with a timed callback to run |
Some first issues are meatier than others, thanks for trying. |
If nobody picked it up by the next time I have a day or two to spare I might give it another shot. 🙂 |
Sounds good. |
A friendly reminder that this issue had no activity for 30 days. |
@kblin Did you ever get a chance to look at this? |
I didn't have the time to spare yet. |
A friendly reminder that this issue had no activity for 30 days. |
This option allows users to specify the maximum amount of time to run before conmon sends the kill signal to the container. Fixes: containers#6412 Signed-off-by: Daniel J Walsh <[email protected]>
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind feature
Description
I want to use podman run --rm ... to run a container that is removed on exit. I also want the container to be forcibly killed if it is still running after n seconds.
Yes, I can write a podman state manager to call podman stop. But that requires me to do a lot of work. It is also racy and bug prone.
I could use the timeout command (from coreutils) to send a signal. But podman has only two modes for signal handling. The default mode forwards the signal to pid 1 in the container. However, pid 1 could just ignore the signal to bypass the timeout. If I use --sig-proxy=false then podman doesn't forward the signal to the container. But it also doesn't stop the container. Therefore, the container bypasses the timeout.
I tried looking at the --timeout option for conmon, but that doesn't do what we need.
I see two ways forward.
1. Add support for --sig-proxy=stop. This mode would not proxy the signal to the container and would instead terminate the container (and, implicitly, do the --rm). Then timeout support could be implemented using the timeout utility from coreutils.
2. Add a new option for --timeout=n which would cause podman run --rm --timeout=30 to forcibly shut down the container and remove it after 30 seconds. I think I would prefer this option since it doesn't require another process.
to forcibly shut down the container and remove it after 30 seconds. I think I would prefer this option since it doesn't require another process.The text was updated successfully, but these errors were encountered: