Running prefect local agent in a docker container leads to zombie apocalypse ;-) #2418
Comments
What I am finding is that each run produces three subprocesses. The process with the smallest PID takes the longest to run and seems to be reaped eventually. The other two processes seem to exit more quickly but are never reaped. Thus each flow run adds a net two zombies. |
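The symptom described above can be reproduced outside Prefect. What follows is a minimal sketch, assuming a Linux host with /proc mounted; proc_state and demo_zombie are hypothetical helper names, not part of Prefect. It starts a short-lived child, checks that the kernel reports it in state Z (zombie) once it has exited, and checks that the entry disappears after the parent waits on it.

```python
import subprocess
import sys
import time


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only).

    Returns None when the pid no longer exists.
    """
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The format is "pid (comm) state ..."; comm may itself contain ")" or
    # spaces, so split on the last ")".
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie(timeout=10.0):
    """Spawn a child that exits at once; report its state before and after reaping."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    deadline = time.monotonic() + timeout
    # Deliberately avoid child.poll() and child.wait() here: either would reap it.
    while proc_state(child.pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.05)
    before = proc_state(child.pid)  # state while the exit status is uncollected
    child.wait()                    # the parent collects the exit status
    after = proc_state(child.pid)   # state once the process table entry is freed
    return before, after


if __name__ == "__main__":
    print(demo_zombie())
```

A zombie holds no memory of its own, but it does keep a slot in the process table, which is why an unbounded number of them eventually causes trouble on the host.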
Congratulations @mcg1969 I think this means that you are patient zero! I will look into this behavior. What are you using as the base image for your container? |
I'm afraid I can't share the exact container, though I don't mind that you know it's the one that we use inside of Anaconda Enterprise, and @jcrist might have some familiarity with that. That said, it's based on a CentOS 7.3 base image, with Miniconda installed within. I'm happy to share the precise conda environment I was using too if that helps. |
No worries! Was only wondering if it had some possible weird dependencies but this is enough information to go off of 😄 |
Here's the conda environment, re-creatable with
The file:
|
I wouldn't say there's anything about the container I would expect to cause problems. Anything is possible, of course. But the container doesn't have an init process. |
I have been able to verify that adding an |
Glad to hear it! Currently it looks like we're implicitly relying on the init process to prune orphaned processes (which IMO is fine, if not ideal). We could possibly fix this in the future, but for now I think I'm fine saying that we require an init process when using the local agent. Leaving it open though. Thanks for the report @mcg1969! |
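To illustrate what "relying on the init process" means here, the following is a rough sketch of the reaping half of an init process. It is an assumption-laden illustration, not how Prefect or any real init is implemented: run_with_reaper is a hypothetical name, and the sketch does not forward signals, which a real init would also do.

```python
import os
import subprocess
import sys


def run_with_reaper(argv):
    """Run argv as a child and reap every child that exits until argv itself exits.

    Hypothetical sketch of the reaping done by an init process.
    """
    main = subprocess.Popen(argv)
    while True:
        # Blocks until ANY child exits. Orphans are only re-parented to this
        # process when it runs as PID 1 (or is registered as a subreaper).
        pid, status = os.wait()
        if pid != main.pid:
            continue  # an adopted orphan: collecting its status is all it needs
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = 128 + os.WTERMSIG(status)  # shell convention for death by signal
        main.returncode = code  # tell Popen the child has already been collected
        return code


if __name__ == "__main__":
    sys.exit(run_with_reaper(sys.argv[1:] or [sys.executable, "-c", "pass"]))
```

A wrapper like this only helps when it is the first process in the container. In practice a dedicated init such as tini, or Docker's --init flag, is the usual choice.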
I think that's reasonable—a doc fix would be great to consider! |
Just adding here from IRL convo: we think the docs note should be on the page describing the local agent. |
Description
I'm running prefect agent inside of a Docker container with local execution. Each run of the process leaves a zombie process, a phenomenon which if left unchecked eventually causes deleterious effects. I noticed this because I was at one point unable to ssh into the node on which the container was running.
Expected Behavior
Somehow the completed processes should be harvested to remove the zombies.
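One way a parent process can harvest its own finished children is a non-blocking waitpid loop. The sketch below is only an illustration of that technique, not a description of what the Prefect agent does; reap_finished_children and spawn_and_reap are hypothetical names, and the code assumes a POSIX system where os.fork is available.

```python
import os
import time


def reap_finished_children():
    """Collect every child that has already exited; return the list of their pids."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


def spawn_and_reap(n, timeout=10.0):
    """Fork n children that exit immediately, then harvest them without blocking."""
    pending = set()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit at once, skipping Python cleanup handlers
        pending.add(pid)
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        pending.difference_update(reap_finished_children())
        time.sleep(0.05)
    return pending  # empty when every child was harvested


if __name__ == "__main__":
    print(spawn_and_reap(2))
```

Note that waitpid(-1, ...) collects any child, including ones started through the subprocess module, whose Popen objects would then no longer see the real exit status. That trade-off is one reason to leave reaping to an init process instead.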
Reproduction
My shell script does
(note: removing the exec doesn't help). Here's a simple script to create a flow that runs on a schedule:
Environment
The container is built on CentOS 7.3. It does not have an init process.