Pause container doesn't restart when docker restarts causing dependent containers to fail #10556
Comments
Hi @wesgur, thanks for reporting. I was able to reproduce this and am including some steps below, as well as an example of it working normally without bridge networking. I'll talk with the team to explore possible solutions and provide an update shortly.
jobfile
failure
non bridge pass
Thanks @drewbailey for the quick follow-up. Would you be able to provide me with the job file that you used for
Sure thing, it was just the equivalent of removing mode = "bridge" from the job file.
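For readers following along, here is a minimal sketch of what a group with bridge networking looks like in a job file. The job, group, and task names, the datacenter, and the image are placeholders assumed for illustration; this is not the job file from the report.

```hcl
# Hypothetical job file: all names and the image are placeholders.
job "example" {
  datacenters = ["dc1"]

  group "web" {
    # Bridge mode gives the group its own network namespace, which the
    # Docker driver keeps alive with a pause container.
    network {
      mode = "bridge"
    }

    task "server" {
      driver = "docker"

      config {
        image = "nginx:latest"
      }
    }
  }
}
```

Dropping the mode line (or the whole network block) leaves the group on the default host network mode, which is the variant that passes.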
Thanks for that. Really appreciate you looking into this. I have tried using the fallback network mode (
@wesgur I'm unable to provide an exact time frame. The team is hoping to address it in the next minor release, depending on prioritization and capacity. For a CNI workaround you may want to check out the portmap plugin here: https://www.cni.dev/plugins/current/meta/portmap/ I'd also look into host network mode, since that is the default.
Just by way of follow-up, the root technical cause here is that for tasks with When But the pause container is not run as a Nomad task. I suspect there are two reasons for this:
Some approaches that might work here, most of which introduce some risks of backwards compatibility problems:
Closed by #15732.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
Operating system and Environment details
Issue
I have a Nomad job with a task group running in network mode bridge. When we restart Docker, the job fails to come up with the following error:
The task is successfully allocated when I run the job initially. When Docker is restarted, both the pause container and the task container fail.
Reproduction steps
Restart Docker (systemctl restart docker)
Expected Result
Nomad job succeeds. Task and pause container both restarted properly.
Actual Result
Nomad job fails to allocate. Both task and pause container are failing.
Job file (if appropriate)