starting container process caused 'process_linux.go:245: running exec setns process for init caused "exit status 6"' #1130
Comments
To cut a long story short, this is the code that is failing:
/*
 * We must fork to actually enter the PID namespace, and use
 * CLONE_PARENT so that the child init can have the right parent
 * (the bootstrap process). Also so we don't need to forward the
 * child's exit code or resend its death signal.
 */
childpid = clone_parent(env, config->cloneflags);
if (childpid < 0)
    bail("unable to fork"); /* this is where exit status 6 comes from */
So, the big question is -- does your system support all of the namespaces that you're trying to use? What is the output of |
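A minimal sketch for the namespace-support question, assuming a Linux host with procfs mounted: each namespace type the running kernel exposes shows up as an entry under /proc/self/ns. The helper name `supported_namespaces` is made up for illustration. An entry being present does not prove the process is allowed to unshare that namespace.

```python
import os


def supported_namespaces(ns_dir="/proc/self/ns"):
    """Return the sorted entry names under ns_dir, or [] if it cannot be read.

    On Linux, /proc/self/ns holds one entry per namespace type the running
    kernel exposes (for example pid, net, mnt, user).
    """
    try:
        return sorted(os.listdir(ns_dir))
    except OSError:
        # No procfs here, or the path is not a readable directory.
        return []


if __name__ == "__main__":
    print(supported_namespaces())
```

If a namespace type requested in the container config is missing from that list, the kernel was most likely built without it.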
Ah, that helps explain the exit status, cheers. What's odd here is that the failure was not consistent; sometimes the The node degraded further and won't even let me
But that node does not seem to hit the same issue as the first one; all services seem to have their containers start up fine. I'll attach the info from |
@hkjn Actually, the best thing would be for you to attach an |
@cyphar |
@rajasec That's because you're trying to unshare namespaces you don't have the right to unshare. You'll have to take a look at the kernel code to figure out precisely what's happening (if you're trying to run |
+1, I have this error and don't use runC for anything (though it might be used inside Mono). It also happens intermittently, but mostly when the machine is tight on resources / overloaded. Any other tips for debugging the root cause if I'm not using runC? |
I have this error with docker (I assume docker-runc?). Not sure how I would debug it. Give me something to type and I'll type it? |
Some information that would be useful from anyone else who comments on this issue:
|
No user namespaces. I'll have to tool around with it. I don't get a container following the runc readme. Doing something daft I expect. |
@jamiethermo You can create a container like this:
Does that help? |
Ok. That works. |
Alright, it would help to know what
Then try to start a container. It will fail, but you should be able to get the Then runC should fail to start. Paste the config you got here. |
Ok. Can't do that right now. But since it seems arbitrary what is running and what is failing (the same docker image will run one minute and not the next), here's a config file that did get created. Don't know if that'll help. Will try the hack, above, tomorrow. Thanks! |
For people who get "exit status x", you can get the runc code you are using, then:
Then you can find out which It's ugly though, we should improve it someday. |
@hqhq Or you can count from the start of the file (which is what I do). Vim even has a shortcut for it. But yes, the |
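The counting approach can be sketched as a small script. This assumes the exit status is the 1-based position of the failing bail() call in the C source, counted from the top of the file, and that the macro's own `#define` line does not count; check both assumptions against the runc version you actually run. The function name and the sample source are hypothetical.

```python
import re

# Assumption: every bail( use exits with a distinct 1-based code in source
# order, and the "#define bail(...)" line itself is not a call site.
BAIL_CALL = re.compile(r"\bbail\s*\(")

SAMPLE = """#define bail(fmt, ...) do { exit(1); } while (0)
if (a < 0)
    bail("first");
if (b < 0)
    bail("second");
"""


def find_bail_line(source, exit_status):
    """Return the 1-based line number of the exit_status-th bail( call, or None."""
    if exit_status < 1:
        return None
    seen = 0
    for lineno, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith("#define"):
            continue  # skip the macro definition
        seen += len(BAIL_CALL.findall(line))
        if seen >= exit_status:
            return lineno
    return None


if __name__ == "__main__":
    print(find_bail_line(SAMPLE, 2))
```

To use it on real code, read the C file from the same runc commit as your binary and pass its text together with the status from the error message. Line numbers shift between versions, so a mismatched checkout gives a misleading answer.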
@cyphar Could I replace docker-runc with a bash script that saves off the config.json somewhere if it crashes? Could we make runc do that by default? |
You could try that. By the way, if you haven't created an upstream bug report (in Docker) please do so.
I don't want to, mainly because it'd only be helpful for debugging things in certain cases under Docker. And runC is not just used inside Docker. |
ECS team thinks this issue is causing their agent to disconnect at times. Referenced aws/amazon-ecs-agent#658 (comment) |
I "fixed" it by upgrading from Ubuntu 15.04 -> 16.04. It might be a bug in an old version that is no longer maintained.
--
Jared Broad
|
hm might have to try that |
@cyphar is there a workaround for this? besides upgrading to ubuntu 16? |
@jamesongithub It's likely that issues of this form are kernel issues (and since Ubuntu has interesting kernel policies, upgrading might be your only option), unless you have some very odd configurations. As I mentioned above, the error only tells us what line inside |
I've been having this issue with RHEL 7.3 too. Besides being inexperienced with stuff like ns and runc, I'm struggling to figure out what's going on because it's intermittent, as mentioned by @jamesongithub
|
@cyphar @rhatdan Same issue on RHEL 7.4, but exit status is 40, user namespace is enabled as per this doc: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html/getting_started_with_containers/get_started_with_docker_formatted_container_images#user_namespaces_options. On latest available kernel. |
For anyone having issues with RHEL: enable only this option, namespace.unpriv_enable=1, and not user_namespace.enable=1. Having both in the cmdline causes issues:
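A minimal sketch for checking whether a host has both parameters set, assuming the two parameter strings exactly as quoted in this comment and a Linux host that exposes /proc/cmdline. The function names are made up for illustration, and the claim that the combination is harmful is this comment's report, not something the script verifies.

```python
# Parameter strings as quoted in the comment; treated as an assumption.
UNPRIV_FLAG = "namespace.unpriv_enable=1"
USERNS_FLAG = "user_namespace.enable=1"


def has_both_userns_flags(cmdline):
    """Return True if both kernel parameters appear in the cmdline string."""
    params = set(cmdline.split())
    return UNPRIV_FLAG in params and USERNS_FLAG in params


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    print(has_both_userns_flags(read_cmdline()))
```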
|
I came here from google for a similar error. Turns out, I was trying to use the
You have to, instead, do this:
Note also (and somewhat unrelated) that I was getting similar errors on Fedora simply related to SELinux. And while I don't recommend doing the following for security reasons (see: http://stopdisablingselinux.com/), it did work for me:
|
I hit the same problem when I build and start an image.
After I cleaned up a lot of images and containers and freed the caches, the problem disappeared. But I don't think it is a cache problem, because the change in cache size was tiny. |
seems related to: |
It is a kernel bug (3.10.0-327); try updating your kernel version. |
Hi OCI folks,
We are seeing a failure to start Docker containers through runc, seemingly from this line:
This might well be a config or system issue (we're on somewhat old Kernel versions because CentOS..), but the logs don't give so much to go on here..
The man pages for setns is defining the error codes it should return:
But if the following page can be trusted, exit status 6 should be ENXIO, which is not mentioned in the man pages:
Any suggestions for how to debug further or what to check would be appreciated, thanks in advance!
Logs
System info
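On the errno question raised in this report: errno 6 is ENXIO on Linux, which Python's standard `errno` module can confirm. Per the maintainer's reply in this thread, though, the "exit status 6" here comes from a bail() call in runc's C code, so the number should not be read as an errno. The helper name `describe_errno` is made up for illustration.

```python
import errno
import os


def describe_errno(code):
    """Return a short description of an errno value on the current platform."""
    name = errno.errorcode.get(code, "unknown")
    return f"{code} = {name} ({os.strerror(code)})"


if __name__ == "__main__":
    print(describe_errno(6))
```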