Missing /sys/fs/cgroup/cpuacct,cpu #145
Comments
Note that this reversal only happens within docker. Outside of docker, I see |
@crawford interesting. Thanks for adding that. @derekwaynecarr does this truly look like the same issue you had seen before? |
Do we have a good understanding of how hard this will be to fix? I know @derekwaynecarr noted he's looked at this before and thought it had been fixed already. |
To notify those following this issue: work on identifying the problem has started. |
Here's what I found so far with
Setup
$ vagrant box add --name RHCOS rhcos-vagrant-libvirt.box
$ mkdir rhcos && cd rhcos && vagrant init RHCOS && vagrant up
$ vagrant ssh
Link to Vagrant box binary: http://aos-ostree.rhev-ci-vms.eng.rdu2.redhat.com/rhcos/images/cloud/latest/
RPM Overlaying
$ sudo ostree admin unlock --hotfix
$ rpm -qa | grep docker
Docker version 1.13.1-70 RHEL7
Ran the following commands to start the Kubelet:
Which gave me the following output:
So it looks like the error with
Updated: see the comments below; the tests described in this comment were insufficient to identify the problem. |
I've encountered an error when mounting NFS shared folders, i.e. at
The full error log is in this gist. |
@Bubblemelon this makes me wonder if the fix was applied at build time via a patch. It may be worth using |
Using this Libvirt howto guide to verify the assumptions in my above comment about docker's
Master Node Info
RHCOS version: source
Docker Version: 2018-04-30 15:56:58
Output from
In trying to resolve I found this openshift issue #18776: To place
within
|
Related: coreos/bugs#1435 |
The error above
can be resolved by adding
After running,
$ rpm -qa | grep docker
docker-client-1.13.1-63.git94f4240.el7.x86_64
docker-rhel-push-plugin-1.13.1-63.git94f4240.el7.x86_64
docker-common-1.13.1-63.git94f4240.el7.x86_64
docker-1.13.1-63.git94f4240.el7.x86_64
docker-novolume-plugin-1.13.1-63.git94f4240.el7.x86_64
docker-lvm-plugin-1.13.1-63.git94f4240.el7.x86_64 |
Great work debugging @Bubblemelon! |
Also thank you @crawford for helping me! Just to clarify, something on the
I've also tried it out with this docker version: source - Sun, 08 Jul 2018 09:39:40 UT
docker-1.13.1-72.git6f36bd4.el7.x86_64
docker-rhel-push-plugin-1.13.1-72.git6f36bd4.el7.x86_64
docker-client-1.13.1-72.git6f36bd4.el7.x86_64
docker-lvm-plugin-1.13.1-72.git6f36bd4.el7.x86_64
docker-common-1.13.1-72.git6f36bd4.el7.x86_64
docker-novolume-plugin-1.13.1-72.git6f36bd4.el7.x86_64
Which gave the same error. |
I'd like to note that
That version of kubelet should include this fix |
@derekwaynecarr what are your thoughts on this? |
cadvisor doesn't like /sys:/sys:ro. See google/cadvisor#1843 |
This same error,
Still occurs when
Note that on RHCOS, the file is in this format:
If both of these were added, under
This error would occur:
kubelet.service holdoff time over, scheduling restart.
Starting Kubernetes Kubelet...
Started Kubernetes Kubelet.
container_linux.go:247: starting container process caused "process_linux.go:364: container init caused
\"rootfs_linux.go:54: mounting \\\"/sys/fs/cgroup/cpu,cpuacct\\\" to rootfs
\\\"/var/lib/docker/overlay2/8c95a16f4cad1f014091093c62248c6c0f27bcde879606cef6220f7db4521708/
merged\\\" at \\\"/var/lib/docker/overlay2/8c95a16f4cad1f014091093c62248c6c0f27bcde879606cef6220f7db4521708/
merged/sys/fs/cgroup/cpuacct,cpu\\\" caused \\\"no space left on device\\\"\""
/usr/bin/docker-current: Error response from daemon: oci runtime error: Failed to remove paths:
map[cpu:/sys/fs/cgroup/cpu,cpuacct/system.slice/docker-afc3a2d6c323ed28a6c7e6586239cb4db8b79b591513eb229ca6fa1eb0bead3b.scope
cpuacct:/sys/fs/cgroup/cpu,cpuacct/system.slice/docker-afc3a2d6c323ed28a6c7e6586239cb4db8b79b591513eb229ca6fa1eb0bead3b.scope]. |
@crawford do you mind stating what priority you think this should have? Or if the workaround in use should be applied in the RHCOS spins itself? This would clarify if @Bubblemelon and @mrunalp should keep digging on this specific issue. |
This needs to be fixed in the Kubelet. If the OS team is going to tackle that, then I think this bug should stay. Otherwise, let's close this and let @derekwaynecarr and his team tackle the issue. Either way, this is a low priority. I have a workaround (it's ugly, but it works). |
Since this is kubelet related we should pass it over to @derekwaynecarr's team and link back to this issue so they don't have to re-do all of the good debugging done so far. |
Moved this issue over to openshift/origin |
Closing since the fix must be done in another codebase. |
@crawford has found in tests that /sys/fs/cgroup/cpuacct,cpu is expected, but RHCOS provides /sys/fs/cgroup/cpu,cpuacct. kubernetes/kubernetes#32728 (comment) describes a similar issue. The workaround is to set up a link from one to the other.
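For reference, the link workaround could look roughly like the sketch below. This is not the exact workaround used in this thread; the function name link_cpu_cgroup and its optional root argument are hypothetical, added so the snippet can be tried against a scratch directory before touching a real host.

```shell
#!/bin/sh
# Sketch only: make the name that is expected (cpuacct,cpu) resolve to the
# directory that RHCOS provides (cpu,cpuacct).
# link_cpu_cgroup and its optional root argument are hypothetical names.
link_cpu_cgroup() {
    root="${1:-/sys/fs/cgroup}"
    provided="$root/cpu,cpuacct"
    expected="$root/cpuacct,cpu"

    # Bail out if the host does not have the layout described in this issue.
    if [ ! -d "$provided" ]; then
        echo "missing $provided" >&2
        return 1
    fi

    # Create a relative symlink only if nothing exists at the expected name.
    if [ ! -e "$expected" ] && [ ! -L "$expected" ]; then
        ln -s "cpu,cpuacct" "$expected"
    fi
}
```

On a real host this would need to run as root (link_cpu_cgroup with no argument), and it assumes /sys/fs/cgroup is writable at that point; if it is mounted read-only, it may need to be remounted read-write first. The link also would not persist across reboots, so it would have to be re-created at boot.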