docker is required for container runtime even though I am using containerd #2364

Closed
brianmay opened this issue Dec 15, 2020 · 29 comments
Labels: area/UX, kind/feature

@brianmay

Is this a BUG REPORT or FEATURE REQUEST?

Choose one: BUG REPORT

Versions

kubeadm version (use kubeadm version):

root@kube-master:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.0", GitCommit:"af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38", GitTreeState:"clean", BuildDate:"2020-12-08T17:57:36Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version): v1.20.0
  • OS (e.g. from /etc/os-release): Debian/buster
  • Kernel (e.g. uname -a): 4.19.0-13-amd64

What happened?

kubeadm upgrade node tries to run docker, but I have switched to containerd:

root@kube-master:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.0", GitCommit:"af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38", GitTreeState:"clean", BuildDate:"2020-12-08T17:57:36Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
root@kube-master:~# kubeadm config images pull 
[config/images] Pulled k8s.gcr.io/kube-apiserver:v1.20.0
[config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.20.0
[config/images] Pulled k8s.gcr.io/kube-scheduler:v1.20.0
[config/images] Pulled k8s.gcr.io/kube-proxy:v1.20.0
[config/images] Pulled k8s.gcr.io/pause:3.2
[config/images] Pulled k8s.gcr.io/etcd:3.4.13-0
[config/images] Pulled k8s.gcr.io/coredns:1.7.0
root@kube-master:~# kubeadm upgrade node 
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W1215 11:40:34.109125   26843 kubelet.go:200] cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': executable file not found in $PATH
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: docker is required for container runtime: exec: "docker": executable file not found in $PATH
To see the stack trace of this error execute with --v=5 or higher

What you expected to happen?

"kube upgrade node" like "kubeadm config images pull" should run cri commands, not docker commands.

I think the "docker info" part is related to #2270 - but in that case it is a warning only.

But It looks like the last message is a hard error.

@neolit123
Member

neolit123 commented Dec 15, 2020

Your workaround for now is to skip the phase with --skip-phases=preflight.
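
For example, a minimal sketch of that workaround on the node being upgraded (the rest of the upgrade flow is assumed to be unchanged):

# skip the preflight phase that shells out to docker
kubeadm upgrade node --skip-phases=preflight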

kubeadm config images pull

I just tried removing the docker binary and this command fails for me too.
kubeadm config images pull has no logic to detect whether you want to use crictl (containerd) unless you pass --config (with the socket value) or --cri-socket. I don't see how this command is passing in your case; you should be getting the same error:

"docker": executable file not found in $PATH

kubeadm upgrade node

kubeadm upgrade node currently has no way to accept a CRI socket (e.g. via --cri-socket), so it just defaults to the docker socket and therefore to using the docker CLI.

One solution here is to fetch the CRI socket from this Node object, but this means we need to know the node name.

kubectl get no controlplane -o yaml | grep cri
    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock

the alternative is to require the user to pass --cri-socket if they want to use a container runtime != docker.

@neolit123
Member

neolit123 commented Dec 15, 2020

We did remove the --cri-socket flag for upgrade apply with the argument that it should be fetched from the cluster:
kubernetes/kubernetes#85044
#1356

so it seems appropriate to fetch it from the Node object.

cc @fabriziopandini @SataQiu WDYT?

BTW @SataQiu looks like this wasn't a sufficient fix:
kubernetes/kubernetes#94555

... or instead of fetching the Node cri-socket, we may have to apply CRI socket detection here:
https://github.com/kubernetes/kubernetes/blob/89ba90573f163ee3452b526f30348a035d54e870/cmd/kubeadm/app/cmd/upgrade/node.go#L148
and here:
https://github.com/kubernetes/kubernetes/blob/0b92e8b16d00712594493072710f81b8d37ce623/cmd/kubeadm/app/cmd/upgrade/common.go#L76

/kind feature
/area ux

@k8s-ci-robot added kind/feature and area/UX labels Dec 15, 2020
@neolit123
Member

neolit123 commented Dec 15, 2020

@brianmay I'd assume this is a problem for upgrade apply too?
(except that upgrade apply allows passing an InitConfiguration.NodeRegistrationOptions.CRISocket via --config)
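
A rough sketch of that --config route, assuming the v1beta2 kubeadm API; the file name and target version are only illustrative, and the exact interaction of --config with upgrade apply may vary by version:

# minimal config carrying the CRI socket; other fields are defaulted
cat <<EOF > upgrade-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
EOF
# illustrative invocation with the target version used elsewhere in this thread
kubeadm upgrade apply v1.20.0 --config upgrade-config.yaml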

@neolit123 added this to the v1.21 milestone Dec 15, 2020
@brianmay
Author

Yes, this is on upgrades. As above, it looks like there might be a workaround via the --config parameter. Will try ASAP.

@brianmay
Author

Sorry, ignore my previous response. I was getting confused.

What is the difference between "upgrade apply" and "upgrade node"? Can I pass a config file that only sets CRISocket or do I need to set all the other values too?

@rsoika

rsoika commented Dec 29, 2020

I have the same issue when calling print-join-command.
I installed kubernetes v1.20.1 on Debian with containerd. The master and worker nodes work.
But if I call on the master node:

kubeadm token create --print-join-command

I got:

W1229 21:59:35.779624   18425 kubelet.go:200] cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': executable file not found in $PATH
kubeadm join 10.0.0.2:6443 --token 4fd7i0.v8kddn2yfrgtp544     --discovery-token-ca-cert-hash sha256:.........

I did not really understand all the discussion about this warning. Should we ignore this? Joining a worker node to the master works fine - even without the docker daemon installed. And the cluster seems to work.

@brianmay
Author

brianmay commented Dec 30, 2020

@rsoika I believe that warning can be ignored. It is the hard error I was getting that cannot be ignored.

I am hoping I might be able to resolve this without converting my cluster back to docker... But so far everyone seems to be rather quiet on the subject of a solution or even a workaround.

Unless of course kubeadm 1.20.1 has made any changes to fix this?

@neolit123
Member

we should fix this after the holidays.

@brianmay
Author

@neolit123 Great news, thanks.

@pacoxu
Member

pacoxu commented Dec 31, 2020

After I remove /var/run/dockershim.sock and /var/run/docker.sock, the command works.

@neolit123
Member

neolit123 commented Dec 31, 2020

Removing /var/run/docker*.sock is actually a good solution. When no config file (with an explicit socket) is passed to a kubeadm command and the docker socket is present on the host, it will take priority.
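
A minimal sketch of that cleanup, assuming the stale sockets are the only docker leftovers (verify nothing on the host still uses them first):

# remove the leftover docker/dockershim sockets so kubeadm stops auto-detecting docker
sudo rm -f /var/run/docker.sock /var/run/dockershim.sock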

@brianmay
Author

In my case I don't have anything that matches /var/run/docker*, or anything that looks like these sock files. I think I might have deleted them already.

I do have a /var/lib/docker and a /var/lib/dockershim/ (is it safe to delete these?), but I am a little skeptical that these directories are confusing kubeadm.

@AleksandrNull

AleksandrNull commented Jan 4, 2021

Here is a KISS workaround:
echo '#!/bin/sh' > /sbin/docker && chmod 0100 /sbin/docker
and then run the kubeadm upgrade apply or kubeadm upgrade node command. Don't forget to delete /sbin/docker after the upgrade is complete :)

@brianmay
Author

brianmay commented Jan 4, 2021

@AleksandrNull So I guess this means that the docker calls aren't actually required for the upgrade to work? If so, good to know.

@jeanluclariviere

@AleksandrNull So I guess this means that the docker calls aren't actually required for the upgrade to work? If so, good to know.

I was hitting this exact error when trying to use kubeadm upgrade apply. @AleksandrNull solution worked perfectly and I was able to upgrade my dev cluster to 1.19.5 this morning.

@AleksandrNull

@brianmay That's correct. It basically checks for the docker binary and tries to pre-pull images. Pulling images using docker is useless with containerd as the default runtime, so this "mock" does no harm.

@neolit123
Member

neolit123 commented Jan 8, 2021

@brianmay I tested and looked at the code today; it looks fine.

My guess is that you switched to containerd but the CRI socket on that Node object still points to docker.
What is the output of this command on that particular Node?

kubectl get no controlplane -o yaml | grep kubeadm.alpha.kubernetes.io/cri-socket

If you patch/edit the kubeadm.alpha.kubernetes.io/cri-socket value, the kubeadm command should work.
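
One way to do that edit (a sketch; the node name is a placeholder, and the same command is refined later in this thread):

# <node-name> is a placeholder for the control-plane node name
kubectl annotate node <node-name> --overwrite kubeadm.alpha.kubernetes.io/cri-socket=unix:///run/containerd/containerd.sock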

kubeadm does not really support switching container runtimes on the fly or similar reconfiguration during upgrade...
we are in the process of writing guides around in-place container runtime replacement.

please check this discussion:
kubernetes/website#25787 (review)

and watch this ticket:
kubernetes/website#25879

@neolit123
Member

This PR should make all commands that don't need the container runtime stop checking for docker or crictl:
kubernetes/kubernetes#97625

@neolit123
Member

neolit123 commented Jan 8, 2021

@pacoxu would you have time to backport your PR to 1.18, 1.19, 1.20?

@pacoxu
Member

pacoxu commented Jan 8, 2021

@neolit123 ok let me do it

@brianmay
Author

brianmay commented Jan 8, 2021

So for every control plane node I get:

$ kubectl get no kube-master -o yaml | grep kubeadm.alpha.kubernetes.io/cri-socket
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock

Can I confirm that this - or something similar - is the correct command to fix it (for every control plane node):

kubectl annotate node master kubeadm.alpha.kubernetes.io/cri-socket unix:///run/containerd/containerd.sock

If I had known that you were still writing migration documentation, I might have waited. It is perhaps unfortunate that dockershim was announced as deprecated - you should migrate over, etc. - before the documentation was complete. And often projects don't bother with upgrade instructions :-(.

But regardless, thanks for the references supplied above to the PR and issue.

For the record, the migration was relatively straightforward. Nothing on my system depends on Docker, except the CNI file, which was somewhat painful to work out, particularly as I am using dual IPv4 and IPv6 and need to supply multiple subnets. Supposedly the auto-generated file was supposed to appear in my logs from before the migration, but I looked and looked and couldn't find it. I think I worked it out, but my solution does involve hard-coding the nodes' subnet ranges. IIRC I tried "usePodCidr" for the IPv4 subnet and got loud objections. It would be nice if I didn't have to do this, but it is acceptable for this cluster.

{
    "cniVersion": "0.3.1",
    "name": "kubenet",
    "type": "bridge",
    "bridge": "cbr0",
    "mtu": 1500,
    "isGateway": true,
    "ipMasq": true,
    "hairpinMode": false,
    "ipam": {
        "type": "host-local",
        "ranges": [
            [
                { "subnet": "10.1.0.0/16" }
            ],
            [
                { "subnet": "fc00:1::/32" }
            ]
        ],
        "routes": [
            { "dst": "0.0.0.0/0" },
            { "dst": "::/0" }
        ]
    }
}

@neolit123
Member

If I had known that you were still writing migration documentation, I might have waited. It is perhaps unfortunate that dockershim was announced as deprecated - you should migrate over, etc. - before the documentation was complete.

i warned about this on the deprecation PR:
kubernetes/kubernetes#94624 (comment)

@neolit123
Member

kubectl annotate node master kubeadm.alpha.kubernetes.io/cri-socket unix:///run/containerd/containerd.sock

we should include this in the migration guide (TBD).

For the record, the migration was relatively straightforward. Nothing on my system depends on Docker, except the CNI file, which was somewhat painful to work out, particularly as I am using dual IPv4 and IPv6 and need to supply multiple subnets. Supposedly the auto-generated file was supposed to appear in my logs from before the migration, but I looked and looked and couldn't find it. I think I worked it out, but my solution does involve hard-coding the nodes' subnet ranges. IIRC I tried "usePodCidr" for the IPv4 subnet and got loud objections. It would be nice if I didn't have to do this, but it is acceptable for this cluster.

I didn't have to do this when I tried migrating docker -> containerd, but that was a single-stack IPv4 setup.
We can consult with SIG Network if this becomes a common issue.

Closing, as the main issue is explained and the side issues were addressed by the PR.
/close

@k8s-ci-robot
Contributor

@neolit123: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@brianmay
Author

brianmay commented Jan 9, 2021

Revised the above command:

kubectl annotate node kube-master --overwrite kubeadm.alpha.kubernetes.io/cri-socket=unix:///run/containerd/containerd.sock

It looks good now.

@KeithTt

KeithTt commented Aug 10, 2022

Revised the above command:

kubectl annotate node kube-master --overwrite kubeadm.alpha.kubernetes.io/cri-socket=unix:///run/containerd/containerd.sock

It looks good now.

I think this command needs to communicate with the apiserver; if the apiserver is stopped, how do I update the value of kubeadm.alpha.kubernetes.io/cri-socket?

I refer to this guide: https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/change-runtime-containerd/

  1. stop kubelet
  2. configure and start containerd
  3. configure kubelet to use containerd
    • update the file /var/lib/kubelet/kubeadm-flags.env (see the sketch after this list)
    • kubectl edit no <node-name> to change the value of kubeadm.alpha.kubernetes.io/cri-socket from /var/run/dockershim.sock to /var/run/containerd/containerd.sock
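
For reference, a rough sketch of what the relevant line in /var/lib/kubelet/kubeadm-flags.env might look like after step 3; the flags mirror the script later in this thread and vary by kubelet version and setup:

# sketch only - keep any other flags already present in the file
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock"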

The problem is that the kubelet is already stopped, and the apiserver pod is also stopped, so I can't run kubectl edit no <node-name> to update the node information.

@neolit123
Member

The problem is that the kubelet is already stopped, and the apiserver pod is also stopped, so I can't run kubectl edit no to update the node information.

for single control plane clusters this can be a problem, yes. you can log an issue in kubernetes/website about it. the annotation can be safely edited before the kubelet is stopped.

@KeithTt

KeithTt commented Aug 10, 2022

the annotation can be safely edited before the kubelet is stopped.

Thanks a loooooot. Confused for a few days.

@aimcod

aimcod commented Apr 11, 2023

I created this script to migrate from docker to containerd. It works on Oracle Linux and Rocky Linux.
Note the upper-to-lowercase conversion of the hostname, as kubernetes does that.

# drain the node (kubernetes lower-cases the hostname)
kubectl drain $(hostname | tr '[:upper:]' '[:lower:]') --ignore-daemonsets &
echo waiting 30 seconds for node $(hostname) to be drained...
sleep 30

# stop the kubelet and remove docker
sudo systemctl stop kubelet
sudo systemctl disable docker --now
sudo yum remove docker-ce docker-ce-cli -y

# kernel modules and sysctls required by the container runtime
sudo modprobe overlay
sudo modprobe br_netfilter
cat << EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
cat << EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sudo sysctl --system

# install and configure containerd with the systemd cgroup driver
sudo yum install containerd -y
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo sed -i 's/disabled_plugins.*/#disabled_plugins/' /etc/containerd/config.toml
sudo systemctl restart containerd

# point the kubelet at containerd
sudo sed -i.bak 's/KUBELET_KUBEADM_ARGS=\".*/KUBELET_KUBEADM_ARGS=\"--container-runtime=remote  --container-runtime-endpoint=unix:\/\/\/run\/containerd\/containerd.sock\"/' /var/lib/kubelet/kubeadm-flags.env

# update the CRI socket annotation on the Node object
kubectl patch no $(hostname | tr '[:upper:]' '[:lower:]') --patch '{"metadata": {"annotations": {"kubeadm.alpha.kubernetes.io/cri-socket": "unix:///run/containerd/containerd.sock"}}}'

sudo systemctl start kubelet

kubectl uncordon $(hostname | tr '[:upper:]' '[:lower:]')

kubectl get nodes -o wide

This has worked for me these past few days as I work on automating this task, as well as upgrading the entire system to Rocky Linux 9 and kubernetes to the latest version.

Thanks to this post, I think I finally know why my kubernetes upgrades were failing.
Big thanks to @AleksandrNull
