Are GPU-enabled container runnable with containerd runtime? #1239

Closed
Bamfax opened this issue May 27, 2020 · 10 comments

Bamfax commented May 27, 2020

GPU-enabled pods fail to start when using the gpu addon while staying with containerd (i.e. not switching the default runtime to docker). The cuda-vector-add test pod remains in the Pending state and never starts. The logs of the nvidia-device-plugin-daemonset also show errors and mention that the default runtime needs to be changed from containerd to docker.

I would prefer to stay with the containerd runtime and avoid docker-ce as the runtime (nvidia-docker2 depends on docker-ce), also because of other aspects (docker changing iptables rules, which requires a further workaround, see kubernetes/kubernetes#39823 (comment) and #267).

Seeing reports that k3s is able to run gpu-enabled pods using containerd (https://dev.to/mweibel/add-nvidia-gpu-support-to-k3s-with-containerd-4j17), and given that my OS-level containerd is able to run a pod with nvidia-smi, I would prefer to stay with the containerd runtime. Is that somehow possible?
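
(For reference, the approach in the linked k3s article appears to amount to pointing containerd's CRI plugin at nvidia-container-runtime via its config.toml. The fragment below is only a rough sketch in the containerd 1.2/1.3 style; the exact section names, and whether the containerd bundled with microk8s accepts them as-is, are assumptions.)

# Sketch only: have the CRI plugin launch containers through
# nvidia-container-runtime instead of plain runc (containerd 1.2/1.3 keys;
# adjust for the containerd version and config file microk8s actually ships).
[plugins.cri.containerd.default_runtime]
  runtime_type = "io.containerd.runtime.v1.linux"
  runtime_engine = "/usr/bin/nvidia-container-runtime"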

The system on which microk8s is running is Debian Buster 10.4, with NVIDIA drivers from Debian Backports and the NVIDIA docker libraries from nvidia.github.io. Microk8s was installed via snap.

# ldconfig -p | grep cuda
        libicudata.so.65 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libicudata.so.65
        libicudata.so.63 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libicudata.so.63
        libcudart.so.10.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcudart.so.10.1
        libcudart.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcudart.so
        libcuda.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so.1
        libcuda.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so
# microk8s enable gpu
Enabling NVIDIA GPU
NVIDIA kernel module detected
dns is already enabled
Applying manifest
daemonset.apps/nvidia-device-plugin-daemonset created
NVIDIA is enabled
# microk8s status
microk8s is running
addons:
dashboard: enabled
dns: enabled
gpu: enabled
metallb: enabled
rbac: enabled
registry: enabled
storage: enabled
cilium: disabled
[...]
# cat cuda-vector-add_test.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1
# kubectl create -f cuda-vector-add_test.yaml
pod/cuda-vector-add created
# kubectl get all -A | grep cuda
default              pod/cuda-vector-add                                   0/1     Pending   0          47s
# kubectl describe pod/cuda-vector-add
Name:         cuda-vector-add
Namespace:    default
Priority:     0
Node:         <none>
Labels:       <none>
Annotations:  <none>
Status:       Pending
IP:
IPs:          <none>
Containers:
  cuda-vector-add:
    Image:      k8s.gcr.io/cuda-vector-add:v0.1
    Port:       <none>
    Host Port:  <none>
    Limits:
      nvidia.com/gpu:  1
    Requests:
      nvidia.com/gpu:  1
    Environment:       <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s9mfz (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-s9mfz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-s9mfz
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  2s (x5 over 4m4s)  default-scheduler  0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
# kubectl -n kube-system logs pod/nvidia-device-plugin-daemonset-slqpv
2020/05/29 07:59:50 Loading NVML
2020/05/29 07:59:50 Failed to initialize NVML: could not load NVML library.
2020/05/29 07:59:50 If this is a GPU node, did you set the docker default runtime to `nvidia`?
2020/05/29 07:59:50 You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites
2020/05/29 07:59:50 You can learn how to set the runtime at: https://github.com/NVIDIA/k8s-device-plugin#quick-start
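
(Side note: the FailedScheduling event above means the node never advertised any nvidia.com/gpu capacity, which the device plugin would normally register with the kubelet once NVML loads. A quick way to see what the node actually advertises, assuming a single-node cluster:)

# Inspect the node's Capacity/Allocatable; with a healthy device plugin
# both should list nvidia.com/gpu: 1.
kubectl describe node | grep -A 8 -E 'Capacity|Allocatable'
kubectl get nodes -o jsonpath='{.items[*].status.allocatable}'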

Looking further into where this may be rooted, I used microk8s.ctr to try to start the pod directly and compared that with another ctr/containerd. "microk8s.ctr", using the containerd runtime "nvidia-container-runtime", throws a libnvidia-container.so.1 error. By contrast, everything works fine when doing the same with "ctr" directly (a different ctr/containerd outside microk8s; the docker 1.2.5 deb is used there).

# microk8s ctr run --rm --gpus 0 docker.io/nvidia/cuda:9.0-base nvidia-smi nvidia-smi
ctr: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"process_linux.go:385: running prestart hook 0 caused \\\"error running hook: exit status 127, stdout: , stderr: /usr/bin/nvidia-container-cli: relocation error: /usr/bin/nvidia-container-cli: symbol nvc_device_mig_caps_mount version NVC_1.0 not defined in file libnvidia-container.so.1 with link time reference\\\\n\\\"\"": unknown
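
(For what it's worth, a relocation error like this usually means nvidia-container-cli expects a newer libnvidia-container.so.1 than the one the loader picks up; nvc_device_mig_caps_mount only exists in later libnvidia-container releases. A hedged way to check for a CLI/library version mismatch on the host:)

# Compare the CLI and library versions; a CLI built against a newer
# libnvidia-container than the installed .so.1 produces exactly this
# "symbol ... not defined" relocation error.
nvidia-container-cli --version
dpkg -l | grep -E 'libnvidia-container|nvidia-container-'
ldconfig -p | grep libnvidia-container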

Is the above error related to the issue mentioned at https://github.com/NVIDIA/k8s-device-plugin#prerequisites (quoted below), or would gpu-enabled pods be runnable in the given setup?

Note that you need to install the nvidia-docker2 package and not the nvidia-container-toolkit. This is because the new --gpus options hasn't reached kubernetes yet. You will need to enable the nvidia runtime as your default runtime on your node.
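
(For completeness, the "default runtime" change referred to in that quote is a Docker-side setting and only applies when dockerd is the container runtime; staying on containerd is exactly what avoids it. The usual /etc/docker/daemon.json documented for nvidia-docker2 looks roughly like this, shown here only for comparison:)

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}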

The pod starts fine using the non-microk8s ctr/containerd:

# ctr run --rm --gpus 0 docker.io/nvidia/cuda:9.0-base nvidia-smi nvidia-smi
Wed May 27 13:25:29 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    On   | 00000000:07:00.0 Off |                  N/A |
|  0%   44C    P8    18W / 160W |      0MiB /  5934MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
# cat /etc/apt/sources.list.d/nvidia-docker.list
deb https://nvidia.github.io/libnvidia-container/debian10/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/debian10/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/debian10/$(ARCH) /
# apt show libcuda1
Package: libcuda1
Version: 440.82-1~bpo10+1
Priority: optional
Section: non-free/libs
Source: nvidia-graphics-drivers
Maintainer: Debian NVIDIA Maintainers <[email protected]>
Installed-Size: 17.0 MB
Provides: libcuda-10.0-1, libcuda-10.1-1, libcuda-10.2-1, libcuda-5.0-1, libcuda-5.5-1, libcuda-6.0-1, libcuda-6.5-1, libcuda-7.0-1, libcuda-7.5-1, libcuda-8.0-1, libcuda-9.0-1, libcuda-9.1-1, libcuda-9.2-1, libcuda.so.1 (= 440.82), libcuda1-any
Pre-Depends: nvidia-legacy-check (>= 396)
Depends: nvidia-support, nvidia-alternative (= 440.82-1~bpo10+1), libnvidia-fatbinaryloader (= 440.82-1~bpo10+1), libc6 (>= 2.7)
Recommends: nvidia-kernel-dkms (= 440.82-1~bpo10+1) | nvidia-kernel-440.82, nvidia-smi, libnvidia-cfg1 (= 440.82-1~bpo10+1), nvidia-persistenced, libcuda1-i386 (= 440.82-1~bpo10+1)
Suggests: nvidia-cuda-mps, nvidia-kernel-dkms (>= 440.82) | nvidia-kernel-source (>= 440.82)
Homepage: https://www.nvidia.com/CUDA
Download-Size: 2,295 kB
APT-Manual-Installed: yes
APT-Sources: http://deb.debian.org/debian buster-backports/non-free amd64 Packages
Description: NVIDIA CUDA Driver Library
# apt show nvidia-container-runtime
Package: nvidia-container-runtime
Version: 3.2.0-1
Priority: optional
Section: utils
Maintainer: NVIDIA CORPORATION <[email protected]>
Installed-Size: 2,021 kB
Depends: nvidia-container-toolkit (>= 1.1.0), nvidia-container-toolkit (<< 2.0.0), libseccomp2
Homepage: https://github.com/NVIDIA/nvidia-container-runtime/wiki
Download-Size: 612 kB
APT-Manual-Installed: yes
APT-Sources: https://nvidia.github.io/nvidia-container-runtime/debian10/amd64  Packages
Description: NVIDIA container runtime
 Provides a modified version of runc allowing users to run GPU enabled
 containers.
Bamfax (Author) commented May 27, 2020

Modifying the relevant files in /var/snap/microk8s/current/args/ (ctr, kubelet, containerd and containerd.toml) to use the containerd from the host (for the sake of testing, containerd was switched to 1.3.4) does not help; the error keeps appearing.

# kubectl get nodes -o wide
NAME    STATUS     ROLES    AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION   CONTAINER-RUNTIME
name    NotReady   <none>   4d8h   v1.18.2   01.02.03.04   <none>        Debian GNU/Linux 10 (buster)   4.19.0-9-amd64   containerd://1.3.4
# microk8s ctr run --rm --gpus 0 docker.io/nvidia/cuda:9.0-base nvidia-smi nvidia-smi
ctr: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 127, stdout: , stderr: /usr/bin/nvidia-container-cli: relocation error: /usr/bin/nvidia-container-cli: symbol nvc_device_mig_caps_mount version NVC_1.0 not defined in file libnvidia-container.so.1 with link time reference\\\\n\\\"\"": unknown
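
(A simple sanity check, not from the thread but perhaps useful here, is to compare the containerd client/server that microk8s actually talks to with the host-installed one; the host socket path below is the usual default and may differ:)

# Version reported through the snap-bundled wrapper (client and server).
microk8s ctr version
# Version of the host containerd, addressed via its default socket.
ctr --address /run/containerd/containerd.sock version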

balchua (Collaborator) commented May 28, 2020

@Bamfax you can enable gpu in microk8s by simply executing microk8s enable gpu. It should make all the necessary changes and then deploy the nvidia daemonset.
After that you can run GPU-enabled pods.

Bamfax (Author) commented May 29, 2020

@balchua sorry if I did not point that out at the beginning: I have not been able to get gpu-enabled containers running at all so far (using containerd). Enabling the gpu addon seems to work fine and the kernel modules are detected, but when trying to run the cuda-vector-add test container it remains in the Pending state with "Insufficient nvidia.com/gpu".

I have added more detail to the first post in this issue.

ktsakalozos (Member) commented:

Could you share the manifest of the cuda-vector-add test container?

We use the manifest in [1]; is it possible that the "limits" are not the ones we have?

[1] https://github.com/ubuntu/microk8s/blob/feature/ha-dqlite/tests/templates/cuda-add.yaml

Bamfax (Author) commented May 29, 2020

Thanks. I added the missing details and the manifest to the first post. The manifest should have been identical; I retried using [1] to be sure and it gives the same result.

Bamfax changed the title from "GPU-enabled containers start using ctr directly, but error when using microk8s.ctr" to "Are GPU-enabled container runnable with containerd runtime?" on May 29, 2020
balchua (Collaborator) commented May 29, 2020

I tested this when I was upgrading containerd to 1.3 and it was spinning up the pod in [1].

@balchua
Copy link
Collaborator

balchua commented May 29, 2020

Bamfax (Author) commented May 29, 2020

@balchua the library conflict was spot on, many thanks. I purged the four packages mentioned and reinstalled microk8s, enabling just the gpu addon. Checking the /var/snap/microk8s/current/args/kubelet config, it is set to use the remote containerd sock. Happy to confirm that containerd is indeed working fine. Many thanks for the help.

[...default install...]

# microk8s enable gpu
Enabling NVIDIA GPU
NVIDIA kernel module detected
Enabling DNS
Applying manifest
serviceaccount/coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
clusterrole.rbac.authorization.k8s.io/coredns created
clusterrolebinding.rbac.authorization.k8s.io/coredns created
Restarting kubelet
DNS is enabled
Applying manifest
daemonset.apps/nvidia-device-plugin-daemonset created
NVIDIA is enabled
# kubectl -n kube-system logs pod/nvidia-device-plugin-daemonset-d4mmv
2020/05/29 17:08:34 Loading NVML
2020/05/29 17:08:34 Fetching devices.
2020/05/29 17:08:34 Starting FS watcher.
2020/05/29 17:08:34 Starting OS watcher.
2020/05/29 17:08:34 Starting to serve on /var/lib/kubelet/device-plugins/nvidia.sock
2020/05/29 17:08:34 Registered device plugin with Kubelet
# kubectl apply -f cuda-vector-add_test.yaml
pod/cuda-vector-add created
# kubectl delete pod cuda-vector-add
# kubectl run cuda-vector-add --image "k8s.gcr.io/cuda-vector-add:v0.1" --tty -i
If you don't see a command prompt, try pressing enter.
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
# kubectl run cuda-base --image "docker.io/nvidia/cuda:latest" --tty -i
If you don't see a command prompt, try pressing enter.
root@cuda-base:/# nvidia-smi
Fri May 29 18:02:57 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    On   | 00000000:07:00.0 Off |                  N/A |
|  0%   35C    P8    13W / 160W |      0MiB /  5934MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Also, when trying to use "microk8s ctr" again as above, it does not find containerd in the PATH. So somehow my native-OS containerd must have been picked up before.

Afterwards, to reproduce the problem, I reinstalled each package one by one on the host and re-ran the cuda container after each install:

  1. apt install libnvidia-container1: all fine
  2. apt install libnvidia-container-tools: all fine
  3. apt install nvidia-container-toolkit: cuda container fails as below:
# kubectl run --rm cuda-base --image docker.io/nvidia/cuda:latest -t -i
If you don't see a command prompt, try pressing enter.
root@cuda-base:/# nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.

The cuda container starts working again as soon as nvidia-container-toolkit is uninstalled from the base OS.
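
(The exact interaction is not spelled out in this thread, but based on the reproduction above, a hedged way for others hitting this to find and remove the conflicting host-side package is shown below; whether microk8s needs a restart afterwards is an assumption.)

# List the NVIDIA container packages installed on the host, then remove the
# one that reintroduces the failure (nvidia-container-toolkit in this case)
# and restart microk8s so it picks up the change.
dpkg -l | grep -E 'nvidia-container|libnvidia-container'
apt purge nvidia-container-toolkit
microk8s stop && microk8s start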

Again, many thanks for the help!

balchua (Collaborator) commented May 29, 2020

Checking the /var/snap/microk8s/current/args/kubelet config, it is set to use the remote containerd sock.

Does it mean you are not using the containerd that comes with microk8s?
Sorry, I am a bit confused. 😁

Bamfax (Author) commented May 29, 2020

Microk8s is in its default install now, all standard. It was installed as described in the previous post, with the standard install command:
snap install microk8s --classic --channel=1.18/stable

My /var/snap/microk8s/current/args/kubelet then has this config; that is what I meant by "set to use the remote containerd sock":

--container-runtime=remote
--container-runtime-endpoint=${SNAP_COMMON}/run/containerd.sock
--containerd=${SNAP_COMMON}/run/containerd.sock
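
(For anyone verifying the same thing: ${SNAP_COMMON} for the microk8s snap normally resolves to /var/snap/microk8s/common, which is the standard snap layout rather than something stated in this thread, so the socket kubelet points at can be checked directly:)

# Confirm the snap-bundled containerd socket referenced by the kubelet args
# actually exists (path assumes the standard SNAP_COMMON location).
ls -l /var/snap/microk8s/common/run/containerd.sock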
