nvidia-container-cli: initialization error on Ubuntu22.04LTS #250
Comments
Question: How are Docker and Docker Compose installed? We have seen strange behaviour when these are installed using snaps. Judging by the error message, it seems the NVIDIA Container CLI cannot load the NVML library. |
Thanks for your reply. I installed Docker Engine by following the steps from the Docker docs. After that, I installed Docker Desktop by following the instructions here. I didn't use snaps. |
Sorry @Lonitch, I thought I had asked, but does running the container without
|
@elezar Thanks for your reply. It does not work unfortunately. Still gives |
I just reinstalled Ubuntu 20.04 LTS on the machine, and the same error still occurs. I wonder if it has something to do with my dual graphics cards (2 Nvidia RTX). |
@Lonitch have you tried installing drivers by issuing This worked for me. |
@Lonitch since docker complains with:
What are the contents of your Note that this configuration is irrespective of the driver or the GPUs that you have installed. |
@elezar Maybe I can help give context here, I'm seeing the same issue here, both on Ubuntu and Arch. The cause is rooted somewhere between Docker Desktop and the Nvidia runtime. On fresh installs of Arch and Ubuntu (20.04 and 22.04 LTS), following the documentation for Docker/Docker Desktop/CUDA/NCT consistently results in the error originally posted here, but only when using the Steps to reproduce:
At this point, there are two docker contexts installed.
Any GPU-related image only succeeds if you
I'm at my wit's end with this problem. I have probably spent 40 hours the past week trying to solve this specific issue, including countless full re-installations of multiple distros. If you'd like some more info, I can provide it. I'm considering opening a new ticket since I have exact STR and it clearly extends beyond the scope of the original post. |
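For readers following along, here is a minimal sketch of one way to see which Docker context is active. It is not taken from the comment above. It assumes that `docker context ls` marks the active context with a `*` in the second column and that Docker Desktop contexts use a `desktop-` prefix (for example `desktop-linux`). `current_context` and `is_desktop_context` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: read `docker context ls` table output on stdin and
# print the name of the row marked with "*" (assumed to be the active one).
current_context() {
    awk '$2 == "*" { print $1 }'
}

# Hypothetical helper: succeed when the name looks like a Docker Desktop
# context. The "desktop-" prefix is an assumption, not a documented rule.
is_desktop_context() {
    case "$1" in
        desktop-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only talk to Docker when the CLI is installed.
if command -v docker >/dev/null 2>&1; then
    current=$(docker context ls 2>/dev/null | current_context || true)
    echo "current context: ${current:-unknown}"
    if is_desktop_context "$current"; then
        echo "Docker Desktop context is active; to switch to Docker Engine run:"
        echo "  docker context use default"
    fi
fi
```

The switch itself is only printed, not executed, so the snippet does not change any configuration.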
Also having this issue. Can confirm that it seems to be an issue with docker desktop and not the docker ce install |
Encountered the same issue for days on Ubuntu 18.04.6 LTS when running a docker container with GPUs, e.g. and the error shows The GPU is an NVIDIA GeForce RTX 3090 with persistence mode enabled. Referred to some suggestions from
, but still have no positive results. Was wondering if you have any other insights? |
@allisontw are you also using Docker Desktop? This is not currently supported on Linux. |
Thanks @elezar for your reply and hint! In my case, Docker Desktop does not seem to be installed. Instead, the Docker Engine packages/configuration were somehow broken: I discovered that the versions of the Docker server and Docker client were inconsistent. I'm not sure my assumption is correct, but in the past Docker and NVIDIA GPUs worked correctly for me when the server and client versions were the same, so I removed the Docker-related packages completely, installed them again, and the issue was solved. I hope this solution helps others as well. |
Thanks. I reinstalled Docker completely, and it works. |
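The client/server comparison described above could be sketched roughly as follows. This is an illustration, not the commenter's actual procedure. `versions_match` is a hypothetical helper, and the `--format` templates rely on the `Client.Version` and `Server.Version` fields of `docker version`.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when both version strings are
# non-empty and identical.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Only query Docker when the CLI is installed.
if command -v docker >/dev/null 2>&1; then
    # The server query fails when the daemon is unreachable; keep going anyway.
    client=$(docker version --format '{{.Client.Version}}' 2>/dev/null || true)
    server=$(docker version --format '{{.Server.Version}}' 2>/dev/null || true)
    if versions_match "$client" "$server"; then
        echo "client and server versions match: $client"
    else
        echo "mismatch or daemon unreachable: client='$client' server='$server'"
    fi
fi
```

A mismatch is not necessarily an error in general; here it is only the symptom this commenter reported before reinstalling.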
Could you maybe elaborate a bit about the exact steps you took? Thanks! |
Hi there,
I recently wanted to build containers that can run GUI applications. My Dockerfile and docker-compose.yml work well in WSL2, but I ran into problems when building the same container in Ubuntu 22.04 LTS. My Dockerfile looks like the following:
And docker-compose.yml looks like
When I run docker compose up, the following error pops up:
--------Steps I've taken so far--------
My theory was that something was wrong with the Nvidia runtime, so I added
before command in the docker-compose.yml above, and when I ran docker compose up again, I got the following error
Next, I followed the steps listed here to add the runtime using
which shows the following outputs:
I tried to add the runtime using a systemd drop-in file, but the error persists even after I reboot the machine.
nvidia-container-cli -k -d /dev/tty info
nvidia-smi -a
docker version
dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
nvidia-container-cli -V
Any comments on this info? Thank you!
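As a convenience, the diagnostic commands listed above could be collected in one pass with a small script along these lines. This is only a sketch. `run_diag` and `collect_diagnostics` are hypothetical names, and the commands themselves are copied from the list in this post.

```shell
#!/bin/sh
# Hypothetical helper: print a header, then the command's output, or a note
# when the executable is not installed.
run_diag() {
    printf '=== %s ===\n' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(exit status %s)\n' "$?"
    else
        printf '(%s not found)\n' "$1"
    fi
}

# Hypothetical wrapper that runs the commands from the list above in order.
collect_diagnostics() {
    run_diag nvidia-container-cli -k -d /dev/tty info
    run_diag nvidia-smi -a
    run_diag docker version
    # Use whichever package manager is present.
    if command -v dpkg >/dev/null 2>&1; then
        run_diag dpkg -l '*nvidia*'
    else
        run_diag rpm -qa '*nvidia*'
    fi
    run_diag nvidia-container-cli -V
}

# Usage (run manually): collect_diagnostics > nvidia-diag.txt
```

Nothing runs until `collect_diagnostics` is called, so the functions can be sourced and inspected first.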