docker container does not run on GPU #7324

Open
DiTo97 opened this issue Aug 31, 2024 · 4 comments
Labels
🪳 bug Something isn't working 🔺 re_renderer affects re_renderer itself

Comments

@DiTo97

DiTo97 commented Aug 31, 2024

Describe the bug
we have built a docker container by pulling from the official NVIDIA repository on Docker Hub:

docker pull nvidia/cuda:11.8.0-devel-ubuntu22.04

and ran the procedure to install rerun following your documentation.

rerun works (we ran a few examples), but it detects only the CPU as the rasterizing device. The container itself sees the GPU (NVIDIA A10) without trouble: other workloads in that container run on the GPU with no issues at all when invoked.

we have also run rerun on the host machine (AWS EC2) outside the container, and it works on the GPU just fine.
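
One possible cause worth ruling out for this symptom, not confirmed anywhere in this thread: the NVIDIA container runtime decides which driver libraries to mount from the `NVIDIA_DRIVER_CAPABILITIES` environment variable, and the `nvidia/cuda` images set it to `compute,utility`. That is enough for CUDA workloads but does not include the `graphics` capability that Vulkan and OpenGL rendering rely on. The sketch below is only a diagnostic aid to run inside the container; the function name `graphics_capability_enabled` is made up for illustration.

```python
import os


def graphics_capability_enabled(capabilities: str) -> bool:
    """Return True if the capability list would expose graphics (Vulkan/OpenGL) libraries.

    `capabilities` is the comma-separated value of NVIDIA_DRIVER_CAPABILITIES.
    An empty value counts as "no graphics", since the runtime then falls back
    to its compute/utility default.
    """
    requested = {c.strip().lower() for c in capabilities.split(",") if c.strip()}
    return "all" in requested or "graphics" in requested


if __name__ == "__main__":
    caps = os.environ.get("NVIDIA_DRIVER_CAPABILITIES", "")
    print(f"NVIDIA_DRIVER_CAPABILITIES={caps!r}")
    if graphics_capability_enabled(caps):
        print("graphics capability is requested")
    else:
        print("graphics capability is NOT requested; Vulkan may fall back to a CPU device")
```

If the check reports that graphics is not requested, one thing to try is starting the container with `docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all <image>`, where `<image>` stands for your own image. Whether this resolves the issue depends on the host driver and container runtime setup.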

@DiTo97 DiTo97 added 👀 needs triage This issue needs to be triaged by the Rerun team 🪳 bug Something isn't working labels Aug 31, 2024
@jleibs
Member

jleibs commented Aug 31, 2024

Can you see if the instructions from:

address your issue?

@emilk emilk added 🔺 re_renderer affects re_renderer itself and removed 👀 needs triage This issue needs to be triaged by the Rerun team labels Sep 1, 2024
@DiTo97
Author

DiTo97 commented Sep 2, 2024

@jleibs, I tried, but could not see any real difference.

@DiTo97
Author

DiTo97 commented Sep 4, 2024

@jleibs, what's the roadmap for official docker support?

@jleibs
Member

jleibs commented Sep 4, 2024

> @jleibs, what's the roadmap for official docker support?

This is not an item on our roadmap at the moment.

Docker, unfortunately, creates a combinatorial explosion of complex host/container environments. In my experience, 95% of the issues we have seen were related to the configuration of Docker or the host environment rather than an actual issue with the Rerun viewer.

We do know that Docker deployments can be made to work.

The instructions from #6835 continue to work for me.

If it is helpful to your debugging, I am running on Arch Linux, using Wayland via Hyprland, with an RTX 4070.

>  nvidia-smi 
Wed Sep  4 17:22:59 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+

My docker and nvidia-container package versions are:

docker 1:27.2.0-1
libnvidia-container 1.16.1-1

If you have followed the linked instructions verbatim, I would guess the issue has something to do with the host NVIDIA driver version, or with NVIDIA A10 support in your driver or container runtime version.

> we have other workloads in that container that run on the GPU

Can you confirm that these are Vulkan applications that render to a display? My go-to test application is typically vkcube. If it finds your GPU but Rerun doesn't, that might help narrow down the nature of the problem.
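
Alongside the vkcube comparison, it can help to list which devices Vulkan itself reports inside the container. The following is a minimal sketch, assuming `vulkaninfo` (shipped with the vulkan-tools package) is installed there and supports `--summary`. The helper names `parse_vulkan_devices` and `only_cpu_devices` are hypothetical, and the parsing relies on the tool's plain-text summary layout (`GPU0:` headers followed by `key = value` lines), which may differ between versions.

```python
import re
import shutil
import subprocess


def parse_vulkan_devices(summary: str) -> list[dict[str, str]]:
    """Extract deviceName/deviceType pairs from `vulkaninfo --summary` text."""
    devices: list[dict[str, str]] = []
    for line in summary.splitlines():
        if re.match(r"^\s*GPU\d+:", line):
            # A "GPU<n>:" header starts a new device record.
            devices.append({})
        elif "=" in line and devices:
            key, value = (part.strip() for part in line.split("=", 1))
            if key in ("deviceName", "deviceType"):
                devices[-1][key] = value
    return devices


def only_cpu_devices(devices: list[dict[str, str]]) -> bool:
    """True if at least one device was found and all of them are CPU (software) devices."""
    return bool(devices) and all(
        d.get("deviceType") == "PHYSICAL_DEVICE_TYPE_CPU" for d in devices
    )


def main() -> None:
    if shutil.which("vulkaninfo") is None:
        print("vulkaninfo not found; install vulkan-tools inside the container first")
        return
    result = subprocess.run(["vulkaninfo", "--summary"], capture_output=True, text=True)
    devices = parse_vulkan_devices(result.stdout)
    for device in devices:
        print(device.get("deviceName", "?"), "->", device.get("deviceType", "?"))
    if not devices:
        print("No Vulkan devices reported")
    elif only_cpu_devices(devices):
        print("Only a CPU (software) Vulkan device is visible")


if __name__ == "__main__":
    main()
```

If only a software device such as llvmpipe is listed, the NVIDIA Vulkan driver is probably not visible inside the container, which would point at the container or host configuration rather than at Rerun.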

All that said, if you are unable to get Rerun to work inside the docker container, I would recommend using the viewer on one of the many supported native environments, or via the web, and just remotely accessing the data from the docker container without using that environment to do the rendering itself.

3 participants