Unable to install nvdiffrast on GeForce RTX 3090. #56
Here is the command line output when running ./run_sample.sh --build-container:
If I use the command pip install ., nvdiffrast can be installed, but an error will be reported when executing glctx = dr.RasterizeGLContext(device=device):
Target architecture I think you have three options here:
1) The best option is to use our latest Dockerfile as-is. It uses a base image with up-to-date PyTorch and CUDA versions that support the latest GPUs.
2) The second option is to install EGL in your host environment. You should use our Dockerfile as a reference on how to do that, as it is not trivial to get a working setup.
3) If everything else fails, you can also modify the plugin compilation function
On the other hand, it appears that since version 1.8.0, PyTorch attempts to clamp the architecture to what the installed CUDA toolkit supports (as seen here). Therefore PyTorch 1.8.0 with CUDA 11.1 should in theory work, compiling to architecture So either that clamping logic fails somehow, or there is some other issue preventing the compilation from succeeding. Setting
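To make the clamping idea concrete, here is a minimal sketch in plain Python. It is not PyTorch's actual logic: the function names and the toolkit table are assumptions made for illustration (the table lists the highest compute capability each CUDA toolkit release can target, restricted to the versions mentioned in this thread). The RTX 3090 reports compute capability 8.6 and the RTX 2080 Ti reports 7.5.

```python
# Illustrative only: a simplified model of "clamp the target architecture to
# what the installed CUDA toolkit supports". This is NOT PyTorch's real code.

# Assumed table: highest compute capability each toolkit can compile for.
MAX_ARCH_BY_TOOLKIT = {
    (10, 2): (7, 5),
    (11, 0): (8, 0),
    (11, 1): (8, 6),
    (11, 2): (8, 6),
}


def clamp_arch(device_capability, toolkit_version):
    """Return the architecture to compile for, never above the toolkit maximum."""
    max_arch = MAX_ARCH_BY_TOOLKIT[toolkit_version]
    # Tuples compare element by element, so min() picks the lower capability.
    return min(device_capability, max_arch)


def arch_name(capability):
    """Format a (major, minor) capability as an sm_XY string."""
    major, minor = capability
    return f"sm_{major}{minor}"


if __name__ == "__main__":
    rtx_3090 = (8, 6)
    for toolkit in [(11, 0), (11, 1)]:
        target = clamp_arch(rtx_3090, toolkit)
        print(f"CUDA {toolkit[0]}.{toolkit[1]} -> {arch_name(target)}")
```

On a real machine, the two inputs would come from `torch.cuda.get_device_capability()` and `torch.version.cuda` rather than from hard-coded tuples.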
Nvdiffrast requires an OpenGL device for executing the rasterization op, and EGL is required to get an OpenGL context, i.e., to get access to the graphics pipeline of the GPU. The EGL initialization failure suggests that the OpenGL configuration is somehow not functional in your cluster environment. This could perhaps be an issue with permissions, but I don't think that should result in EGL initialization failure. Thus it's probably related to some other part of the cluster configuration, and likely not something you can fix without going through the cluster management. Maybe there are some OS-level NVIDIA drivers missing on the cluster machine?
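A quick first check for this kind of problem is to see whether the dynamic loader can find and load the EGL and OpenGL libraries at all. The sketch below uses only the Python standard library; the helper name `probe_libraries` is hypothetical. A successful load does not prove that EGL initialization will succeed, but a missing library is a strong hint that the driver's OpenGL components are not installed.

```python
import ctypes
import ctypes.util


def probe_libraries(names=("EGL", "GL")):
    """Report whether each shared library can be located and loaded."""
    report = {}
    for name in names:
        # find_library returns a file name or path, or None if nothing is found.
        path = ctypes.util.find_library(name)
        loaded = False
        if path is not None:
            try:
                ctypes.CDLL(path)
                loaded = True
            except OSError:
                # The library was located but could not be loaded.
                loaded = False
        report[name] = {"path": path, "loaded": loaded}
    return report


if __name__ == "__main__":
    for name, info in probe_libraries().items():
        print(name, info)
```

Running this inside the same container or cluster job that fails shows what that environment actually sees, which may differ from the login node or the host.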
This was indeed caused by the NVIDIA driver, which had previously been installed with the argument -no-opengl-files.
Environment:
CUDA 11.2
I have tried pytorch:1.8.0-cuda11.1-cudnn8 and pytorch:1.7.1-cuda11.0-cudnn8, but both failed.
I used the provided Dockerfile, changing only the PyTorch and CUDA versions.
The command bash ./run_sample.sh --build-container (or docker build -f docker/Dockerfile -t name:tagname .) executes successfully, but nvdiffrast is still not installed afterwards (import nvdiffrast.torch raises ModuleNotFoundError: No module named 'nvdiffrast.torch').
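When an import fails like this, it can help to confirm which Python interpreter is running and whether that interpreter can see the package at all. The following is a minimal diagnostic sketch using only the standard library; the helper name `is_importable` is hypothetical and not part of nvdiffrast.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate the given module."""
    try:
        # For dotted names, find_spec imports the parent package first and
        # raises ModuleNotFoundError if the parent itself is missing.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for name in ("torch", "nvdiffrast", "nvdiffrast.torch"):
        print(name, "->", is_importable(name))
```

Running it both on the host and inside the container makes it clear in which environment the package was actually installed.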
I have successfully installed nvdiffrast with the same steps on a 2080 Ti GPU with CUDA 10.2, but it fails on a 3090 GPU with CUDA 11.2.
Does anyone know how to install nvdiffrast on a 3090 GPU? Thanks.