Docs: question about how Multi-GPU is intended to be used #11
Update: I have checked that torch with multi-GPU is configured and running correctly by running this tutorial notebook locally: https://colab.research.google.com/github/pytorch/tutorials/blob/gh-pages/_downloads/63ecfdf27b96977f3a89015f60065a2b/data_parallel_tutorial.ipynb
I think this is the line to blame.
It won't run multi-GPU inference if 3D inference has previously been run without the option checked. Can you try running with multi-GPU as the first thing after launching the 3D inference widget? That worked for me. I've successfully run inference on a system with 4 GPUs before, but performance can be unstable. Multi-GPU inference typically only makes sense for large volumes, because creating the process group adds significant one-time initialization overhead.
Ok, I'll try that next week when I next have time booked on that machine.
How large is large, to you?
Multi-GPU is much more stable now. See the note in https://empanada.readthedocs.io/en/latest/plugin/best-practice.html#inference-best-practices for usage advice.
I have a multi-GPU workstation, but have found that only one GPU is being used when I select both "Use GPU" and "Multi GPU" for 3D inference.
My question is: is there anything specific the user needs to do to run multi-GPU jobs? Other than loading a large 3D image into napari, then starting empanada-napari 3D inference with "Use GPU" & "Multi GPU" selected (and possibly also giving it a zarr directory)?
Torch is able to access multiple GPUs within my environment:
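(A minimal version of that check, assuming torch is installed; the original output was included as a screenshot:)

```python
# Confirm torch itself can see every GPU before suspecting empanada.
import torch

print(torch.cuda.is_available())
print(torch.cuda.device_count())  # should report all GPUs on the workstation
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
```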
But the NVIDIA GPU utilization graph shows only one active.
The only information I can find in the docs just says that this is an experimental feature:
I'm not very familiar with the code in multigpu.py. I could look at that a bit more to try and work it out. I could also double check torch is working well with some less complex multi-GPU examples.
Thanks for any advice you might have here