-
Notifications
You must be signed in to change notification settings - Fork 862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can we remove CUDA dependencies from nccl_net.h? #310
Comments
Indeed we could probably split them into two parts. Alternatively, since the API is versioned, you can copy the definitions from nccl_net.h and nccl.h in your project, and remove what you don't need. The definitions won't change in the version you choose to implement. When you want to support a new version, you can update your includes to the new definitions (or even add a new version of the struct along the other if you want to support multiple versions). |
This allows us to remove CUDA dependencies from nccl_net as described in NVIDIA#310.
I sent a pull request implementing a version of this, let me know if this works for you. Copying the file into our projects works, but at Google we prefer to track upstream repositories as much as possible, so if you think that this PR (or something similar, open to suggestions!) works, we'd prefer to get that upstreamed. |
Thanks. The patch looks good to me. We're pretty busy on other things right now so it might take us a week or two before we can look into this; let me know if you need this merged quicker than that. Also don't hesitate to ping us again if nothing happened in two weeks from now. |
That's alright, I can keep the work on my side. Should I also send PRs for the 2.6 branch? |
It would be indeed better to base your patches on 2.6. But hopefully, given those parts didn't change much in 2.6, the rebase should be easy. |
This allows us to remove CUDA dependencies from nccl_net as described in NVIDIA#310.
Currently
nccl_net.h
pullsnccl.h
only for using ncclResult_t, as far I understand. However, nccl.h pulls in <cuda_runtime.h> and <cuda_fp16.h>, which means that plugins can only be built in machines with the CUDA headers available.We could remove this dependency by splitting
nccl.h.in
into two:types.h.in
andnccl.h
. The first would contain the general type definitions (everything above the collective comms ops), and the second would include the first and contain the rest of the file. With this split, onlynccl.h
needs the CUDA headers.This way we can make the .in file smaller, separate API from types, and allow
nccl_net.h
to include onlytypes.h
without pulling cuda headers in. What do y'all think?The text was updated successfully, but these errors were encountered: