Patch Thrust to workaround CUDA_CUB_RET_IF_FAIL macro clearing CUDA errors #6098
Please update the changelog in order to start CI tests. View the gpuCI docs here.
The same change has been posted to Thrust as NVIDIA/thrust#1264.
Thanks @jlowe!
Codecov Report
@@             Coverage Diff              @@
##           branch-0.15    #6098   +/-   ##
===============================================
- Coverage        84.52%   84.51%   -0.02%
===============================================
  Files               82       82
  Lines            13835    13835
===============================================
- Hits             11694    11692       -2
- Misses            2141     2143       +2
This updates the libcudf build to patch Thrust after it is fetched. There's a bug in the `CUDA_CUB_RET_IF_FAIL` macro that can cause errors to be cleared when it is called with a `cudaPeekAtLastError()` argument, which is very common in the Thrust codebase. `cub::Debug` will unconditionally clear the last CUDA error, so when the macro argument gets evaluated twice, the second evaluation finds no error because the first pass already cleared it. The macro then returns success, preventing the application from seeing the CUDA error.
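The double evaluation described above can be sketched in plain C++ without a GPU. This is only an illustration under stated assumptions: `FakeError`, `fake_peek_at_last_error`, `fake_get_last_error`, `fake_debug`, and both `RET_IF_FAIL_*` macros are hypothetical stand-ins that mimic the behavior described in this PR, not the actual Thrust, CUB, or CUDA runtime definitions, and the "fixed" variant shows one plausible remedy (evaluate the argument once) rather than the exact patch applied here.

```cpp
// Hypothetical stand-ins for the CUDA runtime's sticky "last error" state.
enum FakeError { kSuccess = 0, kLaunchFailure = 1 };
static FakeError g_last_error = kSuccess;

// Mimics a peek: reads the last error without clearing it.
FakeError fake_peek_at_last_error() { return g_last_error; }

// Mimics a get: reads the last error and resets it to success.
FakeError fake_get_last_error() {
  FakeError e = g_last_error;
  g_last_error = kSuccess;
  return e;
}

// Stand-in for the debug helper: unconditionally clears the last error,
// then reports whether the status it was handed is a failure.
bool fake_debug(FakeError e) {
  fake_get_last_error();
  return e != kSuccess;
}

// Buggy shape: the argument `e` is expanded, and therefore evaluated, twice.
#define RET_IF_FAIL_BUGGY(e) \
  if (fake_debug((e))) return e;

// One possible fix (an assumption): evaluate the argument once and reuse it.
#define RET_IF_FAIL_FIXED(e)                  \
  {                                           \
    FakeError const status_ = (e);            \
    if (fake_debug(status_)) return status_;  \
  }

FakeError run_buggy() {
  g_last_error = kLaunchFailure;  // pretend a kernel launch just failed
  RET_IF_FAIL_BUGGY(fake_peek_at_last_error());
  return kSuccess;
}

FakeError run_fixed() {
  g_last_error = kLaunchFailure;  // same simulated failure
  RET_IF_FAIL_FIXED(fake_peek_at_last_error());
  return kSuccess;
}
```

In this sketch `run_buggy()` reports success even though a failure was pending, because the second expansion of the argument peeks at an already-cleared error, while `run_fixed()` propagates the failure to its caller.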