Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch Thrust to workaround CUDA_CUB_RET_IF_FAIL macro clearing CUDA errors #6098

Merged
merged 3 commits into from
Aug 27, 2020

Conversation

jlowe
Copy link
Contributor

@jlowe jlowe commented Aug 26, 2020

This updates the libcudf build to patch Thrust after it is fetched. There's a bug in the CUDA_CUB_RET_IF_FAIL macro that can cause errors to be cleared when called with a cudaPeekAtLastError() argument which is very common in the thrust codebase. cub::Debug will unconditionally clear the last CUDA error, so when the macro argument gets evaluated twice it ends up returning no error after the CUDA error was cleared, preventing the application from seeing the CUDA error.

@jlowe jlowe added libcudf Affects libcudf (C++/CUDA) code. 4 - Needs Review Waiting for reviewer to review or respond labels Aug 26, 2020
@jlowe jlowe requested a review from a team as a code owner August 26, 2020 20:30
@jlowe jlowe self-assigned this Aug 26, 2020
@GPUtester
Copy link
Collaborator

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@jlowe jlowe changed the base branch from branch-0.16 to branch-0.15 August 26, 2020 20:31
@GPUtester
Copy link
Collaborator

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@kkraus14 kkraus14 added ! - Hotfix Hotfix is a bug that affects the majority of users for which there is no reasonable workaround ! - Release 3 - Ready for Review Ready for review by team labels Aug 26, 2020
@jlowe
Copy link
Contributor Author

jlowe commented Aug 26, 2020

The same change has been posted to Thrust as NVIDIA/thrust#1264.

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jlowe!

@kkraus14 kkraus14 requested a review from a team August 26, 2020 21:11
@kkraus14 kkraus14 added 5 - Merge After Dependencies and removed 3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond labels Aug 26, 2020
@codecov
Copy link

codecov bot commented Aug 26, 2020

Codecov Report

Merging #6098 into branch-0.15 will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.15    #6098      +/-   ##
===============================================
- Coverage        84.52%   84.51%   -0.02%     
===============================================
  Files               82       82              
  Lines            13835    13835              
===============================================
- Hits             11694    11692       -2     
- Misses            2141     2143       +2     
Impacted Files Coverage Δ
python/cudf/cudf/utils/gpu_utils.py 53.65% <0.00%> (-4.88%) ⬇️
python/cudf/cudf/core/abc.py 87.23% <0.00%> (-4.26%) ⬇️
python/cudf/cudf/_version.py 45.51% <0.00%> (+0.71%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b352155...c72b04d. Read the comment docs.

@jlowe jlowe deleted the thrust-patch branch September 10, 2021 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
! - Hotfix Hotfix is a bug that affects the majority of users for which there is no reasonable workaround libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants