-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Un-reverting contractions tiling logic + masked l2 nn for 23.04 #1239
Un-reverting contractions tiling logic + masked l2 nn for 23.04 #1239
Conversation
Just adding a note that we are still going to want to fix the unrolling for these changes before they go in |
{ | ||
if (gridStrideX == blockIdx.x * P::Nblk) { this->ldgXY(0); } | ||
|
||
// Reset accumulator registers to zero. | ||
#pragma unroll | ||
for (int i = 0; i < P::AccRowsPerTh; ++i) { | ||
#pragma unroll |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recording for posterity: changing these two unrolls to pragma unroll 1
was sufficient to get the code to compile for me locally. Note that Github thinks these lines didn't change, but they have moved into a new function now which meaningfully changes where the unrolling ends up happening.
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## branch-23.04 #1239 +/- ##
===============================================
Coverage ? 87.99%
===============================================
Files ? 21
Lines ? 483
Branches ? 0
===============================================
Hits ? 425
Misses ? 58
Partials ? 0 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving ops-codeowner
file changes
I have no problem with the PR getting merged like this!
|
Closing in favor of #1254 |
As explained in #1246 (comment), ptxas chokes on the minkowski distance when `VecLen==4` and `IdxT==uint32_t`. This PR removes the veclen == 4 specialization for the minkowski distance. Follow up to: #1239 Authors: - Allard Hendriksen (https://github.com/ahendriksen) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - Sean Frye (https://github.com/sean-frye) URL: #1254
As explained in rapidsai#1246 (comment), ptxas chokes on the minkowski distance when `VecLen==4` and `IdxT==uint32_t`. This PR removes the veclen == 4 specialization for the minkowski distance. Follow up to: rapidsai#1239 Authors: - Allard Hendriksen (https://github.com/ahendriksen) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - Sean Frye (https://github.com/sean-frye) URL: rapidsai#1254
This PR also includes changes from #1235 so it looks big, but it really only has a couple things.
@ahendriksen I pulled these changes over from your other PRs and consolidated them. I don't want to step on your toes so if you would rather publish the commit or apply this patch (to have your name associated w/ the changes) then let me know and I can close this PR. I'm just hoping to get these merged back in early 23.04