Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Grid stride pairwise dist and fused L2 NN kernels #232

Merged
merged 11 commits into from
Jun 2, 2021

Conversation

mdoijade
Copy link
Contributor

@mdoijade mdoijade commented May 19, 2021

This PR addresses issues mentioned in #221
-- Adds grid stride based fusedL2NN kernel, this gives approx 1.85x speed up over previous version of this kernel.
-- Adds support in pairwise dist base class to work for any input size by adding support for grid stride based work distribution.

mdoijade added 6 commits May 12, 2021 22:03
…r usage in all contraction based kernels so that n is along x dir and m is along y dir blocks
…kernels. --add launch config generator function to launch optimal grid size kernel for these pairwise dist kernels
…ed up over previous version. -- improve logic of the grid launch config generator for x-dir blocks
@mdoijade mdoijade requested review from divyegala and a team as code owners May 19, 2021 15:30
@github-actions github-actions bot added the cpp label May 19, 2021
… for subsequent gridStrideX variations. this overall improves perf of fusedL2NN to 1.85x over previous version. --Also remove checking keys only check values in fusedL2nn test case, as it may happen a row has multiple keys with same min val
@mdoijade
Copy link
Contributor Author

@teju85 for help with reviewing and adding appropriate label

cpp/include/raft/distance/cosine.cuh Outdated Show resolved Hide resolved
cpp/include/raft/distance/euclidean.cuh Outdated Show resolved Hide resolved
cpp/include/raft/distance/pairwise_distance_base.cuh Outdated Show resolved Hide resolved
@teju85 teju85 added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels May 22, 2021
…und in launchConfigGenerator. --Use constexpr in shmemSize.
@mdoijade
Copy link
Contributor Author

@mdoijade
Copy link
Contributor Author

@teju85 can we merge this ? if not in branch-21.06 then in branch-21.08?

Copy link
Member

@teju85 teju85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

@teju85
Copy link
Member

teju85 commented May 27, 2021

@BradReesWork and/or @afender can we get approval for this from cugraph side? (I believe since you are not using pairwise distances logic anywhere in cugraph, the changes in this PR should not affect you)

@teju85
Copy link
Member

teju85 commented May 27, 2021

@divyegala you are still listed as a code owner and thus we'll need your approval too before merging!

Copy link
Member

@divyegala divyegala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mdoijade
Copy link
Contributor Author

mdoijade commented Jun 2, 2021

@teju85 @dantegd can we merge this now? the cuML side test PR is working fine as well the L1 dist intermittent issue seems resolved - rapidsai/cuml#3891

Copy link
Member

@afender afender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm!.

@teju85
Copy link
Member

teju85 commented Jun 2, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit e8f1862 into rapidsai:branch-21.06 Jun 2, 2021
dantegd added a commit to dantegd/raft that referenced this pull request Jun 2, 2021
rapids-bot bot pushed a commit that referenced this pull request Jun 2, 2021
After the merge of #232, a few different tests failed in rapidsai/cuml#3891, given the timing I think it'd be best to target 232 (again) to 21.08 after triaging the issues.

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Divye Gala (https://github.com/divyegala)
  - Brad Rees (https://github.com/BradReesWork)

URL: #246
rapids-bot bot pushed a commit that referenced this pull request Jun 11, 2021
This PR addresses issues mentioned in #221
-- Adds grid stride based fusedL2NN kernel, this gives approx 1.85x speed up over previous version of this kernel.
-- Adds support in pairwise dist base class to work for any input size by adding support for grid stride based work distribution.

This was submitted to branch-21.06 through PR - #232 
but later reverted due to intermittent failure by - #246

Authors:
  - Mahesh Doijade (https://github.com/mdoijade)

Approvers:
  - Thejaswi. N. S (https://github.com/teju85)
  - Brad Rees (https://github.com/BradReesWork)

URL: #250
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cpp improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants