IVF-PQ: Add a (faster) direct conversion fp8->half #1644

achirkin
Contributor

Add a missing direct conversion from the custom ivf-pq fp_8bit type to half. This conversion is used in a tight ALU-bound loop that computes the distances between the query and the encoded cluster vectors.

This change improves QPS by 0-20% on the deep-1B dataset when internal_distance_dtype = CUDA_R_16F and lut_dtype = CUDA_R_8U. Note, however, that on this dataset the <CUDA_R_16F, CUDA_R_8U> combination is often still slower than the <CUDA_R_16F, CUDA_R_16F> combination because of a couple of extra ALU instructions. lut_dtype = CUDA_R_8U is most beneficial when there is not enough shared memory for the lookup table, which is not the case for the best-performing parameter combinations on the deep-1B dataset.
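For readers unfamiliar with the idea, here is a minimal host-side sketch of what a "direct" 8-bit-float-to-half conversion can look like. It is not the code from this PR: the 8-bit layout below (unsigned, 4 exponent bits, 4 mantissa bits, bias 7, every bit pattern treated as a normal number) is a hypothetical format chosen for illustration, and all names are made up. The point is only that the half bit pattern can be produced with one shift and one add of a rebias constant, without going through float.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical 8-bit unsigned float (NOT the actual ivf-pq fp_8bit layout):
// top kExpBits bits are the exponent, the rest are the mantissa.
constexpr uint32_t kExpBits = 4;
constexpr uint32_t kManBits = 8 - kExpBits;
constexpr int kBias8        = (1 << (kExpBits - 1)) - 1;  // 7 (assumed)
constexpr int kBias16       = 15;                         // IEEE half bias

// Reference decoding of the hypothetical format: 2^(e - bias) * (1 + m / 2^M).
inline float fp8_to_float_reference(uint8_t v)
{
  int e = v >> kManBits;
  int m = v & ((1u << kManBits) - 1);
  return std::ldexp(1.0f + float(m) / float(1 << kManBits), e - kBias8);
}

// Direct conversion: align the mantissa with half's 10 mantissa bits and
// add the difference between the two exponent biases to the exponent field.
inline uint16_t fp8_to_half_bits(uint8_t v)
{
  return uint16_t((uint32_t(v) << (10 - kManBits)) +
                  (uint32_t(kBias16 - kBias8) << 10));
}

// Decode a normal, non-negative IEEE half bit pattern (used for checking only).
inline float half_bits_to_float(uint16_t h)
{
  int e = (h >> 10) & 0x1F;
  int m = h & 0x3FF;
  return std::ldexp(1.0f + float(m) / 1024.0f, e - kBias16);
}
```

In device code, the resulting bits would typically be reinterpreted as `__half` (for example with `__ushort_as_half` from `cuda_fp16.h`). Note that the whole conversion depends on a single rebias constant, so that constant has to match the 8-bit format exactly.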

@achirkin achirkin added 3 - Ready for Review improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jul 12, 2023
@achirkin achirkin requested a review from a team as a code owner July 12, 2023 15:24
@github-actions github-actions bot added the cpp label Jul 12, 2023
@achirkin achirkin requested a review from tfeher July 12, 2023 15:24
achirkin added a commit to achirkin/raft that referenced this pull request Jul 12, 2023

@tfeher tfeher left a comment

Thanks @achirkin for the PR, it looks good to me.

@dantegd
Member

dantegd commented Jul 14, 2023

/merge

@rapids-bot rapids-bot bot merged commit cb7d01a into rapidsai:branch-23.08 Jul 14, 2023
achirkin added a commit that referenced this pull request Jul 19, 2023
#1644 introduced a direct conversion from the custom ivf-pq fp8 type to half, and a bug alongside it: the exponent bias value was wrong by one bit.
rapids-bot bot pushed a commit that referenced this pull request Jul 19, 2023
#1644 introduced a direct conversion from the custom ivf-pq fp8 type to half, and a bug alongside it: the exponent bias value was wrong by one bit. The result is a loss of precision on some datasets.
This hotfix just changes this constant.

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1654