IVF-PQ: Add a (faster) direct conversion fp8->half #1644

achirkin · 2023-07-12T15:24:20Z

Add a missing direct conversion from the custom ivf-pq fp_8bit type to half. This conversion is used in a tight ALU-bound loop that computes the distances between the query and the encoded cluster vectors.
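To make the idea of a "direct" conversion concrete, here is a minimal host-side sketch. It does not reproduce the actual fp_8bit layout from the PR: the 8-bit format below (4 exponent bits with bias 7, 4 mantissa bits, no sign, zero byte means 0) and all function names are assumptions chosen for illustration. It shows how an 8-bit float can be turned into an IEEE 754 binary16 bit pattern with one shift and one add, instead of going through float.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical 8-bit unsigned float, used only in this sketch:
// high nibble = exponent (bias 7), low nibble = mantissa.
// The all-zero byte encodes 0; no subnormals, infinities, or NaNs.
constexpr int kFp8ManBits  = 4;
constexpr int kFp8Bias     = 7;
constexpr int kHalfManBits = 10;
constexpr int kHalfBias    = 15;

// Reference path: decode the fields and build the value in float.
inline float fp8_to_float(std::uint8_t bits) {
  if (bits == 0) return 0.0f;
  const int e = bits >> kFp8ManBits;
  const int m = bits & ((1 << kFp8ManBits) - 1);
  return std::ldexp(1.0f + m / 16.0f, e - kFp8Bias);
}

// Direct path: align the fields with the binary16 layout (shift),
// then re-bias the exponent (add). No intermediate float is needed.
inline std::uint16_t fp8_to_half_bits(std::uint8_t bits) {
  if (bits == 0) return 0;
  const int shifted = static_cast<int>(bits) << (kHalfManBits - kFp8ManBits);
  const int rebias  = (kHalfBias - kFp8Bias) << kHalfManBits;
  return static_cast<std::uint16_t>(shifted + rebias);
}

// Host-side decoder for binary16 bit patterns, used to check the direct path.
inline float half_bits_to_float(std::uint16_t h) {
  const int sign = h >> 15;
  const int e    = (h >> 10) & 0x1F;
  const int m    = h & 0x3FF;
  float v;
  if (e == 0) {
    v = std::ldexp(static_cast<float>(m), -24);  // zero and subnormals
  } else if (e == 31) {
    v = m ? std::numeric_limits<float>::quiet_NaN()
          : std::numeric_limits<float>::infinity();
  } else {
    v = std::ldexp(1.0f + m / 1024.0f, e - kHalfBias);
  }
  return sign ? -v : v;
}

// Counts the 8-bit codes for which the direct path disagrees with the reference.
inline int count_mismatches() {
  int mismatches = 0;
  for (int b = 0; b < 256; ++b) {
    const auto bits = static_cast<std::uint8_t>(b);
    if (half_bits_to_float(fp8_to_half_bits(bits)) != fp8_to_float(bits)) ++mismatches;
  }
  return mismatches;
}
```

In device code the resulting 16-bit pattern would be reinterpreted as a half rather than decoded on the host; the decoder here exists only so the sketch can be checked without a GPU.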

This change improves QPS by 0-20% on the deep-1B dataset when internal_distance_dtype = CUDA_R_16F and lut_dtype = CUDA_R_8U. Note, however, that on this dataset the <CUDA_R_16F, CUDA_R_8U> combination is often still slower than the <CUDA_R_16F, CUDA_R_16F> combination because of a couple of extra ALU instructions. lut_dtype = CUDA_R_8U is most beneficial when there is not enough shared memory for the lookup table, which is not the case for the best-performing parameter combinations on the deep-1B dataset.
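The shared-memory argument can be sketched with a little arithmetic. The snippet below assumes the lookup table holds one entry per (sub-quantizer, code) pair, that is pq_dim * 2^pq_bits entries of lut_dtype; the helper names, the pq_dim = 128 and pq_bits = 8 values, and the 48 KiB budget are hypothetical and are not taken from the PR or from its benchmarks.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed for one lookup table: pq_dim sub-quantizers,
// 2^pq_bits codebook entries each, one lut_dtype element per entry.
constexpr std::size_t lut_bytes(std::uint32_t pq_dim, std::uint32_t pq_bits, std::size_t elem_size) {
  return static_cast<std::size_t>(pq_dim) * (std::size_t{1} << pq_bits) * elem_size;
}

// True if the table fits into the given (assumed) shared-memory budget.
constexpr bool fits_in_shared_memory(std::size_t bytes, std::size_t budget) {
  return bytes <= budget;
}

// Hypothetical configuration for illustration only.
constexpr std::size_t kBudget   = 48 * 1024;            // assumed per-block budget
constexpr std::size_t kLutHalf  = lut_bytes(128, 8, 2);  // 16-bit entries: 65536 bytes
constexpr std::size_t kLutFp8   = lut_bytes(128, 8, 1);  // 8-bit entries: 32768 bytes

static_assert(!fits_in_shared_memory(kLutHalf, kBudget), "16-bit table exceeds the assumed budget");
static_assert(fits_in_shared_memory(kLutFp8, kBudget), "8-bit table fits the assumed budget");
```

Under these assumed numbers only the 8-bit table fits, which is the situation where the extra conversion instructions are likely to pay off. When both tables fit, as the description reports for the best deep-1B configurations, the 16-bit table avoids the conversion altogether.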

tfeher

Thanks @achirkin for the PR, it looks good to me.

dantegd · 2023-07-14T15:45:15Z

/merge

#1644 introduced a direct conversion from the custom ivf-pq fp8 type to half, and a bug alongside it: the exponent bias value was wrong by one bit. The result is a loss of precision on some datasets. This hotfix just changes this constant.

Authors:
- Artem M. Chirkin (https://github.com/achirkin)

Approvers:
- Tamas Bela Feher (https://github.com/tfeher)
- Corey J. Nolet (https://github.com/cjnolet)

URL: #1654
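The hotfix text does not spell out the constant, so the following is not a reconstruction of the bug. It is only a general illustration, with hypothetical helper names, of why an error in an exponent bias matters for a binary16 value: moving the exponent field by one step scales the value by a factor of two, and near the top of the range it overflows to infinity.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Host-side decoder for IEEE 754 binary16 bit patterns.
inline float decode_half(std::uint16_t h) {
  const int sign = h >> 15;
  const int e    = (h >> 10) & 0x1F;
  const int m    = h & 0x3FF;
  float v;
  if (e == 0) {
    v = std::ldexp(static_cast<float>(m), -24);  // zero and subnormals
  } else if (e == 31) {
    v = m ? std::numeric_limits<float>::quiet_NaN()
          : std::numeric_limits<float>::infinity();
  } else {
    v = std::ldexp(1.0f + m / 1024.0f, e - 15);
  }
  return sign ? -v : v;
}

// Moves the exponent field by `delta` steps, which is the effect a
// mis-set re-bias constant would have on every converted value.
inline std::uint16_t shift_exponent(std::uint16_t h, int delta) {
  return static_cast<std::uint16_t>(h + delta * (1 << 10));
}
```

For example, 0x3E00 decodes to 1.5, while shift_exponent(0x3E00, 1) decodes to 3.0 and shift_exponent(0x3E00, -1) to 0.75. The pattern 0x7800 (32768.0) shifted up by one step becomes 0x7C00, which is infinity.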

Add a (faster) direct conversion fp8->half

29aff1c

achirkin added the 3 - Ready for Review, improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) labels Jul 12, 2023

achirkin requested a review from a team as a code owner July 12, 2023 15:24

github-actions bot added the cpp label Jul 12, 2023

achirkin requested a review from tfeher July 12, 2023 15:24

achirkin added a commit to achirkin/raft that referenced this pull request Jul 12, 2023

Carry rapidsai#1644 over

3701e21

tfeher approved these changes Jul 14, 2023


Merge branch 'branch-23.08' into enh-ivf-pq-faster-fp8-half-conversion

6619c00

rapids-bot bot merged commit cb7d01a into rapidsai:branch-23.08 Jul 14, 2023

achirkin added a commit that referenced this pull request Jul 19, 2023

Hotfix: wrong constant in IVF-PQ fp_8bit2half

91cf828

#1644 introduced a direct conversion from the custom ivf-pq fp8 type to half, and a bug alongside it: the exponent bias value was wrong by one bit.

achirkin mentioned this pull request Jul 19, 2023

Hotfix: wrong constant in IVF-PQ fp_8bit2half #1654

Merged
