ggml-cuda : add rope f16, restore performance with parallel decoding #3272

Merged
ggerganov merged 4 commits into custom-attention-mask from cam-cuda-2 on Sep 20, 2023

Commits

Commits on Sep 19, 2023

Commits on Sep 20, 2023