Support speculative decoding with llama.cpp #240
Labels
feature
Categorizes issue or PR as related to a new feature.
help wanted
Extra attention is needed
needs-priority
Indicates a PR lacks a label and requires one.
needs-triage
Indicates an issue or PR lacks a label and requires one.
What would you like to be added:
We already support speculative decoding with vLLM. Now that llama.cpp has added this feature, we should support it there as well. See ggerganov/llama.cpp#10455.
Why is this needed:
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.