Your current environment
Which version first added support for `--enable-prefix-caching`? (It seems v0.3.0 does not support it yet, while v0.4.0 already does.)
Does `--enable-prefix-caching` cache only the system prompt prefix, or is it an implementation of RadixAttention (https://arxiv.org/abs/2312.07104)? What is the difference between prefix-caching and prefix-sharing (the following implementation)?
How would you like to use vllm
I want to know more details about `--enable-prefix-caching` and the related paper.