
[Usage]: Question about the meaning of --enable-prefix-caching #4390

Closed
chenchunhui97 opened this issue Apr 26, 2024 · 4 comments
Labels
usage How to use vllm

Comments

@chenchunhui97

Your current environment

  1. Which version first supports --enable-prefix-caching? (It seems v0.3.0 does not support it yet, while v0.4.0 already does.)
  2. Does --enable-prefix-caching cache the system-prompt prefix only, or is it an implementation of RadixAttention (https://arxiv.org/abs/2312.07104)? And what is the difference between prefix caching and prefix sharing (the snippet below, with a sketch of the newer flag-based API after it)?
  if prefix_len is not None:
      # Legacy prefix-sharing API: prefix_pos marks, per prompt, how many
      # leading tokens form the shared prefix whose KV cache can be reused.
      if prompt_token_ids is not None:
          outputs = llm.generate(prompt_token_ids=prompt_token_ids,
                                 sampling_params=sampling_params,
                                 prefix_pos=prefix_len * (len(prompts) // len(prefix_len)))
      else:
          outputs = llm.generate(prompts=prompts,
                                 sampling_params=sampling_params,
                                 prefix_pos=prefix_len * (len(prompts) // len(prefix_len)))
  else:
      # No shared prefix: plain generation.
      outputs = llm.generate(prompts, sampling_params=sampling_params)
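
For comparison, here is a minimal sketch of how I understand the newer flag-based API (assuming vLLM v0.4.0+, where enable_prefix_caching is an engine argument and per-request prefix_pos is no longer needed; the model name and prompts are just placeholders):

  from vllm import LLM, SamplingParams

  # Enable automatic prefix caching engine-wide instead of marking the
  # shared prefix per request with prefix_pos.
  llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_prefix_caching=True)
  sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

  # Every prompt starts with the same system prompt; with prefix caching
  # enabled, the KV cache for that shared prefix should be computed once
  # and reused across requests.
  system_prompt = "You are a helpful assistant.\n"
  questions = ["What is PagedAttention?", "What does --enable-prefix-caching do?"]
  outputs = llm.generate([system_prompt + q for q in questions],
                         sampling_params=sampling_params)
  for out in outputs:
      print(out.outputs[0].text)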

How would you like to use vllm

I want to know more details about --enable-prefix-caching and the related paper.

@chenchunhui97 chenchunhui97 added the usage How to use vllm label Apr 26, 2024
@zhuohan123
Member

Please refer to #2614 for the details for now. We will publish a blog post explaining our design soon. Stay tuned!

@timothylimyl

@zhuohan123 Looking forward to the blog.

@samos123
Contributor

samos123 commented Jul 4, 2024

Was the blog post ever published? Please also update the flag documentation. I'm trying to understand what it actually does.

@Playerrrrr

Where is the blog? @zhuohan123
