Hi,

I remember that vLLM support was on your TODO list. Have you implemented it yet? Was the main challenge in this direction that tree verification with batch size > 1 is hard to make efficient? Thanks!
Currently we have not added support for vLLM; we are working on building a tensor parallelism system first. With batch size > 1, we need to solve some additional problems: for example, the number of accepted tokens can differ for each request in the same batch, and communication time is not accounted for in the current implementation. Once the tensor parallelism system is built, we will make it compatible with vLLM and other inference engines. Thank you!
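To illustrate the ragged-acceptance problem mentioned above, here is a minimal sketch (not the repository's actual code): after verifying draft tokens for a batch, each request may accept a different number of tokens, so the batch must be re-packed before the next decoding step. The names `verified_tokens`, `num_accepted`, and `repack_batch` are hypothetical.

```python
# Sketch only: re-pack a batch where each request accepted a different
# number of speculated tokens. Names here are illustrative, not the
# repository's actual API.
import torch

def repack_batch(verified_tokens: torch.Tensor,
                 num_accepted: torch.Tensor,
                 pad_id: int = 0) -> tuple[torch.Tensor, torch.Tensor]:
    """verified_tokens: (batch, max_draft_len) candidate tokens per request.
    num_accepted:      (batch,) how many of those tokens passed verification.
    Returns right-padded accepted tokens and a validity mask."""
    batch, _ = verified_tokens.shape
    new_len = int(num_accepted.max().item())
    # Positions beyond each request's accepted count are masked out.
    positions = torch.arange(new_len, device=verified_tokens.device)
    mask = positions.unsqueeze(0) < num_accepted.unsqueeze(1)  # (batch, new_len)
    packed = torch.full((batch, new_len), pad_id,
                        dtype=verified_tokens.dtype,
                        device=verified_tokens.device)
    packed[mask] = verified_tokens[:, :new_len][mask]
    return packed, mask

# Example: request 0 accepts 3 draft tokens, request 1 accepts only 1.
tokens = torch.tensor([[11, 12, 13, 14],
                       [21, 22, 23, 24]])
accepted = torch.tensor([3, 1])
packed, mask = repack_batch(tokens, accepted)
# packed -> [[11, 12, 13], [21, 0, 0]]; mask marks the valid positions.
```

Because the shorter requests must be padded up to the longest acceptance in the batch, naive batching wastes compute on masked positions, which is one reason efficient batch-size > 1 tree verification is nontrivial.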