llava: batch inference #4378

Borobo · 2023-12-08T15:32:23Z

Hello!

I'm using llava with the server and I'm wondering whether anyone is working on batch inference by batching llava's CLIP. If not, I would be happy to contribute, as this feature could be very useful for speeding up llava inference.
As I'm new to this wonderful repo, I would also be grateful if you could guide me through the first steps (which files I would have to edit).

Thank you in advance!

cmp-nct · 2023-12-08T17:16:33Z

AFAIK, FSSRepo was working on offloading CLIP properly to the GPU. There is already a PR for that, but it was missing some ggml CUDA upgrades (broadcasting features, I think).

Currently it runs on CPU only, which is a massive slowdown, and I don't think parallel processing would help in any way on CPU. But of course on GPU it's a different story.

Borobo · 2023-12-12T08:31:40Z

So I guess I'll have to wait until this PR is merged before considering CLIP batching functionality, as I'm not very familiar with ggml structures. I'm going to keep an eye on it, thanks for your reply!

y10ab1 · 2023-12-17T05:40:17Z

We can track the PR ggerganov/llama.cpp/pull/4205.
It appears that the corresponding pull request at ggerganov/ggml/pull/621 has been merged.
I guess there may be progress soon.

KohakuBlueleaf · 2024-03-06T01:26:31Z

Are there any updates on this feature?

github-actions · 2024-04-20T01:07:19Z

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions bot added the stale label Apr 6, 2024

github-actions bot closed this as completed Apr 20, 2024
