Revert "ggml : remove OpenCL (#7735) + (#8235)" #8986

Draft
okias wants to merge 1 commit into master

Conversation

@okias okias commented Aug 11, 2024

Needs more work, thus Draft....

Reason: Within Mesa3D, we support OpenCL for multiple vendors, which are unlikely to be covered by other technologies. So far, in the Linux world, there isn't a better standard (Vulkan compute is not a viable replacement for OpenCL, even though it provides some of its functionality).

This reverts commit 554c247. This reverts commit 257f8e4.

Manually adjusted.

Signed-off-by: David Heidelberg <[email protected]>
@github-actions github-actions bot added the documentation, build, script, nix, examples, python, devops, ggml, SYCL, and Apple Metal labels on Aug 11, 2024
@qnixsynapse
Contributor

Instead of bringing back (partial) OpenCL, why not improve and maintain SYCL, which is probably the better alternative across all hardware?

@sorasoras

As far as I know, OpenCL was removed due to the lack of a full implementation and of people supporting that backend. Unless someone is committed to spending time maintaining it, I don't see much point in reverting this change.

@ExtReMLapin
Contributor

As far as I know, OpenCL was removed due to the lack of a full implementation and of people supporting that backend. Unless someone is committed to spending time maintaining it, I don't see much point in reverting this change.

Agreed on that: either stick to the last version that still had it, or be ready to maintain it.

@0cc4m
Collaborator

0cc4m commented Aug 12, 2024

Vulkan compute is not a viable replacement for OpenCL, even though it provides some of its functionality

Why? Which use cases are covered by OpenCL, but not by Vulkan?

@okias
Author

okias commented Aug 13, 2024

There are a lot of aarch64-based GPUs that offer a GL + OpenCL combo, but Vulkan is not available there, or these GPUs weren't built with Vulkan in mind (the drivers work poorly, and the Mesa3D open-source driver will never implement support for them due to the hackish nature of the VK support).

Also, OpenCL usually performs better than VK compute; you can see the user feedback on #7735.

@0cc4m
Collaborator

0cc4m commented Aug 13, 2024

There are a lot of aarch64-based GPUs that offer a GL + OpenCL combo, but Vulkan is not available there, or these GPUs weren't built with Vulkan in mind (the drivers work poorly, and the Mesa3D open-source driver will never implement support for them due to the hackish nature of the VK support).

Also, OpenCL usually performs better than VK compute; you can see the user feedback on #7735.

I know that there are mobile GPUs that support OpenCL but not Vulkan, but I doubt that the OpenCL implementation runs well on those.

But a blanket statement that "OpenCL usually performs better than VK compute" is plain wrong and makes me wonder if you ever actually checked that. The comments you linked are outdated and I have fixed a lot of issues since they were written.

As the author of most of the OpenCL and Vulkan code, I can tell you that the current Vulkan backend is in a much much better state than OpenCL ever was and outperforms it by a lot on most GPUs.

It is possible to get OpenCL to a similar state, but that requires a very big overhaul and a lot of kernel work that, until now, nobody has been willing to do.

@qnixsynapse
Contributor

OpenCL can be used as a backend for SYCL.

Also, Occam is correct here about Vulkan.
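
For context, here is a minimal SYCL 2020 sketch of what "OpenCL as a backend for SYCL" looks like from the application side: it lists the devices the SYCL runtime can see and flags those reached through an OpenCL driver. Illustrative only; this is not code from the llama.cpp SYCL backend.

```cpp
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    // enumerate every device the SYCL runtime can see
    for (const auto & dev : sycl::device::get_devices()) {
        std::cout << dev.get_info<sycl::info::device::name>();
        // devices reached through an OpenCL driver report the opencl backend
        if (dev.get_backend() == sycl::backend::opencl) {
            std::cout << "  [OpenCL backend]";
        }
        std::cout << '\n';
    }
    return 0;
}
```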

@okias
Author

okias commented Aug 13, 2024

I know that there are mobile GPUs that support OpenCL but not Vulkan, but I doubt that the OpenCL implementation runs well on those.

There is a lot of activity around RustiCL in Mesa3D. If we could implement VK compute on top of CL for these GPUs instead, it would be great, but that's probably not achievable.

About OpenCL, SYCL, and VK compute: there is a nice summary from Dave Airlie at https://www.youtube.com/watch?v=HzzLY5TdnZo (at least for the open-source world).

@0cc4m
Collaborator

0cc4m commented Aug 14, 2024

About OpenCL, SYCL, and VK compute: there is a nice summary from Dave Airlie at https://www.youtube.com/watch?v=HzzLY5TdnZo (at least for the open-source world).

Dave is not particularly positive about OpenCL there either. It sounds to me like you're coming from a more general position on why OpenCL might be better than Vulkan, while I'm mostly focused on this project and my experience with the two APIs.

Nobody in llama.cpp is opposed to having an OpenCL backend, but the situation is simply that nobody has been willing to put in the time to develop and maintain it. I switched to Vulkan after running into serious limitations when developing the OpenCL backend; Vulkan is a lot better supported by the hardware vendors.

If you want to put in the work to bring the OpenCL backend up to speed, I'm happy to assist as much as I can, but in the current state I don't think it will be accepted back into the project.

@Sur3

Sur3 commented Aug 14, 2024

I know that there are mobile GPUs that support OpenCL but not Vulkan, but I doubt that the OpenCL implementation runs well on those.

It runs more than well: I can use my whole system RAM as VRAM with the old OpenCL backend on my integrated Skylake GT2 [HD Graphics 520] GPU, while with Vulkan I just run out of memory in most cases.
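
For reference, how much global memory an OpenCL device exposes, and whether it is unified with host RAM (as on an iGPU), can be queried directly. A minimal sketch, with error handling omitted:

```cpp
#include <CL/cl.h>
#include <cstdio>

int main() {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    // total device-visible memory; on an iGPU this is a large share of system RAM
    cl_ulong mem = 0;
    clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(mem), &mem, NULL);

    // CL_TRUE on devices whose memory is unified with the host's
    cl_bool unified = CL_FALSE;
    clGetDeviceInfo(device, CL_DEVICE_HOST_UNIFIED_MEMORY, sizeof(unified), &unified, NULL);

    std::printf("global mem: %llu MiB, unified with host: %s\n",
                (unsigned long long)(mem / (1024 * 1024)), unified ? "yes" : "no");
    return 0;
}
```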

@slaren
Collaborator

slaren commented Aug 14, 2024

In order to reintroduce the OpenCL backend to ggml, the first step would be to complete the integration with ggml-backend. For the most part, this would mean that instead of intercepting calls to the CPU backend in ggml.c, the OpenCL backend would need to be able to evaluate graphs through the graph_compute function of the ggml-backend interface. To do so, it would also need to be able to work with tensors allocated in device memory (all the tensors, not just the weights).
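
As a rough sketch of the shape this takes, here is what a graph_compute implementation for a hypothetical OpenCL backend might look like. The exact ggml-backend signature has changed across ggml versions, and cl_backend_context and enqueue_mul_mat are placeholder names, not real code:

```cpp
#include "ggml.h"
#include "ggml-backend-impl.h" // internal ggml-backend interface

// placeholders: a real backend would define these
struct cl_backend_context;  // holds the cl_context, cl_command_queue, compiled kernels
void enqueue_mul_mat(cl_backend_context * ctx,
                     const ggml_tensor * a, const ggml_tensor * b, ggml_tensor * dst);

static enum ggml_status ggml_backend_opencl_graph_compute(
        ggml_backend_t backend, struct ggml_cgraph * cgraph) {
    auto * ctx = (cl_backend_context *) backend->context;
    // walk the graph and dispatch a kernel for every node
    for (int i = 0; i < cgraph->n_nodes; i++) {
        ggml_tensor * node = cgraph->nodes[i];
        switch (node->op) {
            case GGML_OP_MUL_MAT:
                // operands and result already live in device (cl_mem) buffers,
                // so no per-node host<->device copies are needed
                enqueue_mul_mat(ctx, node->src[0], node->src[1], node);
                break;
            // ... every other op the graph can contain needs a kernel here ...
            default:
                return GGML_STATUS_FAILED; // unsupported op
        }
    }
    return GGML_STATUS_SUCCESS;
}
```

The key point is the buffer handling: weights and activations alike stay in device memory across the whole graph, rather than being staged from the CPU per operation.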

@Sur3

Sur3 commented Aug 14, 2024

In order to reintroduce the OpenCL backend to ggml, [...]

Do we need ggml? Wasn't this superseded by gguf already, even with the OpenCL backend?

@0cc4m
Collaborator

0cc4m commented Aug 14, 2024

I know that there are mobile GPUs that support OpenCL but not Vulkan, but I doubt that the OpenCL implementation runs well on those.

It runs more than well: I can use my whole system RAM as VRAM with the old OpenCL backend on my integrated Skylake GT2 [HD Graphics 520] GPU, while with Vulkan I just run out of memory in most cases.

It runs, but it isn't worth using. Here's how it looks for a 1.1B TinyLlama on my Intel i7-10510U iGPU (tested using the koboldcpp benchmark mode, because it still supports OpenCL):

| Backend | pp t/s (prompt processing) | tg t/s (text generation) |
| --- | --- | --- |
| No BLAS | 47 | 21 |
| OpenBLAS | 61 | 21 |
| CLBlast 0 layers | 36 | 21 |
| CLBlast 100 layers | 36 | 8 |
| Vulkan 0 layers | 23 | 24 |
| Vulkan 100 layers | crash | crash |

In order to reintroduce the OpenCL backend to ggml, [...]

Do we need ggml? Wasn't this superseded by gguf already, even with the OpenCL backend?

ggml the library, not the file format. llama.cpp is a llama implementation based on the ggml library.

@syxrz789

If someone could maintain OpenCL independently of the CPU backend, like CUDA, that would be great. The current OpenCL backend has too much CPU interaction through OpenCL's write-buffer calls, resulting in poor performance.
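
For illustration, the staging pattern being criticized looks roughly like this: every offloaded op pays blocking host-to-device and device-to-host copies. All names below are placeholders, not code from the old backend.

```cpp
#include <CL/cl.h>

// hypothetical sketch of a per-op matmul with host staging
static void mul_mat_with_staging(cl_command_queue queue, cl_kernel kernel,
                                 cl_mem d_src0, cl_mem d_src1, cl_mem d_dst,
                                 const float * h_src0, size_t size0,
                                 const float * h_src1, size_t size1,
                                 float * h_dst, size_t size_dst,
                                 const size_t * global, const size_t * local) {
    // the CPU blocks while both operands are copied into device buffers
    clEnqueueWriteBuffer(queue, d_src0, CL_TRUE, 0, size0, h_src0, 0, NULL, NULL);
    clEnqueueWriteBuffer(queue, d_src1, CL_TRUE, 0, size1, h_src1, 0, NULL, NULL);
    clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global, local, 0, NULL, NULL);
    // ...and blocks again while the result is copied back out
    clEnqueueReadBuffer(queue, d_dst, CL_TRUE, 0, size_dst, h_dst, 0, NULL, NULL);
}
```

A backend that keeps all tensors resident in cl_mem buffers, as slaren describes above, avoids these transfers entirely.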

@mtasic85
Contributor

https://github.com/ggerganov/llama.cpp/blob/master/docs/backend/SYCL.md#llamacpp--sycl

The llama.cpp SYCL backend is designed to support Intel GPU firstly. Based on the cross-platform feature of SYCL, it could support other vendor GPUs: Nvidia GPU (AMD GPU coming).

Can someone point to an AMD GPU PR/branch, or is this just a plan for the future?

@rhjdvsgsgks
Contributor

rhjdvsgsgks commented Oct 20, 2024

RustiCL has some plans (page 9) to support DPC++ and chipStar, which provide SYCL and ROCm on top of OpenCL. AdaptiveCpp (an open-source SYCL implementation) also has an OpenCL backend. Can anyone test these?
