Revert "ggml : remove OpenCL (#7735) + (#8235)" #8986

Draft
okias wants to merge 1 commit into master

Conversation

@okias okias commented Aug 11, 2024

Needs more work, thus Draft....

Reason: Within Mesa3D, we support OpenCL for multiple vendors, which are unlikely to be covered by other technologies. So far, in the Linux world, there isn't a better standard (Vulkan compute is not a viable replacement for OpenCL, even though it provides some of its functionality).

This reverts commit 554c247. This reverts commit 257f8e4.

Manually adjusted.

Signed-off-by: David Heidelberg <[email protected]>
@github-actions github-actions bot added the documentation, build, script, nix, examples, python, devops, ggml, SYCL, and Apple Metal labels on Aug 11, 2024
@qnixsynapse
Contributor

Instead of bringing back (partial) OpenCL, why not improve and maintain SYCL, which is probably the better alternative across all hardware?

@sorasoras

As far as I know, OpenCL was removed due to the lack of a full implementation and of people supporting that backend. Unless someone is committed to spending time maintaining it, I don't see much point in reverting this change.

@ExtReMLapin
Contributor

As far as I know, OpenCL was removed due to the lack of a full implementation and of people supporting that backend. Unless someone is committed to spending time maintaining it, I don't see much point in reverting this change.

Agreed on that: either stick to the last version that still had it, or be ready to maintain it.

@0cc4m
Collaborator

0cc4m commented Aug 12, 2024

Vulkan compute is not a viable replacement for OpenCL, even though it provides some of its functionality

Why? Which use cases are covered by OpenCL, but not by Vulkan?

@okias
Author

okias commented Aug 13, 2024

There are a lot of aarch64-based GPUs that offer a GL + OpenCL combo, but Vulkan is not available there, or these GPUs weren't built with Vulkan in mind (the drivers work poorly, and the Mesa3D open-source driver will never implement support for them due to the hackish nature of the VK support).

Also, OpenCL usually performs better than VK compute; you can see the user feedback on #7735.

@0cc4m
Collaborator

0cc4m commented Aug 13, 2024

There are a lot of aarch64-based GPUs that offer a GL + OpenCL combo, but Vulkan is not available there, or these GPUs weren't built with Vulkan in mind (the drivers work poorly, and the Mesa3D open-source driver will never implement support for them due to the hackish nature of the VK support).

Also, OpenCL usually performs better than VK compute; you can see the user feedback on #7735.

I know that there are mobile GPUs that support OpenCL but not Vulkan, but I doubt that the OpenCL implementation runs well on those.

But a blanket statement that "OpenCL usually performs better than VK compute" is plain wrong and makes me wonder if you ever actually checked that. The comments you linked are outdated and I have fixed a lot of issues since they were written.

As the author of most of the OpenCL and Vulkan code, I can tell you that the current Vulkan backend is in a much much better state than OpenCL ever was and outperforms it by a lot on most GPUs.

It is possible to get OpenCL to a similar state, but that requires a very big overhaul and a lot of kernel work that, until now, nobody has been willing to do.

@qnixsynapse
Contributor

OpenCL can be used as a backend for SYCL.

Also, Occam is correct here about Vulkan.
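
For context, here is a minimal SYCL 2020 sketch of what "OpenCL as a backend for SYCL" looks like from the application side: it lists the devices the SYCL runtime can see and flags those reached through an OpenCL driver. Illustrative only; this is not code from the llama.cpp SYCL backend.

```cpp
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    // enumerate every device the SYCL runtime can see
    for (const auto & dev : sycl::device::get_devices()) {
        std::cout << dev.get_info<sycl::info::device::name>();
        // devices reached through an OpenCL driver report the opencl backend
        if (dev.get_backend() == sycl::backend::opencl) {
            std::cout << "  [OpenCL backend]";
        }
        std::cout << '\n';
    }
    return 0;
}
```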

@okias
Author

okias commented Aug 13, 2024

I know that there are mobile GPUs that support OpenCL but not Vulkan, but I doubt that the OpenCL implementation runs well on those.

There is a lot of activity around RustiCL in Mesa3D. If we could implement VK compute on top of CL for these GPUs instead, it would be great, but that's probably not achievable.

About OpenCL, SYCL, and VK compute: there is a nice summary from Dave Airlie at https://www.youtube.com/watch?v=HzzLY5TdnZo (at least for the open-source world).

@0cc4m
Collaborator

0cc4m commented Aug 14, 2024

About OpenCL, SYCL, and VK compute: there is a nice summary from Dave Airlie at https://www.youtube.com/watch?v=HzzLY5TdnZo (at least for the open-source world).

Dave is not particularly positive about OpenCL there either. It sounds to me like you're coming from a more general position on why OpenCL might be better than Vulkan, while I'm mostly focused on this project and my experience with the two APIs.

Nobody in llama.cpp is opposed to having an OpenCL backend, but the situation is simply that nobody has been willing to put in the time to develop and maintain it. I switched to Vulkan after running into serious limitations when developing the OpenCL backend; Vulkan is a lot better supported by the hardware vendors.

If you want to put in the work to bring the OpenCL backend up to speed, I'm happy to assist as much as I can, but in the current state I don't think it will be accepted back into the project.

@Sur3

Sur3 commented Aug 14, 2024

I know that there are mobile GPUs that support OpenCL but not Vulkan, but I doubt that the OpenCL implementation runs well on those.

It runs more than well: I can use my whole system RAM as VRAM with the old OpenCL backend on my integrated Skylake GT2 [HD Graphics 520] GPU, while with Vulkan I just run out of memory in most cases.
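
For reference, how much global memory an OpenCL device exposes, and whether it is unified with host RAM (as on an iGPU), can be queried directly. A minimal sketch, with error handling omitted:

```cpp
#include <CL/cl.h>
#include <cstdio>

int main() {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    // total device-visible memory; on an iGPU this is a large share of system RAM
    cl_ulong mem = 0;
    clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(mem), &mem, NULL);

    // CL_TRUE on devices whose memory is unified with the host's
    cl_bool unified = CL_FALSE;
    clGetDeviceInfo(device, CL_DEVICE_HOST_UNIFIED_MEMORY, sizeof(unified), &unified, NULL);

    std::printf("global mem: %llu MiB, unified with host: %s\n",
                (unsigned long long)(mem / (1024 * 1024)), unified ? "yes" : "no");
    return 0;
}
```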

@slaren
Collaborator

slaren commented Aug 14, 2024

In order to reintroduce the OpenCL backend to ggml, the first step would be to complete the integration with ggml-backend. For the most part, this would mean that instead of intercepting calls to the CPU backend in ggml.c, the OpenCL backend would need to be able to evaluate graphs through the graph_compute function of the ggml-backend interface. To do so, it would also need to be able to work with tensors allocated in device memory (all the tensors, not just the weights).
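
As a rough sketch of the shape this takes, here is what a graph_compute implementation for a hypothetical OpenCL backend might look like. The exact ggml-backend signature has changed across ggml versions, and cl_backend_context and enqueue_mul_mat are placeholder names, not real code:

```cpp
#include "ggml.h"
#include "ggml-backend-impl.h" // internal ggml-backend interface

// placeholders: a real backend would define these
struct cl_backend_context;  // holds the cl_context, cl_command_queue, compiled kernels
void enqueue_mul_mat(cl_backend_context * ctx,
                     const ggml_tensor * a, const ggml_tensor * b, ggml_tensor * dst);

static enum ggml_status ggml_backend_opencl_graph_compute(
        ggml_backend_t backend, struct ggml_cgraph * cgraph) {
    auto * ctx = (cl_backend_context *) backend->context;
    // walk the graph and dispatch a kernel for every node
    for (int i = 0; i < cgraph->n_nodes; i++) {
        ggml_tensor * node = cgraph->nodes[i];
        switch (node->op) {
            case GGML_OP_MUL_MAT:
                // operands and result already live in device (cl_mem) buffers,
                // so no per-node host<->device copies are needed
                enqueue_mul_mat(ctx, node->src[0], node->src[1], node);
                break;
            // ... every other op the graph can contain needs a kernel here ...
            default:
                return GGML_STATUS_FAILED; // unsupported op
        }
    }
    return GGML_STATUS_SUCCESS;
}
```

The key point is the buffer handling: weights and activations alike stay in device memory across the whole graph, rather than being staged from the CPU per operation.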

@Sur3

Sur3 commented Aug 14, 2024

In order to reintroduce the OpenCL backend to ggml, [...]

Do we need ggml? Wasn't this superseded by gguf already, even with the OpenCL backend?

@0cc4m
Collaborator

0cc4m commented Aug 14, 2024

I know that there are mobile GPUs that support OpenCL but not Vulkan, but I doubt that the OpenCL implementation runs well on those.

It runs more than well: I can use my whole system RAM as VRAM with the old OpenCL backend on my integrated Skylake GT2 [HD Graphics 520] GPU, while with Vulkan I just run out of memory in most cases.

It runs, but it isn't worth using. Here's how it looks for a 1.1B TinyLlama on my Intel i7-10510U iGPU (tested using the koboldcpp benchmark mode, because it still supports OpenCL):

| Backend | pp t/s (prompt processing) | tg t/s (text generation) |
| --- | --- | --- |
| No BLAS | 47 | 21 |
| OpenBLAS | 61 | 21 |
| CLBlast 0 layers | 36 | 21 |
| CLBlast 100 layers | 36 | 8 |
| Vulkan 0 layers | 23 | 24 |
| Vulkan 100 layers | crash | crash |

In order to reintroduce the OpenCL backend to ggml, [...]

Do we need ggml? Wasn't this superseded by gguf already, even with the OpenCL backend?

ggml the library, not the file format. llama.cpp is a llama implementation based on the ggml library.

@syxrz789

If someone could maintain OpenCL independently of the CPU backend, like CUDA, that would be great. The current OpenCL backend has too much CPU interaction through OpenCL's write-buffer calls, resulting in poor performance.
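
For illustration, the staging pattern being criticized looks roughly like this: every offloaded op pays blocking host-to-device and device-to-host copies. All names below are placeholders, not code from the old backend.

```cpp
#include <CL/cl.h>

// hypothetical sketch of a per-op matmul with host staging
static void mul_mat_with_staging(cl_command_queue queue, cl_kernel kernel,
                                 cl_mem d_src0, cl_mem d_src1, cl_mem d_dst,
                                 const float * h_src0, size_t size0,
                                 const float * h_src1, size_t size1,
                                 float * h_dst, size_t size_dst,
                                 const size_t * global, const size_t * local) {
    // the CPU blocks while both operands are copied into device buffers
    clEnqueueWriteBuffer(queue, d_src0, CL_TRUE, 0, size0, h_src0, 0, NULL, NULL);
    clEnqueueWriteBuffer(queue, d_src1, CL_TRUE, 0, size1, h_src1, 0, NULL, NULL);
    clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global, local, 0, NULL, NULL);
    // ...and blocks again while the result is copied back out
    clEnqueueReadBuffer(queue, d_dst, CL_TRUE, 0, size_dst, h_dst, 0, NULL, NULL);
}
```

A backend that keeps all tensors resident in cl_mem buffers, as slaren describes above, avoids these transfers entirely.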

@mtasic85
Contributor

https://github.com/ggerganov/llama.cpp/blob/master/docs/backend/SYCL.md#llamacpp--sycl

The llama.cpp SYCL backend is designed to support Intel GPU firstly. Based on the cross-platform feature of SYCL, it could support other vendor GPUs: Nvidia GPU (AMD GPU coming).

Can someone point to an AMD GPU PR/branch, or is this just a plan for the future?

@rhjdvsgsgks
Contributor

rhjdvsgsgks commented Oct 20, 2024

RustiCL has some plans (page 9) to support DPC++ and chipStar, which provide SYCL and ROCm on top of OpenCL. AdaptiveCpp (an open-source SYCL implementation) also has an OpenCL backend. Can anyone test these?
