Vulkan support #797
Conversation
Note: #795 has now been merged.
Now that both #795 and #799 have been merged, I have done some additional fixes/cleanups to the Vulkan build process. I've run the updated build action on my local fork and tested both llama and llava using Vulkan on Windows 10; both are working as they should. I've done the same test on WSL Ubuntu 22.04 using Vulkan through llvmpipe, which also worked fine for both automatic backend selection and inference. Should I remove the draft status for this PR?
Presumably this can't be merged until a binary update? If so, we should probably leave it as a draft until I start that process.
Yup, you're right. We do need the corrected
Based on #517, this PR adds support for the Vulkan backend. The codebase for the native library has changed quite a bit since that setup from February, so I've rewritten the Vulkan API detection; it now uses a regex to extract the Vulkan API version from `vulkaninfo --summary`. Tested on both Windows 10 and Ubuntu 22.04 (requires `vulkan-tools` to be installed: `sudo apt install vulkan-tools`).
Both backends can be configured in the `NativeLibraryConfig` just like before. I have tested the examples with both CUDA and Vulkan and it seems to work correctly.
Note: Keep in mind that with llama.cpp commit `1debe72737ea131cb52975da3d53ed3a835df3a6` (which the LLamaSharp June 2024 binary update is based off), Vulkan currently crashes on multi-GPU setups (like mine) when `SplitMode` is set to `GPUSplitMode.None`, which is the default. Setting this to `GPUSplitMode.Layer` works correctly. This has since been fixed upstream in this PR.

Submitting as draft for now, as it depends on #795 being merged first.
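For reference, a minimal sketch of the multi-GPU workaround described in the note above, assuming `ModelParams` exposes a `SplitMode` property of type `GPUSplitMode` (the model path below is hypothetical):

```csharp
using LLama;
using LLama.Common;
using LLama.Native;

// Workaround for the multi-GPU crash described above: use layer-wise
// splitting instead of the default GPUSplitMode.None.
var parameters = new ModelParams("path/to/model.gguf") // hypothetical model path
{
    GpuLayerCount = 32,               // offload layers to the GPU(s)
    SplitMode = GPUSplitMode.Layer    // avoids the crash on multi-GPU setups
};

using var weights = LLamaWeights.LoadFromFile(parameters);
```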