Vulkan support #797

Merged
merged 11 commits into from
Jul 13, 2024

Contributor

@m0nsky m0nsky commented Jun 19, 2024

Based on #517, this PR adds support for the Vulkan backend. The native library codebase had changed quite a bit since that setup from February, so I've rewritten the Vulkan API detection, which now uses a regex to extract the Vulkan API version from the output of vulkaninfo --summary. Tested on both Windows 10 and Ubuntu 22.04.
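
To illustrate the detection approach described above, here is a minimal sketch of running vulkaninfo --summary and pulling the API version out with a regex. It is not the code from this PR: the class name VulkanProbe, both method names, and the exact pattern are hypothetical, and the pattern assumes the summary contains a line shaped like "apiVersion = 1.3.277" (optionally with a packed integer in front of a parenthesised version).

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of LLamaSharp.
public static class VulkanProbe
{
    // Assumed output shapes:
    //   apiVersion = 1.3.277
    //   apiVersion = 4206797 (1.3.205)
    private static readonly Regex ApiVersionPattern =
        new Regex(@"apiVersion\s*=\s*(?:\d+\s*\()?(\d+\.\d+\.\d+)", RegexOptions.Compiled);

    // Returns the first API version found in the text, or null if none matches.
    public static Version? ParseApiVersion(string summary)
    {
        var match = ApiVersionPattern.Match(summary);
        return match.Success ? Version.Parse(match.Groups[1].Value) : null;
    }

    // Runs "vulkaninfo --summary" and parses its standard output.
    // Returns null when the tool is missing or prints no recognisable version.
    public static Version? DetectApiVersion()
    {
        try
        {
            var startInfo = new ProcessStartInfo("vulkaninfo", "--summary")
            {
                RedirectStandardOutput = true,
                UseShellExecute = false,
                CreateNoWindow = true,
            };
            using var process = Process.Start(startInfo);
            if (process is null) return null;

            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return ParseApiVersion(output);
        }
        catch (Exception)
        {
            // Typically means vulkaninfo is not installed or not on the PATH.
            return null;
        }
    }
}
```

Keeping the parsing separate from the process call makes the regex easy to check against captured output without a Vulkan-capable machine.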

  • On Windows, when Vulkan is available, it will be used by default (unless CUDA is available)
  • On Linux, it requires vulkan-tools to be installed (sudo apt install vulkan-tools)
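
The default ordering described in the list above (CUDA first, then Vulkan) can be sketched roughly as follows. This is only an illustration of the priority rule, not the selection code in the native library loader: the Backend enum, the BackendSelector class, and the CPU fallback are all assumptions made for the example.

```csharp
using System;

// Hypothetical types for illustration; not part of LLamaSharp.
public enum Backend { Cpu, Vulkan, Cuda }

public static class BackendSelector
{
    // "Allowed" mirrors the WithCuda/WithVulkan switches,
    // "available" mirrors what was detected on the machine.
    public static Backend Choose(
        bool cudaAllowed, bool cudaAvailable,
        bool vulkanAllowed, bool vulkanAvailable)
    {
        // CUDA takes priority whenever it is both enabled and detected.
        if (cudaAllowed && cudaAvailable) return Backend.Cuda;

        // Otherwise Vulkan is used when enabled and detected.
        if (vulkanAllowed && vulkanAvailable) return Backend.Vulkan;

        // Assumed fallback for the sketch: plain CPU.
        return Backend.Cpu;
    }
}
```

With this ordering, disabling CUDA in the configuration is what makes Vulkan the chosen backend on a machine where both are present.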

Both backends can be configured in the NativeLibraryConfig just like before:

// Disable CUDA and enable Vulkan for all native libraries.
NativeLibraryConfig
   .All
   .WithCuda(false)
   .WithVulkan(true);

I have tested the examples with both CUDA and Vulkan and it seems to work correctly.

[Image: vulkan_example_small]

Note:
Keep in mind that with llama.cpp commit 1debe72737ea131cb52975da3d53ed3a835df3a6 (which the LLamaSharp June 2024 binary update is based on), Vulkan currently crashes on multi-GPU setups (like mine) when SplitMode is set to GPUSplitMode.None, which is the default. Setting it to GPUSplitMode.Layer works correctly. This has since been fixed in this PR.
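
As a rough sketch of the workaround in the note above, the split mode can be set when building the model parameters. The model path and the layer count here are placeholders, and the snippet assumes the ModelParams type from LLama.Common exposes SplitMode as the note implies.

```csharp
using LLama.Common;
using LLama.Native;

// "model.gguf" and the layer count are placeholder values for illustration.
var parameters = new ModelParams("model.gguf")
{
    GpuLayerCount = 32,

    // Avoids the multi-GPU Vulkan crash seen with GPUSplitMode.None.
    SplitMode = GPUSplitMode.Layer,
};
```

On builds that include the upstream fix, this override should no longer be needed.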

Submitting as a draft for now, since it depends on #795 being merged first.

@martindevans
Member

Note: #795 has now been merged.

@m0nsky
Contributor Author

m0nsky commented Jun 19, 2024

Now that both #795 and #799 have been merged, I have done some additional fixes/cleanups to the Vulkan build process. Because the llama_cpp_commit wasn't being taken into account, the binaries were causing a crash on a fresh build.

I've run the updated build action on my local fork and tested both llama and llava using Vulkan on Windows 10; both are working as they should.

I've done the same test on WSL Ubuntu 22.04 using Vulkan through llvmpipe, which also worked fine for both automatic backend selection and inference.

Should I remove the draft status for this PR?

@martindevans
Member

> Should I remove the draft status for this PR?

Presumably this can't be merged until a binary update? If so, we should probably leave it as a draft until I start that process.

@m0nsky
Contributor Author

m0nsky commented Jun 20, 2024

Yup, you're right. We do need the corrected llama_cpp_commit for the Vulkan build before the binary update, so I will split that up into a separate PR.

@m0nsky m0nsky marked this pull request as ready for review July 11, 2024 20:27
@martindevans martindevans merged commit e907146 into SciSharp:master Jul 13, 2024
6 checks passed
@m0nsky m0nsky deleted the vulkan-backend branch July 18, 2024 15:02