Add support for properly optimized Windows ARM64 builds with LLVM and MSVC #7191
OMG. Thank you for this. Do you think that the 8cx Gen 3 will benefit from these changes? Also, would supporting QNN on Windows be too complicated?
Force-pushed from 34715fb to 34b669c
@ggerganov
@max-krasnyansky Understood.
Interesting. I didn't know int8 matmul works on 8cx gen3. That's great!
Well. I built it with
I'm trying
@max-krasnyansky Ok, bad news. I should have stuck with llama2 while testing. I don't understand why (this is completely out of my league), but llama3 works even when compiling with … In other words, my lack of domain knowledge here led me to speak too soon. I couldn't imagine that different models would lead to different CPU instructions being used 😓
@hmartinez82
Here's my llama2
and here's my llama3
I'm going to download Q4_0 of llama3. But anyway.
Co-authored-by: Georgi Gerganov <[email protected]>
Force-pushed from 34b669c to 3978014
@ggerganov
Could you add some documentation about how to use the
Ah. I'm going to add a full section to the README on how to build natively for Windows ARM64. Here is how to build with LLVM/Clang using CMake Presets:
Here is how to build with MSVC:
This all works with MS Visual Studio 2022 Community Edition.
@max-krasnyansky Now who's going to be the good Samaritan and add support for the 8cx NPU 😅? It has MATMUL support, I think.
@slaren Please don't forget to hit that merge button :)
… MSVC (ggerganov#7191)
* logging: add proper checks for clang to avoid errors and warnings with VA_ARGS
* build: add CMake Presets and toolchain files for Windows ARM64
* matmul-int8: enable matmul-int8 with MSVC and fix Clang warnings
* ci: add support for optimized Windows ARM64 builds with MSVC and LLVM
* matmul-int8: fixed typos in q8_0_q8_0 matmuls
Co-authored-by: Georgi Gerganov <[email protected]>
* matmul-int8: remove unnecessary casts in q8_0_q8_0
---------
Co-authored-by: Georgi Gerganov <[email protected]>
The
Yes, same here. It forces you to select one of the presets, right?
Yes, it does. Every time. Once I name it, it overwrites the file. The branch is affected afterwards as a result.
Odd. I don't use Visual Studio Code, but it seems to me like a settings issue.
"name": "arm64-windows-msvc", "hidden": true,
"architecture": { "value": "arm64", "strategy": "external" },
"toolset": { "value": "host=x86_64", "strategy": "external" },
"cacheVariables": {
    "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/arm64-windows-msvc.cmake"
}
@max-krasnyansky This is windows specific.
"name": "arm64-windows-llvm", "hidden": true,
"architecture": { "value": "arm64", "strategy": "external" },
"toolset": { "value": "host=x86_64", "strategy": "external" },
"cacheVariables": {
    "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/arm64-windows-llvm.cmake"
@max-krasnyansky This is windows specific.
{ "name": "arm64-windows-llvm-debug" , "inherits": [ "base", "arm64-windows-llvm", "debug" ] },
{ "name": "arm64-windows-llvm-release", "inherits": [ "base", "arm64-windows-llvm", "release" ] },
{ "name": "arm64-windows-llvm+static-release", "inherits": [ "base", "arm64-windows-llvm", "release", "static" ] },

{ "name": "arm64-windows-msvc-debug" , "inherits": [ "base", "arm64-windows-msvc", "debug" ] },
{ "name": "arm64-windows-msvc-release", "inherits": [ "base", "arm64-windows-msvc", "release" ] },
{ "name": "arm64-windows-msvc+static-release", "inherits": [ "base", "arm64-windows-msvc", "release", "static" ] }
@max-krasnyansky This is windows specific.
@max-krasnyansky These are usually auto-generated, but can be hand-crafted.
Please see the CMake documentation link I included above. And yes, the things you listed are Windows-specific; that's the whole point: we added a native Windows ARM64 build ;-)
I did read it. It doesn't change the fact that these settings are system-specific. This file should be ignored.
I am not sure that we need to make changes to accommodate what seems to be a buggy or misconfigured VS Code extension. FWIW, I use VS Code, but not the CMake extension, because I always found it more annoying than useful.
@slaren I have to concur with you. The CMake extension should not force us to use the presets just because they happen to be in the file system.
@slaren These are system-specific settings. They are settings geared towards ARM builds on Microsoft Windows. While the settings can be inclusive, it doesn't change the current state of the file. I respect your opinion and input. I have nothing left to say or add to this discussion. I stand by what I've said.
Currently, Windows ARM64 builds are not properly optimized, which results in low token
rates on Windows ARM64 platforms such as the upcoming Snapdragon X-Elite & Plus.
This update adds / resolves the following things:
and improves MatMul-INT8 NEON intrinsics usage in general
We're using LLVM 16.x included in MS Visual Studio 2022
All Windows cmake build targets now explicitly say x64 or arm64
Here are some before/after token rates from a Snapdragon X-Elite-based laptop.
Here is how to build with LLVM/Clang using CMake Presets:
Here is how to build with MSVC:
This all works with MS Visual Studio 2022 Community Edition.
One just needs to enable all native ARM64-related features and install the LLVM/Clang add-on.
Hosted GitHub CI runners already include all of that.
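The LLVM/Clang and MSVC builds mentioned above can be sketched roughly as follows. The preset names are taken from the presets quoted in this thread. The build directory names are an assumption: they depend on the `binaryDir` setting of the hidden `base` preset, which is not shown here, so adjust them to match your checkout.

```shell
# Run from the repository root, in a shell where the Visual Studio 2022
# ARM64 tools (and, for the first variant, the LLVM/Clang add-on) are on PATH.

# LLVM/Clang: configure with the release preset, then build.
# ASSUMPTION: the base preset places the build tree in build-<preset name>.
cmake --preset arm64-windows-llvm-release
cmake --build build-arm64-windows-llvm-release

# MSVC: same flow with the MSVC release preset.
# ASSUMPTION: same build directory naming as above.
cmake --preset arm64-windows-msvc-release
cmake --build build-arm64-windows-msvc-release
```

If the presets select a multi-config generator, the build step may also need `--config Release`. Whether that applies depends on the generator set in the `base` preset.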