-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows native port #2478
Windows native port #2478
Conversation
Please fix the CI failures. And can we add testing to prevent regression? |
Sure, we will add windows CI too but at the moment we don't have almost any windows infrastructure, and we need to figure out how to do automation. All of the bash scripts are not going to run on windows. |
d1baf6d
to
45d26b3
Compare
CI is green on Linux so PR is ready for review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intended to be upstreamed?
include/triton/Conversion/TritonToTritonGPU/TritonToTritonGPUPass.h
Outdated
Show resolved
Hide resolved
36d0abf
to
705730c
Compare
* based on Windows support PR triton-lang#2465 by @andreigh - triton-lang#2465 * manually applied, rebased, fix lint errors * use sysconfig.get_config_var() to get the path of python*.lib * clang fix for windows * remove '-fPIC' for windows clang * fix download_and_copy() to support windows * add "exe" extension for windows * use "pyd" extension for windows to make importlib work * third_party/nvidia: fix for windows * win32 fix _path_to_binary() * add library_dir, include_dir for win32 * backend/compiler lazy remove temp files to support windows * additional works done by @mantaionut (2024/05/31) * rework for latest triton and cleanup (2024/10/14) * extract minimal fixes to support win32+clang (2024/10/16) * get exe/so extension using sysconfig (suggested by @anmyachev) see also: intel/intel-xpu-backend-for-triton#2478 Original-author-by: Andrei Gheorghe <[email protected]> Signed-off-by: Won-Kyu Park <[email protected]>
* based on Windows support PR triton-lang#2465 by @andreigh - triton-lang#2465 * manually applied, rebased, fix lint errors * remove '/A' platform option to use ninja * use sysconfig.get_config_var() to get the path of python*.lib * clang fix for windows * remove '-fPIC' for windows clang * fix download_and_copy() to support windows * add "exe" extension for windows * use "pyd" extension for windows to make importlib work * third_party/nvidia: fix for windows * win32 fix _path_to_binary() * add library_dir, include_dir for win32 * backend/compiler lazy remove temp files to support windows * additional works done by @mantaionut (2024/05/31) * rework for latest triton and cleanup (2024/10/14) * extract minimal fixes to support win32+clang (2024/10/16) * get exe/so extension using sysconfig (suggested by @anmyachev) see also: intel/intel-xpu-backend-for-triton#2478 Original-author-by: Andrei Gheorghe <[email protected]> Signed-off-by: Won-Kyu Park <[email protected]>
* based on Windows support PR triton-lang#2465 by @andreigh - triton-lang#2465 * manually applied, rebased, fix lint errors * remove '/A' platform option to use ninja * use sysconfig.get_config_var() to get the path of python*.lib * clang fix for windows * remove '-fPIC' for windows clang * fix download_and_copy() to support windows * add "exe" extension for windows * use "pyd" extension for windows to make importlib work * third_party/nvidia: fix for windows * win32 fix _path_to_binary() * add library_dir, include_dir for win32 * backend/compiler lazy remove temp files to support windows * additional works done by @mantaionut (2024/05/31) * rework for latest triton and cleanup (2024/10/14) * extract minimal fixes to support win32+clang (2024/10/16) * get exe/so extension using sysconfig (suggested by @anmyachev) see also: intel/intel-xpu-backend-for-triton#2478 Original-author-by: Andrei Gheorghe <[email protected]> Signed-off-by: Won-Kyu Park <[email protected]>
* based on Windows support PR triton-lang#2465 by @andreigh - triton-lang#2465 * manually applied, rebased, fix lint errors * remove '/A' platform option to use ninja * use sysconfig.get_config_var() to get the path of python*.lib * clang fix for windows * remove '-fPIC' for windows clang * fix download_and_copy() to support windows * add "exe" extension for windows * use "pyd" extension for windows to make importlib work * third_party/nvidia: fix for windows * win32 fix _path_to_binary() * add library_dir, include_dir for win32 * backend/compiler lazy remove temp files to support windows * additional works done by @mantaionut (2024/05/31) * rework for latest triton and cleanup (2024/10/14) * extract minimal fixes to support win32+clang (2024/10/16) * get exe/so extension using sysconfig (suggested by @anmyachev) see also: intel/intel-xpu-backend-for-triton#2478 Original-author-by: Andrei Gheorghe <[email protected]> Signed-off-by: Won-Kyu Park <[email protected]>
Signed-off-by: Gregory Shimansky <[email protected]>
Signed-off-by: Gregory Shimansky <[email protected]>
85d9f42
to
a8cd2fa
Compare
third_party/nvidia/backend/driver.c
Outdated
} \ | ||
symbolName##_t funcHandle = \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we explicitly clear any existing error on Windows as well?
} \ | |
symbolName##_t funcHandle = \ | |
} \ | |
/* Clear any existing error */ \ | |
SetLastError(NO_ERROR); \ | |
symbolName##_t funcHandle = \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is needed. GetLastError is used for information purposes only (think of it as of errno
in POSIX), any next call of Win32 API should set its own error status, clear it if it is successful.
Signed-off-by: Gregory Shimansky <[email protected]>
Signed-off-by: Gregory Shimansky <[email protected]>
Signed-off-by: Gregory Shimansky <[email protected]>
third_party/intel/backend/driver.c
Outdated
@@ -194,8 +194,18 @@ static PyObject *loadBinary(PyObject *self, PyObject *args) { | |||
const size_t binary_size = PyBytes_Size(py_bytes); | |||
|
|||
uint8_t *binary_ptr = (uint8_t *)PyBytes_AsString(py_bytes); | |||
const auto ctx = | |||
sycl_device.get_platform().ext_oneapi_get_default_context(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make this available on Windows using set SYCL_ENABLE_DEFAULT_CONTEXTS=1
: https://github.com/intel/llvm/blob/7dd09d9effd6a56d2c0c7abca38b21baa4d77fc8/sycl/doc/EnvironmentVariables.md?plain=1#L22?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked this and it does work. But I am not sure it is a good way because triton on Windows will work only with this variable set to 1. Even on Linux if someone has environment SYCL_ENABLE_DEFAULT_CONTEXTS=0
we're going to throw an uncaught exception, are you sure it is ok?
third_party/intel/backend/driver.c
Outdated
try { | ||
ctx = platform.ext_oneapi_get_default_context(); | ||
} catch (const std::runtime_error &ex) { | ||
// This exception is thrown on Windows because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is iffy. If the exception is thrown ctx
will be uninitialized when used at line 212. Is there another API to use on Windows to get the context ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly unintialized. A default constructor is called which is allowed by sycl spec. And this is exactly what happens on Windows because this exception is thrown. It seems to work just fine with ctx
initialized like this.
… Windows (#2745) Part of #2478 These are quite stable changes, we can merge it without CI on Windows. @gshimansky if you don't mind. Once we test them on Nvidia UT I might try to upstream them. Signed-off-by: Anatoly Myachev <[email protected]>
env = set_env_vars(vs_path) | ||
if not vs_path: | ||
raise EnvironmentError("Visual Studio 2019 or 2022 not found.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably a good idea to define initialize_visual_studio_env
as in CLFinder.py
file here as well.
env = set_env_vars(vs_path) | |
if not vs_path: | |
raise EnvironmentError("Visual Studio 2019 or 2022 not found.") | |
if not vs_path: | |
raise EnvironmentError("Visual Studio 2019 or 2022 not found.") | |
env = set_env_vars(vs_path) |
# Check if the environment variable that vcvarsall.bat sets is present | ||
if os.environ.get('VSCMD_ARG_TGT_ARCH') != arch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point vcvarsall.bat
has not yet been called? This will only happen when calling set_env_vars
function IIUC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is to verify that we're not already running in a Developer Command Prompt
environment set outside of setup.py, and if we're running, we're running in environment for the same architecture (32 or 64 bits) that we're going to build here.
set(CMAKE_C_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /Ob0 /Od /RTC1 /bigobj /Zc:preprocessor") | ||
set(CMAKE_CXX_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /Ob0 /Od /RTC1 /bigobj /Zc:preprocessor") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gshimansky these flags are for debug build, aren't?
set(CMAKE_C_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /Ob0 /Od /RTC1 /bigobj /Zc:preprocessor") | |
set(CMAKE_CXX_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /Ob0 /Od /RTC1 /bigobj /Zc:preprocessor") | |
set(CMAKE_C_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /RTC1 /bigobj /Zc:preprocessor") | |
set(CMAKE_CXX_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /RTC1 /bigobj /Zc:preprocessor") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. At the moment there is just one windows build and it is debug build mostly. We can split it into optimized and debug build like we have on Linux.
…indows (#2742) Part of #2478 (to reduce diff) These are quite stable changes, we can merge it without CI on Windows. @gshimansky if you don't mind. --------- Signed-off-by: Anatoly Myachev <[email protected]>
Signed-off-by: Gregory Shimansky <[email protected]>
Signed-off-by: Gregory Shimansky <[email protected]>
Fixes #2407. Current state is that code builds on windows and is able to pass many unit tests. More fixes will be done when this patch is integrated into main branch.