Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows native port #2478

Merged
merged 56 commits into from
Nov 19, 2024
Merged

Windows native port #2478

merged 56 commits into from
Nov 19, 2024

Conversation

gshimansky
Copy link
Contributor

@gshimansky gshimansky commented Oct 11, 2024

Fixes #2407. Current state is that code builds on windows and is able to pass many unit tests. More fixes will be done when this patch is integrated into main branch.

@whitneywhtsang
Copy link
Contributor

Please fix the CI failures. And can we add testing to prevent regression?

@gshimansky
Copy link
Contributor Author

Please fix the CI failures. And can we add testing to prevent regression?

Sure, we will add windows CI too but at the moment we don't have almost any windows infrastructure, and we need to figure out how to do automation. All of the bash scripts are not going to run on windows.

@etiotto etiotto requested review from a team and alexbaden October 15, 2024 14:57
@gshimansky gshimansky force-pushed the gregory/windows-support branch 2 times, most recently from d1baf6d to 45d26b3 Compare October 15, 2024 22:13
@gshimansky
Copy link
Contributor Author

CI is green on Linux so PR is ready for review now.

Copy link
Contributor

@victor-eds victor-eds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intended to be upstreamed?

CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
bin/RegisterTritonDialects.h Outdated Show resolved Hide resolved
python/test/unit/language/test_core.py Outdated Show resolved Hide resolved
python/test/unit/language/test_core.py Outdated Show resolved Hide resolved
python/test/unit/language/print_helper.py Outdated Show resolved Hide resolved
python/test/unit/language/test_reproducer.py Outdated Show resolved Hide resolved
python/test/unit/language/test_compile_errors.py Outdated Show resolved Hide resolved
python/triton/compiler/compiler.py Outdated Show resolved Hide resolved
python/triton/compiler/compiler.py Outdated Show resolved Hide resolved
@gshimansky gshimansky force-pushed the gregory/windows-support branch 3 times, most recently from 36d0abf to 705730c Compare October 18, 2024 14:47
python/setup.py Outdated Show resolved Hide resolved
wkpark added a commit to wkpark/triton that referenced this pull request Oct 19, 2024
 * based on Windows support PR triton-lang#2465 by @andreigh
   - triton-lang#2465
 * manually applied, rebased, fix lint errors
 * use sysconfig.get_config_var() to get the path of python*.lib
 * clang fix for windows
 * remove '-fPIC' for windows clang
 * fix download_and_copy() to support windows
 * add "exe" extension for windows
 * use "pyd" extension for windows to make importlib work
 * third_party/nvidia: fix for windows
 * win32 fix _path_to_binary()
 * add library_dir, include_dir for win32
 * backend/compiler lazy remove temp files to support windows
 * additional works done by @mantaionut (2024/05/31)
 * rework for latest triton and cleanup (2024/10/14)
 * extract minimal fixes to support win32+clang (2024/10/16)
 * get exe/so extension using sysconfig (suggested by @anmyachev)

see also:
 intel/intel-xpu-backend-for-triton#2478

Original-author-by: Andrei Gheorghe <[email protected]>
Signed-off-by: Won-Kyu Park <[email protected]>
wkpark added a commit to wkpark/triton that referenced this pull request Oct 19, 2024
 * based on Windows support PR triton-lang#2465 by @andreigh
   - triton-lang#2465
 * manually applied, rebased, fix lint errors
 * remove '/A' platform option to use ninja
 * use sysconfig.get_config_var() to get the path of python*.lib
 * clang fix for windows
 * remove '-fPIC' for windows clang
 * fix download_and_copy() to support windows
 * add "exe" extension for windows
 * use "pyd" extension for windows to make importlib work
 * third_party/nvidia: fix for windows
 * win32 fix _path_to_binary()
 * add library_dir, include_dir for win32
 * backend/compiler lazy remove temp files to support windows
 * additional works done by @mantaionut (2024/05/31)
 * rework for latest triton and cleanup (2024/10/14)
 * extract minimal fixes to support win32+clang (2024/10/16)
 * get exe/so extension using sysconfig (suggested by @anmyachev)

see also:
 intel/intel-xpu-backend-for-triton#2478

Original-author-by: Andrei Gheorghe <[email protected]>
Signed-off-by: Won-Kyu Park <[email protected]>
wkpark added a commit to wkpark/triton that referenced this pull request Oct 19, 2024
 * based on Windows support PR triton-lang#2465 by @andreigh
   - triton-lang#2465
 * manually applied, rebased, fix lint errors
 * remove '/A' platform option to use ninja
 * use sysconfig.get_config_var() to get the path of python*.lib
 * clang fix for windows
 * remove '-fPIC' for windows clang
 * fix download_and_copy() to support windows
 * add "exe" extension for windows
 * use "pyd" extension for windows to make importlib work
 * third_party/nvidia: fix for windows
 * win32 fix _path_to_binary()
 * add library_dir, include_dir for win32
 * backend/compiler lazy remove temp files to support windows
 * additional works done by @mantaionut (2024/05/31)
 * rework for latest triton and cleanup (2024/10/14)
 * extract minimal fixes to support win32+clang (2024/10/16)
 * get exe/so extension using sysconfig (suggested by @anmyachev)

see also:
 intel/intel-xpu-backend-for-triton#2478

Original-author-by: Andrei Gheorghe <[email protected]>
Signed-off-by: Won-Kyu Park <[email protected]>
wkpark added a commit to wkpark/triton that referenced this pull request Oct 19, 2024
 * based on Windows support PR triton-lang#2465 by @andreigh
   - triton-lang#2465
 * manually applied, rebased, fix lint errors
 * remove '/A' platform option to use ninja
 * use sysconfig.get_config_var() to get the path of python*.lib
 * clang fix for windows
 * remove '-fPIC' for windows clang
 * fix download_and_copy() to support windows
 * add "exe" extension for windows
 * use "pyd" extension for windows to make importlib work
 * third_party/nvidia: fix for windows
 * win32 fix _path_to_binary()
 * add library_dir, include_dir for win32
 * backend/compiler lazy remove temp files to support windows
 * additional works done by @mantaionut (2024/05/31)
 * rework for latest triton and cleanup (2024/10/14)
 * extract minimal fixes to support win32+clang (2024/10/16)
 * get exe/so extension using sysconfig (suggested by @anmyachev)

see also:
 intel/intel-xpu-backend-for-triton#2478

Original-author-by: Andrei Gheorghe <[email protected]>
Signed-off-by: Won-Kyu Park <[email protected]>
bin/CMakeLists.txt Outdated Show resolved Hide resolved
python/setup.py Outdated Show resolved Hide resolved
python/src/main.cc Outdated Show resolved Hide resolved
@gshimansky gshimansky force-pushed the gregory/windows-support branch from 85d9f42 to a8cd2fa Compare November 7, 2024 02:00
Comment on lines 181 to 182
} \
symbolName##_t funcHandle = \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we explicitly clear any existing error on Windows as well?

Suggested change
} \
symbolName##_t funcHandle = \
} \
/* Clear any existing error */ \
SetLastError(NO_ERROR); \
symbolName##_t funcHandle = \

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that is needed. GetLastError is used for information purposes only (think of it as of errno in POSIX), any next call of Win32 API should set its own error status, clear it if it is successful.

@@ -194,8 +194,18 @@ static PyObject *loadBinary(PyObject *self, PyObject *args) {
const size_t binary_size = PyBytes_Size(py_bytes);

uint8_t *binary_ptr = (uint8_t *)PyBytes_AsString(py_bytes);
const auto ctx =
sycl_device.get_platform().ext_oneapi_get_default_context();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this available on Windows using set SYCL_ENABLE_DEFAULT_CONTEXTS=1: https://github.com/intel/llvm/blob/7dd09d9effd6a56d2c0c7abca38b21baa4d77fc8/sycl/doc/EnvironmentVariables.md?plain=1#L22?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this and it does work. But I am not sure it is a good way because triton on Windows will work only with this variable set to 1. Even on Linux if someone has environment SYCL_ENABLE_DEFAULT_CONTEXTS=0 we're going to throw an uncaught exception, are you sure it is ok?

@gshimansky gshimansky requested a review from etiotto November 14, 2024 18:31
@etiotto etiotto dismissed their stale review November 15, 2024 15:11

Changes upstreamed

try {
ctx = platform.ext_oneapi_get_default_context();
} catch (const std::runtime_error &ex) {
// This exception is thrown on Windows because
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is iffy. If the exception is thrown ctx will be uninitialized when used at line 212. Is there another API to use on Windows to get the context ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly unintialized. A default constructor is called which is allowed by sycl spec. And this is exactly what happens on Windows because this exception is thrown. It seems to work just fine with ctx initialized like this.

anmyachev added a commit that referenced this pull request Nov 19, 2024
… Windows (#2745)

Part of #2478 

These are quite stable changes, we can merge it without CI on Windows.
@gshimansky if you don't mind. Once we test them on Nvidia UT I might
try to upstream them.

Signed-off-by: Anatoly Myachev <[email protected]>
Comment on lines +449 to +451
env = set_env_vars(vs_path)
if not vs_path:
raise EnvironmentError("Visual Studio 2019 or 2022 not found.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably a good idea to define initialize_visual_studio_env as in CLFinder.py file here as well.

Suggested change
env = set_env_vars(vs_path)
if not vs_path:
raise EnvironmentError("Visual Studio 2019 or 2022 not found.")
if not vs_path:
raise EnvironmentError("Visual Studio 2019 or 2022 not found.")
env = set_env_vars(vs_path)

Comment on lines +50 to +51
# Check if the environment variable that vcvarsall.bat sets is present
if os.environ.get('VSCMD_ARG_TGT_ARCH') != arch:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point vcvarsall.bat has not yet been called? This will only happen when calling set_env_vars function IIUC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is to verify that we're not already running in a Developer Command Prompt environment set outside of setup.py, and if we're running, we're running in environment for the same architecture (32 or 64 bits) that we're going to build here.

Comment on lines +61 to +62
set(CMAKE_C_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /Ob0 /Od /RTC1 /bigobj /Zc:preprocessor")
set(CMAKE_CXX_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /Ob0 /Od /RTC1 /bigobj /Zc:preprocessor")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gshimansky these flags are for debug build, aren't?

Suggested change
set(CMAKE_C_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /Ob0 /Od /RTC1 /bigobj /Zc:preprocessor")
set(CMAKE_CXX_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /Ob0 /Od /RTC1 /bigobj /Zc:preprocessor")
set(CMAKE_C_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /RTC1 /bigobj /Zc:preprocessor")
set(CMAKE_CXX_FLAGS_TRITONRELBUILDWITHASSERTS "/Zi /RTC1 /bigobj /Zc:preprocessor")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. At the moment there is just one windows build and it is debug build mostly. We can split it into optimized and debug build like we have on Linux.

anmyachev added a commit that referenced this pull request Nov 19, 2024
…indows (#2742)

Part of #2478 (to reduce diff)

These are quite stable changes, we can merge it without CI on Windows.
@gshimansky if you don't mind.

---------

Signed-off-by: Anatoly Myachev <[email protected]>
@anmyachev anmyachev merged commit 513aede into main Nov 19, 2024
4 checks passed
@anmyachev anmyachev deleted the gregory/windows-support branch November 19, 2024 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[PoC] Windows native Triton XPU build
6 participants