Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial support for WebGPU #6492

Merged
merged 143 commits into from
Mar 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
143 commits
Select commit Hold shift + click to select a range
15ad2be
[WebGPU] Add runtime stubs and codegen skeleton
jrprice Nov 13, 2021
3306c83
[WebGPU] Implement lazy device initialization
jrprice Nov 18, 2021
4b9b29a
[WebGPU] Implement device release
jrprice Nov 24, 2021
5bba590
[WebGPU] Add scoped error handling mechanism
jrprice Nov 24, 2021
30affd9
[WebGPU] Implement device malloc/free
jrprice Nov 24, 2021
1f45790
[WebGPU] Implement basic host<->device copies
jrprice Nov 25, 2021
e0a5287
[WebGPU] Implement halide_webgpu_device_sync
jrprice Nov 25, 2021
7a84964
[WebGPU] Implement shader compilation
jrprice Nov 25, 2021
553f648
[WebGPU] Implement core of WGSL codegen
jrprice Nov 26, 2021
e2d4e34
[WebGPU] Implement Cast node
jrprice Nov 26, 2021
9d49911
[WebGPU] Implement the float_from_bits() intrinsic
jrprice Nov 27, 2021
b0e1ca2
[WebGPU] Implement run function
jrprice Nov 28, 2021
89e1317
[WebGPU] Move queue into WgpuContext class
jrprice Nov 30, 2021
828ebe8
[WebGPU] Add support for JIT
jrprice Dec 10, 2021
6aec6f8
[WebGPU] Enable the gpu_only AOT generator test
jrprice Dec 10, 2021
42439df
[WebGPU] Add support for targeting dawn-native
jrprice Dec 10, 2021
d98e101
[WebGPU] Add support for AOT tests when using WASM
jrprice Dec 10, 2021
4e1c1f4
[WebGPU] Print explicit types for let declarations
jrprice Dec 11, 2021
622e13b
[WebGPU] Address first round of review comments
jrprice Dec 11, 2021
8cdbf50
[WebGPU] Add copy of webgpu.h
jrprice Dec 13, 2021
04ecd93
[WebGPU] Add comment about ASYNCIFY requirement
jrprice Dec 13, 2021
56b5326
[WebGPU] Remove -Wno-atomic-alignment
jrprice Dec 13, 2021
f36599d
pacify clang-tidy
steven-johnson Dec 15, 2021
8f76c70
Merge branch 'master' into pr/6492
steven-johnson Dec 15, 2021
399247f
Fix more clang-tidy errors
jrprice Dec 15, 2021
38d12dd
Only use ASYNCIFY for tests when targeting WebGPU
jrprice Dec 15, 2021
00982bc
Fix even more clang-tidy errors
jrprice Dec 15, 2021
a63cfef
[WebGPU] Add basic support to Makefile
jrprice Dec 15, 2021
74d7e28
Merge branch 'master' into pr/6492
steven-johnson Dec 16, 2021
2b50a91
[WebGPU] Don't wrap buffers inside structures
jrprice Jan 19, 2022
f75a776
[WebGPU] Fix debug message tag
jrprice Jan 19, 2022
ee51a14
[WebGPU] Update WGPUErrorFilter enum in header
jrprice Jan 19, 2022
f1e7b35
[WebGPU] Update WGSL attribute syntax
jrprice Jan 19, 2022
03e809b
[WebGPU] Add README_webgpu.md
jrprice Jan 20, 2022
1e857db
[WebGPU] Move native WebGPU library CMake logic
jrprice Jan 20, 2022
1520882
Merge remote-tracking branch 'origin' into webgpu
jrprice Jan 21, 2022
1f5816e
Merge branch 'master' into pr/6492
steven-johnson Jan 27, 2022
524eb29
[WebGPU] Implement WGSL codegen for serial loops
jrprice Jan 27, 2022
3516aa3
[WebGPU] Implement WGSL codegen for Allocate
jrprice Jan 27, 2022
79bc028
[WebGPU] Implement WGSL codegen for Select
jrprice Jan 27, 2022
d4b676d
[WebGPU] Mark 64-bit types as unsupported
jrprice Jan 27, 2022
9227e27
[WebGPU] Implement device_and_host_{malloc,free}
jrprice Jan 27, 2022
638b5c2
[WebGPU] Fixed WGSL codegen for boolean vectors
jrprice Jan 27, 2022
eb5ecc1
[WebGPU] Implement f32 math intrinsics
jrprice Jan 27, 2022
86daf1c
[WebGPU] Implement inverse and inverse sqrt
jrprice Jan 27, 2022
d31a833
[WebGPU] Fixup some errors in WGSL codegen
jrprice Jan 27, 2022
7db2e96
[WebGPU] Implement logical and/or for bool vectors
jrprice Feb 3, 2022
df8e32e
[WebGPU] Implement WGSL codegen for Broadcast node
jrprice Feb 3, 2022
b955210
[WebGPU] Implement WGSL codegen for Ramp node
jrprice Feb 3, 2022
3a82238
[WebGPU] Emulate 8- and 16-bit integers
jrprice Feb 4, 2022
a9bd3eb
[WebGPU] Avoid buffer name collisions
jrprice Feb 4, 2022
f40d399
[WebGPU] Fix divide-by-power-of-two
jrprice Feb 4, 2022
95e1542
Merge remote-tracking branch 'origin/master' into webgpu
jrprice Feb 5, 2022
7261e09
[WebGPU] Implement codegen for gpu_thread_barrier
jrprice Feb 10, 2022
24f7ddb
[WebGPU] Implement WGSL codegen for Evaluate
jrprice Feb 10, 2022
528353c
[WebGPU] Add support for shared memory
jrprice Feb 10, 2022
5681c41
[WebGPU] Fix 8/16-bit load/store emulation
jrprice Feb 11, 2022
968ca77
Use generic 64-bit support query in gpu_mixed_shared_mem_types
jrprice Feb 11, 2022
58167d0
Merge remote-tracking branch 'origin/master' into webgpu
jrprice Feb 11, 2022
2809333
[WebGPU] Fix object cleanup during teardown
jrprice Feb 14, 2022
950bc91
[WebGPU] Do not re-allocate device buffers
jrprice Feb 15, 2022
e3fba86
[WebGPU] Mark maximum vector width as 4 in tests
jrprice Feb 15, 2022
03230c1
[WebGPU] Add functions to object lifetime tracker
jrprice Feb 15, 2022
faecb5f
[WebGPU] Override FloatImm handling
jrprice Feb 16, 2022
522cc8a
[WebGPU] Scalarize predicated loads
jrprice Feb 16, 2022
611f9ee
[WebGPU] Implement if_then_else intrinsic
jrprice Feb 16, 2022
0952169
[WebGPU] Enable gpu_free_sync test
jrprice Feb 16, 2022
1744cd4
[WebGPU] Implement print_reinterpret
jrprice Mar 1, 2022
a1a5f8b
[WebGPU] Implement print_extern_call
jrprice Mar 1, 2022
6603f0d
Merge remote-tracking branch 'origin/main' into webgpu
jrprice Mar 1, 2022
8fdd264
Add missing include and namespace for isnan/isinf
jrprice Mar 1, 2022
6ab437d
[WebGPU] Avoid short-circuiting operators
jrprice May 3, 2022
de4239d
[WebGPU] Use commas for struct member seperators
jrprice May 3, 2022
0fed8e2
[WebGPU] Update API headers and usages
jrprice May 13, 2022
b7dfef2
Merge remote-tracking branch 'origin/main' into webgpu
jrprice May 13, 2022
f45cad2
[WebGPU] Use CodeGen_GPU_C base class for codegen
jrprice May 13, 2022
b35b3ff
[WebGPU] Add warning for emulated narrow integers
jrprice May 16, 2022
fbed200
[WebGPU] Update README with latest status
jrprice May 17, 2022
1b57ac2
[WebGPU] Add support for non-contiguous copies
jrprice May 20, 2022
f4b92bf
[WebGPU] Fix clang-tidy error
jrprice May 20, 2022
c720dcf
[WebGPU] Use atomicCmpXchg for 8/16-bit emulation
jrprice May 31, 2022
32ef7ad
[WebGPU] Support non-32-bit parameter types
jrprice Jun 2, 2022
f529e95
[WebGPU] Fix mixed types in buffers
jrprice Jun 7, 2022
6e84013
Merge remote-tracking branch 'origin/main' into webgpu
jrprice Jun 7, 2022
bf3f15e
[WebGPU] Show validation errors for failed maps
jrprice Jun 7, 2022
e02cb60
[WebGPU] Round up buffer offsets and sizes
jrprice Jun 7, 2022
0f4f2da
[WebGPU] Update implementation status in README
jrprice Jun 7, 2022
5c39365
[WebGPU] Replace @stage(compute) with @compute
jrprice Jun 7, 2022
87a552d
[WebGPU] Polyfill the pow_f32 intrinsic
jrprice Jun 8, 2022
e25975e
[WebGPU] Skip the gpu_allocation_cache test
jrprice Jun 8, 2022
f321a61
[WebGPU] Use builtins for inverse hyperbolics
jrprice Oct 26, 2022
efe9658
Merge remote-tracking branch 'origin/main' into webgpu
jrprice Oct 26, 2022
9e5ced1
[WebGPU] Map rint() to round()
jrprice Oct 26, 2022
27bc2da
[WebGPU] Set device lost callback
jrprice Oct 27, 2022
3ac7d4e
[WebGPU] Use i32 for bool parameters
jrprice Oct 28, 2022
7805295
[WebGPU] Raise limits for buffer size and workgroup storage
jrprice Oct 31, 2022
dd99307
[WebGPU] Update mini_webgpu.h
jrprice Oct 31, 2022
b2db5a6
[WebGPU] Add support for dynamic workgroups
jrprice Oct 31, 2022
4683e71
[WebGPU] Avoid using 'new' as an identifier
jrprice Nov 18, 2022
dd1537a
[WebGPU] Do not merge workgroup allocations
jrprice Nov 18, 2022
02b1b24
Merge remote-tracking branch 'origin/main' into webgpu
jrprice Nov 18, 2022
5778c9d
[WebGPU] Fix Emscripten support
jrprice Nov 21, 2022
4023898
[WebGPU] Use const for integer immediates
jrprice Nov 22, 2022
f457041
[WebGPU] Squelch clang-tidy error
jrprice Nov 22, 2022
c5fd471
[WebGPU] Note Dawn's dependency on go in README
jrprice Dec 20, 2022
adf4e90
[WebGPU] Add links to Emscripten vs Dawn issue
jrprice Dec 20, 2022
0df3b1d
[WebGPU] Show error finding WebGPU library fails
jrprice Dec 20, 2022
03f03d6
[WebGPU] Add link to issue about Windows support
jrprice Dec 20, 2022
b89f8c4
[WebGPU] Rename roundUpToMultipleOf4
jrprice Dec 20, 2022
8f889ff
[WebGPU] Add links to wrap_native issue
jrprice Dec 20, 2022
7c2b11e
[WebGPU] Use debug_assert for some runtime errors
jrprice Dec 20, 2022
45c5168
[WebGPU] Stop using designated initializers
jrprice Dec 20, 2022
ffe6966
Merge remote-tracking branch 'origin/main' into webgpu
jrprice Dec 20, 2022
ff93f51
[WebGPU] Update mini_webgpu.h
jrprice Dec 20, 2022
e878c97
[WebGPU] Fix a validation issue with ToT Dawn
jrprice Dec 22, 2022
aeff752
Merge branch 'main' into pr/6492
steven-johnson Feb 22, 2023
47b6af6
Update README_webgpu.md
steven-johnson Feb 23, 2023
eec01ff
Add is_finite_f32, is_inf_f32, is_nan_f32
steven-johnson Feb 23, 2023
f43b28e
Update isinf, isnan, isfinite; add inf_f32, neg_inf_f32
steven-johnson Feb 23, 2023
52b2010
correctness_isnan should be skipped for WebGPU
steven-johnson Feb 23, 2023
2b83202
Update isnan.cpp
steven-johnson Feb 23, 2023
aa056ce
Update atomics_gpu_8_bit.cpp
steven-johnson Feb 23, 2023
7e98b6f
Merge branch 'main' into pr/6492
steven-johnson Feb 23, 2023
7bb4f30
Fix python_tutorial_lesson_10_aot_compilation_run
steven-johnson Feb 23, 2023
4a6ec34
Merge branch 'main' into pr/6492
steven-johnson Feb 23, 2023
e5fdbe3
Partial fix for generator_aot_acquire_release()
steven-johnson Feb 24, 2023
c20aa66
Merge remote-tracking branch 'origin/main' into webgpu
jrprice Feb 27, 2023
7c415ef
[WebGPU] Fix AOT test build for non-WASM
jrprice Feb 27, 2023
39bf455
[WebGPU] Move README to root
jrprice Feb 28, 2023
0dfc729
[WebGPU] Address review comments
jrprice Feb 28, 2023
c1f1bbb
Merge branch 'main' into pr/6492
steven-johnson Mar 1, 2023
aea8397
wip
steven-johnson Mar 1, 2023
c06fd64
Update CMakeLists.txt
steven-johnson Mar 1, 2023
26ef6d7
Work-in-progress for generator_aot_gpu_multi_context_threaded
steven-johnson Feb 24, 2023
2e88fe2
[WebGPU] Use a per-context staging buffer
jrprice Mar 1, 2023
951efb1
[WebGPU] Fix clang-format issue
jrprice Mar 1, 2023
0673e67
Merge branch 'main' into pr/6492
steven-johnson Mar 2, 2023
742db3f
[WebGPU] Move staging buffer creation
jrprice Mar 2, 2023
9d79ac6
Merge branch 'webgpu' of https://github.com/jrprice/Halide into pr/6492
steven-johnson Mar 2, 2023
2193614
Revert "Merge branch 'webgpu' of https://github.com/jrprice/Halide in…
steven-johnson Mar 2, 2023
7535579
Skip correctness_multi_way_select on x86 Macs (for now)
steven-johnson Mar 2, 2023
ec5c14e
clang-format
steven-johnson Mar 2, 2023
236fff7
[WebGPU] Recommit 742db3feb888394ff6529c357da3f32ae9286ea4
jrprice Mar 2, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .clang-format-ignore
Original file line number Diff line number Diff line change
Expand Up @@ -12,4 +12,6 @@
./tutorial
# hexagon_remote/bin/src is also special
./src/runtime/hexagon_remote/bin/src
# mini_webgpu.h is copied from upstream with some local mods
./src/runtime/mini_webgpu.h
./dependencies/spirv
8 changes: 8 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,7 @@ WITH_OPENCL ?= not-empty
WITH_METAL ?= not-empty
WITH_OPENGLCOMPUTE ?= not-empty
WITH_D3D12 ?= not-empty
WITH_WEBGPU ?= not-empty
WITH_INTROSPECTION ?= not-empty
WITH_EXCEPTIONS ?=
WITH_LLVM_INSIDE_SHARED_LIBHALIDE ?= not-empty
Expand Down Expand Up @@ -161,6 +162,8 @@ OPENGLCOMPUTE_CXX_FLAGS=$(if $(WITH_OPENGLCOMPUTE), -DWITH_OPENGLCOMPUTE, )
D3D12_CXX_FLAGS=$(if $(WITH_D3D12), -DWITH_D3D12, )
D3D12_LLVM_CONFIG_LIB=$(if $(WITH_D3D12), , )

WEBGPU_CXX_FLAGS=$(if $(WITH_WEBGPU), -DWITH_WEBGPU, )

AARCH64_CXX_FLAGS=$(if $(WITH_AARCH64), -DWITH_AARCH64, )
AARCH64_LLVM_CONFIG_LIB=$(if $(WITH_AARCH64), aarch64, )

Expand Down Expand Up @@ -206,6 +209,7 @@ CXX_FLAGS += $(OPENCL_CXX_FLAGS)
CXX_FLAGS += $(METAL_CXX_FLAGS)
CXX_FLAGS += $(OPENGLCOMPUTE_CXX_FLAGS)
CXX_FLAGS += $(D3D12_CXX_FLAGS)
CXX_FLAGS += $(WEBGPU_CXX_FLAGS)
CXX_FLAGS += $(POWERPC_CXX_FLAGS)
CXX_FLAGS += $(INTROSPECTION_CXX_FLAGS)
CXX_FLAGS += $(EXCEPTIONS_CXX_FLAGS)
Expand Down Expand Up @@ -436,6 +440,7 @@ SOURCE_FILES = \
CodeGen_PyTorch.cpp \
CodeGen_RISCV.cpp \
CodeGen_WebAssembly.cpp \
CodeGen_WebGPU_Dev.cpp \
CodeGen_X86.cpp \
CompilerLogger.cpp \
CPlusPlusMangle.cpp \
Expand Down Expand Up @@ -613,6 +618,7 @@ HEADER_FILES = \
CodeGen_PTX_Dev.h \
CodeGen_PyTorch.h \
CodeGen_Targets.h \
CodeGen_WebGPU_Dev.h \
CompilerLogger.h \
ConciseCasts.h \
CPlusPlusMangle.h \
Expand Down Expand Up @@ -824,6 +830,7 @@ RUNTIME_CPP_COMPONENTS = \
trace_helper \
tracing \
wasm_cpu_features \
webgpu \
windows_clock \
windows_cuda \
windows_d3d12compute_arm \
Expand Down Expand Up @@ -864,6 +871,7 @@ RUNTIME_EXPORTED_INCLUDES = $(INCLUDE_DIR)/HalideRuntime.h \
$(INCLUDE_DIR)/HalideRuntimeOpenGLCompute.h \
$(INCLUDE_DIR)/HalideRuntimeMetal.h \
$(INCLUDE_DIR)/HalideRuntimeQurt.h \
$(INCLUDE_DIR)/HalideRuntimeWebGPU.h \
$(INCLUDE_DIR)/HalideBuffer.h \
$(INCLUDE_DIR)/HalidePyTorchHelpers.h \
$(INCLUDE_DIR)/HalidePyTorchCudaHelpers.h
Expand Down
102 changes: 102 additions & 0 deletions README_webgpu.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,102 @@
# WebGPU support for Halide

Halide has work-in-progress support for generating and running WebGPU shaders.
This can be used in conjunction with the WebAssembly backend to bring
GPU-accelerated Halide pipelines to the web.

As the first version of the WebGPU standard is itself still being developed,
Halide's support has some limitations and may only work with certain browsers
and versions of Emscripten.

## Known limitations

The following is a non-comprehensive list of known limitations:

- Only 32-bit integers and floats have efficient support.
* 8-bit and 16-bit integers are implemented using emulation. Future
extensions to WGSL will allow them to be implemented more efficiently.
* 64-bit integers and floats will likely remain unsupported until WGSL gains
extensions to support them.
- Wrapping native device buffer handles is not yet implemented.

In addition to these functional limitations, the performance of the WebGPU
backend has not yet been evaluated, and so optimizations in the runtime or
device codegen may be required before it becomes profitable to use.

## Running with WebAssembly via Emscripten: `HL_TARGET=wasm-32-wasmrt-webgpu`

> _Tested with top-of-tree Emscripten as of 2023-02-23, against Chrome v113._

Halide can generate WebGPU code that can be integrated with WASM code using
Emscripten.
Halide must currently be built *without* the `WEBGPU_NATIVE_LIB` flag when
targeting Emscripten.

When invoking `emcc` to link Halide-generated objects, include these flags:
`-s USE_WEBGPU=1 -s ASYNCIFY`.

Tests that use AOT compilation can be run using a native WebGPU implementation
that has Node.js bindings, such as [Dawn](dawn.googlesource.com/dawn/).
When configuring Halide, use `-DWEBGPU_NODE_BINDINGS=/path/to/dawn.node` to
enable these tests.
See [below](#setting-up-dawn) for instructions on building the Dawn Node.js
bindings.

JIT compilation is not supported when using WebGPU with WASM.

## Running natively: `HL_TARGET=host-webgpu`

> _Tested with top-of-tree Dawn as of 2023-02-28._

For testing purposes, Halide can also target native WebGPU libraries, such as
[Dawn](dawn.googlesource.com/dawn/) or [wgpu](github.com/gfx-rs/wgpu).
This is currently the only path that can run the JIT correctness tests.
See [below](#setting-up-dawn) for instructions on building Dawn.

Due to differences between the APIs implemented by native WebGPU libraries and
Emscripten, this currently requires a separate build of Halide.
Pass `-DWEBGPU_NATIVE_LIB=/path/to/native/library.{so,dylib.dll}` to CMake when
configuring Halide to enable this path, which will automatically use this
library for the AOT and JIT tests.

## Setting up Dawn

Building Dawn's Node.js bindings currently requires using CMake.

First, [install `depot_tools`](commondatastorage.googleapis.com/chrome-infra-docs/flat/depot_tools/docs/html/depot_tools_tutorial.html#_setting_up) and add it to the
`PATH` environment variable.

Next, get Dawn and its dependencies:

# Clone the repo
git clone https://dawn.googlesource.com/dawn
cd dawn

# Bootstrap the gclient configuration with Node.js bindings enabled
cp scripts/standalone-with-node.gclient .gclient

# Fetch external dependencies and toolchains with gclient
gclient sync

# Other dependencies that must be installed manually:
# - golang

Finally, build Dawn, enabling both the Node.js bindings and shared libraries:

mkdir -p <build_dir>
cd <build_dir>

cmake <dawn_root_dir> -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DDAWN_BUILD_NODE_BINDINGS=1 \
-DDAWN_ENABLE_PIC=1 \
-DBUILD_SHARED_LIBS=ON

ninja dawn.node webgpu_dawn

This will produce the following artifacts:
- Node.js bindings: `<build_dir>/dawn.node`
- Native library: `<build_dir>/src/dawn/native/libwebgpu_dawn.{so,dylib,dll}`

These paths can then be used for the `-DWEBGPU_NODE_BINDINGS` and
`-DWEBGPU_NATIVE_LIB` CMake options when configuring Halide.
6 changes: 6 additions & 0 deletions cmake/HalideGeneratorHelpers.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -716,6 +716,12 @@ function(_Halide_target_link_gpu_libs TARGET VISIBILITY)
find_library(METAL_LIBRARY Metal REQUIRED)
target_link_libraries(${TARGET} ${VISIBILITY} "${FOUNDATION_LIBRARY}" "${METAL_LIBRARY}")
endif ()

if ("${ARGN}" MATCHES "webgpu")
if (WEBGPU_NATIVE_LIB)
target_link_libraries(${TARGET} PRIVATE ${WEBGPU_NATIVE_LIB})
endif ()
endif ()
endfunction()

##
Expand Down
28 changes: 26 additions & 2 deletions dependencies/wasm/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ endif()
function(add_wasm_executable TARGET)
set(options)
set(oneValueArgs)
set(multiValueArgs SRCS DEPS INCLUDES ENABLE_IF)
set(multiValueArgs SRCS DEPS INCLUDES OPTIONS ENABLE_IF)
cmake_parse_arguments(args "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})

if (args_ENABLE_IF AND NOT (${args_ENABLE_IF}))
Expand Down Expand Up @@ -107,8 +107,23 @@ function(add_wasm_executable TARGET)
-s ALLOW_MEMORY_GROWTH=1
-s ENVIRONMENT=node
-s STACK_SIZE=98304
${args_OPTIONS}
)

if ("${Halide_TARGET}" MATCHES "webgpu")
set(EMCC_FLAGS
${EMCC_FLAGS}
-s USE_WEBGPU=1
-s ASYNCIFY
)
endif ()

# TODO: Remove this when Emscripten and Dawn implement the same APIs.
jrprice marked this conversation as resolved.
Show resolved Hide resolved
# See https://github.com/halide/Halide/issues/7248
if (WEBGPU_NATIVE_LIB)
message(FATAL_ERROR "Cannot generate WASM executables when targeting native WebGPU implementations.")
endif ()

set(SRCS)
foreach (S IN LISTS args_SRCS)
list(APPEND SRCS "${CMAKE_CURRENT_SOURCE_DIR}/${S}")
Expand Down Expand Up @@ -144,9 +159,18 @@ function(add_wasm_halide_test TARGET)
return()
endif ()

set(script "require('./${TARGET}.js')")
if (WEBGPU_NODE_BINDINGS)
set(script "\
const provider = require('${WEBGPU_NODE_BINDINGS}')\n\
const gpu = provider.create([])\n\
const navigator = { gpu: gpu }\n
${script}")
endif ()

add_halide_test("${TARGET}"
GROUPS ${args_GROUPS}
COMMAND ${NODE_JS_EXECUTABLE} "${TARGET}.js")
COMMAND ${NODE_JS_EXECUTABLE} "-e" "${script}")
endfunction()

function(find_node_js)
Expand Down
5 changes: 5 additions & 0 deletions python_bindings/test/correctness/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
if (TARGET_WEBASSEMBLY AND Halide_TARGET MATCHES "webgpu")
message(WARNING "Python correctness tests are not supported with WebGPU.")
return()
endif ()

add_library(the_sort_function MODULE the_sort_function.c)
target_link_libraries(the_sort_function PRIVATE Halide::Runtime)

Expand Down
10 changes: 10 additions & 0 deletions python_bindings/tutorial/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
if (TARGET_WEBASSEMBLY AND Halide_TARGET MATCHES "webgpu")
message(WARNING "Python tutorials are not supported with WebGPU.")
return()
endif ()

set(tests
lesson_01_basics.py
lesson_02_input_image.py
Expand Down Expand Up @@ -58,6 +63,11 @@ else ()

target_link_libraries(lesson_10_halide PRIVATE Halide::Runtime)

# Undocumented function in HalideGeneratorHelpers. Do not call in external code.
# Users of the AOT functions (as opposed to Generators) should link to the relevant
# GPU libraries manually.
_Halide_target_link_gpu_libs(lesson_10_halide PRIVATE ${Halide_TARGET})

# The fixture "py_lesson_10" orchestrates running the generator part of the lesson first, then the build for the
# library, and finally runs python_tutorial_lesson_10_aot_compilation_run. The ..._compile test invokes CMake on
# the current build for the above library.
Expand Down
14 changes: 14 additions & 0 deletions src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ set(HEADER_FILES
CodeGen_PTX_Dev.h
CodeGen_PyTorch.h
CodeGen_Targets.h
CodeGen_WebGPU_Dev.h
CompilerLogger.h
ConciseCasts.h
CPlusPlusMangle.h
Expand Down Expand Up @@ -206,6 +207,7 @@ set(SOURCE_FILES
CodeGen_PyTorch.cpp
CodeGen_RISCV.cpp
CodeGen_WebAssembly.cpp
CodeGen_WebGPU_Dev.cpp
CodeGen_X86.cpp
CompilerLogger.cpp
CPlusPlusMangle.cpp
Expand Down Expand Up @@ -527,6 +529,18 @@ if (TARGET_OPENGLCOMPUTE)
target_compile_definitions(Halide PRIVATE WITH_OPENGLCOMPUTE)
endif ()

option(TARGET_WEBGPU "Include WebGPU target" ON)
if (TARGET_WEBGPU)
target_compile_definitions(Halide PRIVATE WITH_WEBGPU)
set(WEBGPU_NATIVE_LIB "" CACHE STRING
"WebGPU native library to link against")
if (WEBGPU_NATIVE_LIB)
set_property(SOURCE JITModule.cpp PROPERTY COMPILE_DEFINITIONS
WEBGPU_NATIVE_LIB=\"${WEBGPU_NATIVE_LIB}\"
APPEND_STRING)
endif ()
endif()

if (TARGET_SPIRV)
# Our vendored SPIRV headers are only used internally; users do not need
# them installed.
Expand Down
4 changes: 4 additions & 0 deletions src/CodeGen_C.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ extern "C" unsigned char halide_internal_runtime_header_HalideRuntimeOpenCL_h[];
extern "C" unsigned char halide_internal_runtime_header_HalideRuntimeOpenGLCompute_h[];
extern "C" unsigned char halide_internal_runtime_header_HalideRuntimeQurt_h[];
extern "C" unsigned char halide_internal_runtime_header_HalideRuntimeD3D12Compute_h[];
extern "C" unsigned char halide_internal_runtime_header_HalideRuntimeWebGPU_h[];

namespace {

Expand Down Expand Up @@ -503,6 +504,9 @@ CodeGen_C::~CodeGen_C() {
if (target.has_feature(Target::D3D12Compute)) {
stream << halide_internal_runtime_header_HalideRuntimeD3D12Compute_h << "\n";
}
if (target.has_feature(Target::WebGPU)) {
stream << halide_internal_runtime_header_HalideRuntimeWebGPU_h << "\n";
}
}
stream << "#endif\n";
}
Expand Down
4 changes: 4 additions & 0 deletions src/CodeGen_GPU_Dev.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,8 @@ void CodeGen_GPU_C::visit(const Shuffle *op) {
std::string storage_name = unique_name('_');
if (vector_declaration_style == VectorDeclarationStyle::OpenCLSyntax) {
rhs << "(" << print_type(op->type) << ")(";
} else if (vector_declaration_style == VectorDeclarationStyle::WGSLSyntax) {
rhs << print_type(op->type) << "(";
} else {
rhs << "{";
}
Expand All @@ -192,6 +194,8 @@ void CodeGen_GPU_C::visit(const Shuffle *op) {
}
if (vector_declaration_style == VectorDeclarationStyle::OpenCLSyntax) {
rhs << ")";
} else if (vector_declaration_style == VectorDeclarationStyle::WGSLSyntax) {
rhs << ")";
} else {
rhs << "}";
}
Expand Down
5 changes: 3 additions & 2 deletions src/CodeGen_GPU_Dev.h
Original file line number Diff line number Diff line change
Expand Up @@ -87,11 +87,12 @@ struct CodeGen_GPU_Dev {
* GPU backends derive from and specialize this class. */
class CodeGen_GPU_C : public CodeGen_C {
public:
/** OpenCL uses a different syntax than C for immediate vectors. This
/** OpenCL and WGSL use different syntax than C for immediate vectors. This
enum defines which style should be used by the backend. */
enum class VectorDeclarationStyle {
CLikeSyntax = 0,
OpenCLSyntax = 1
OpenCLSyntax = 1,
WGSLSyntax = 2,
};

CodeGen_GPU_C(std::ostream &s, Target t)
Expand Down
2 changes: 2 additions & 0 deletions src/CodeGen_Internal.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@ bool function_takes_user_context(const std::string &name) {
"halide_openglcompute_run",
"halide_metal_run",
"halide_d3d12compute_run",
"halide_webgpu_run",
"halide_msan_annotate_buffer_is_initialized_as_destructor",
"halide_msan_annotate_buffer_is_initialized",
"halide_msan_annotate_memory_is_initialized",
Expand All @@ -91,6 +92,7 @@ bool function_takes_user_context(const std::string &name) {
"halide_openglcompute_initialize_kernels",
"halide_metal_initialize_kernels",
"halide_d3d12compute_initialize_kernels",
"halide_webgpu_initialize_kernels",
"halide_get_gpu_device",
"_halide_buffer_crop",
"_halide_buffer_retire_crop_after_extern_stage",
Expand Down
Loading