[SYCL][Fusion] JIT compiler kernel fusion passes #7661

sommerlukas · 2022-12-06T16:48:00Z

This is the fourth patch in a series of patches to add an implementation of the kernel fusion extension. We have split the implementation into multiple patches to make them easier to review.

This patch adds the LLVM passes that perform the kernel fusion and related optimizations:

- A pass creating the function definition for the fused kernel from the input kernel definitions.
- A pass performing internalization of dataflow internal to the fused kernel into either private or local memory. The type of memory to use is currently specified by the user in the runtime.
- A pass propagating values for scalars and by-val aggregates from the SYCL runtime to the fused kernel as constants.
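
As a rough illustration of what fusion and internalization aim for, the sketch below models two kernels and a hand-fused equivalent in plain C++. It is not the code of the passes, which operate on LLVM IR; the kernel bodies and the names `runUnfused` and `runFused` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unfused: two "kernels" communicate through the intermediate buffer Tmp,
// which stands in for a buffer in global memory.
std::vector<float> runUnfused(const std::vector<float> &A,
                              const std::vector<float> &B,
                              const std::vector<float> &C) {
  const std::size_t N = A.size();
  assert(B.size() == N && C.size() == N && "inputs must have equal length");
  std::vector<float> Tmp(N), Out(N);
  for (std::size_t I = 0; I < N; ++I) // kernel 1: Tmp = A + B
    Tmp[I] = A[I] + B[I];
  for (std::size_t I = 0; I < N; ++I) // kernel 2: Out = Tmp * C
    Out[I] = Tmp[I] * C[I];
  return Out;
}

// Fused: each work-item executes both kernel bodies back to back. Because
// Tmp is only produced and consumed inside the fused kernel, it can be
// "internalized" into a per-work-item variable (modelling private memory).
std::vector<float> runFused(const std::vector<float> &A,
                            const std::vector<float> &B,
                            const std::vector<float> &C) {
  const std::size_t N = A.size();
  assert(B.size() == N && C.size() == N && "inputs must have equal length");
  std::vector<float> Out(N);
  for (std::size_t I = 0; I < N; ++I) {
    const float Tmp = A[I] + B[I]; // former kernel 1, result kept private
    Out[I] = Tmp * C[I];           // former kernel 2
  }
  return Out;
}
```

Both functions compute the same result; the fused version simply avoids the round trip through the intermediate buffer.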

The information is propagated from the SYCL runtime to the passes via LLVM metadata inserted by the JIT compiler frontend.

After and between the fusion passes, some standard LLVM optimization and transformation passes are executed to enable the subsequent fusion passes and to optimize the fused kernel.

Co-authored-by: Lukas Sommer [email protected]
Co-authored-by: Victor Perez [email protected]
Signed-off-by: Lukas Sommer [email protected]

sommerlukas · 2022-12-06T16:49:07Z

This patch can be reviewed and merged independently of #7531.

The tests are not yet run together with check-sycl; this will follow in a later patch/PR. They can be executed from the build folder by running check-sycl-fusion.

Naghasan

LGTM, just the nitpick

sycl-fusion/passes/kernel-fusion/SYCLKernelFusion.cpp

Naghasan · 2022-12-08T08:56:18Z

@intel/llvm-gatekeepers the PR is ready to be merged, AMD failure is unrelated to this patch AFAIK

steffenlarsen · 2022-12-08T08:58:21Z

HIP failures reported in #7634.

AlexeySachkov

Some post-commit review

AlexeySachkov · 2022-12-08T10:27:01Z

sycl-fusion/passes/cleanup/Cleanup.cpp

+  NF->copyAttributesFrom(F);
+  // Drop masked-out attributes.
+  SmallVector<AttributeSet> Attributes;
+  const llvm::AttributeList PAL = NF->getAttributes();


nit: the llvm:: namespace qualifier is used inconsistently: there is a using namespace llvm; directive, but llvm:: is still spelled out in some places. For example, on line 44 below, AttributeList is referenced without llvm::. This comment also applies to other files.

AlexeySachkov · 2022-12-08T10:30:25Z

sycl-fusion/passes/cleanup/Cleanup.cpp

+
+  {
+    // Copy metadata.
+    SmallVector<std::pair<unsigned, MDNode *>> MDs;


I'm a bit surprised that SmallVector doesn't require a second argument, am I missing something? What is the point of using SmallVector if we are not pre-allocating some storage on stack?

This is actually recommended if there is no well-motivated choice for N:

"In the absence of a well-motivated choice for the number of inlined elements N, it is recommended to use SmallVector (that is, omitting the N). This will choose a default number of inlined elements reasonable for allocation on the stack (for example, trying to keep sizeof(SmallVector) around 64 bytes)."

From https://llvm.org/doxygen/classllvm_1_1SmallVector.html#details

Didn't know there is a default value for N, thanks!

AlexeySachkov · 2022-12-08T10:30:41Z

sycl-fusion/passes/cleanup/Cleanup.cpp

+    // Copy metadata.
+    SmallVector<std::pair<unsigned, MDNode *>> MDs;
+    F->getAllMetadata(MDs);
+    for (auto MD : MDs) {


Unnecessary copy due to missing &?
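
To make the difference concrete, here is a small standard-library sketch that counts copies in a range-based for loop. The type `Counted` and both functions are hypothetical stand-ins rather than code from the patch. For a `std::pair<unsigned, MDNode *>` the copy itself is cheap, so iterating by `const auto &` is mostly a matter of convention there.

```cpp
#include <vector>

// Number of copy constructions observed; reset by the functions below.
static int NumCopies = 0;

// Hypothetical element type that records every copy construction.
struct Counted {
  Counted() = default;
  Counted(const Counted &) { ++NumCopies; }
  Counted &operator=(const Counted &) = default;
};

// `auto Item` copy-constructs each element from the container.
int copiesByValue(const std::vector<Counted> &Items) {
  NumCopies = 0;
  for (auto Item : Items)
    (void)Item;
  return NumCopies;
}

// `const auto &Item` binds a reference, so no element is copied.
int copiesByReference(const std::vector<Counted> &Items) {
  NumCopies = 0;
  for (const auto &Item : Items)
    (void)Item;
  return NumCopies;
}
```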

AlexeySachkov · 2022-12-08T10:31:10Z

sycl-fusion/passes/cleanup/Cleanup.cpp

+
+using namespace llvm;
+
+static FunctionType *createMaskedFunctionType(const BitVector &Mask,


Why not anonymous namespace?

My understanding so far was that there doesn't seem to be a big difference between static and anonymous namespaces, also based on this.

Do you have a specific reason to prefer one over the other?

No strong reason, just making it more C++-y

Using static in this case allows us to keep the static functions closer to their non-static users/callers without declaring multiple anonymous namespaces, so we stick with that for now.
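
For readers unfamiliar with the two options, the following sketch shows both side by side; the function names are made up for illustration. Either form gives the helper internal linkage. The LLVM Coding Standards recommend `static` for file-local functions and suggest keeping anonymous namespaces small and limited to class declarations.

```cpp
// Option 1: `static` gives the helper internal linkage and lets it sit
// directly next to its caller.
static int twice(int X) { return 2 * X; }

// Option 2: an anonymous namespace also gives internal linkage, but the
// helper has to live inside the namespace block.
namespace {
int thrice(int X) { return 3 * X; }
} // namespace

// Externally visible function that uses both file-local helpers.
int sixTimes(int X) { return twice(thrice(X)); }
```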

AlexeySachkov · 2022-12-08T10:34:49Z

sycl-fusion/passes/internalization/Internalization.cpp

+  auto *NewIndex = [&]() -> Value * {
+    if (LocalSize == 1) {
+      return Builder.getInt64(0);
+    } else {


LLVM Coding Standard: Don't use else after return
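
A minimal sketch of the suggested shape, using an early return instead of `else`. The snippet above is truncated, so the non-trivial branch here (`Index % LocalSize`) is a hypothetical placeholder and `remapIndex` is an invented name; only the control flow is the point.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the index computation in the lambda.
std::uint64_t remapIndex(std::uint64_t Index, std::uint64_t LocalSize) {
  assert(LocalSize > 0 && "local size must be non-zero");
  if (LocalSize == 1)
    return 0;
  // No `else` needed: the early return above already left the function.
  return Index % LocalSize;
}
```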

AlexeySachkov · 2022-12-08T10:41:26Z

sycl-fusion/passes/kernel-fusion/SYCLKernelFusion.cpp

+    auto *FK = dyn_cast<MDString>(MDOp.get());
+    assert(FK && "Kernel should be given by its name as MDString");


No need to use dyn_cast if you don't check the result. cast already contains an assert statement for you
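
The distinction can be sketched without LLVM headers. The types and the helpers `dynCastSketch` and `castSketch` below are simplified stand-ins written for this example, not LLVM's implementation: the first returns null on a type mismatch, while the second asserts that the type matches, which makes a separate `assert` at the call site redundant.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified stand-in for a class hierarchy with LLVM-style RTTI.
struct MetadataSketch {
  enum Kind { MK_String, MK_Tuple };
  explicit MetadataSketch(Kind K) : TheKind(K) {}
  Kind getKind() const { return TheKind; }

private:
  Kind TheKind;
};

struct MDStringSketch : MetadataSketch {
  explicit MDStringSketch(std::string S)
      : MetadataSketch(MK_String), Str(std::move(S)) {}
  static bool classof(const MetadataSketch *M) {
    return M->getKind() == MK_String;
  }
  std::string Str;
};

struct MDTupleSketch : MetadataSketch {
  MDTupleSketch() : MetadataSketch(MK_Tuple) {}
  static bool classof(const MetadataSketch *M) {
    return M->getKind() == MK_Tuple;
  }
};

// Checked cast that yields nullptr on mismatch; callers must test the result.
template <typename To> To *dynCastSketch(MetadataSketch *M) {
  return To::classof(M) ? static_cast<To *>(M) : nullptr;
}

// Cast that asserts the type matches, so no extra assert is needed.
template <typename To> To *castSketch(MetadataSketch *M) {
  assert(To::classof(M) && "castSketch<To>() argument of incompatible type!");
  return static_cast<To *>(M);
}

// Usage in the style the review suggests: the type is expected to match.
const std::string &kernelName(MetadataSketch *MDOp) {
  return castSketch<MDStringSketch>(MDOp)->Str;
}
```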

AlexeySachkov · 2022-12-08T10:42:02Z

sycl-fusion/passes/kernel-fusion/SYCLKernelFusion.cpp

+  if (StubFunction.hasMetadata(ParameterMDKind)) {
+    llvm::MDNode *ParamMD = StubFunction.getMetadata(ParameterMDKind);
+    for (const auto &Op : ParamMD->operands()) {
+      auto *Tuple = dyn_cast<MDNode>(Op.get());


Once again, dyn_cast -> cast

AlexeySachkov · 2022-12-08T10:43:11Z

sycl-fusion/passes/kernel-fusion/SYCLKernelFusion.cpp

+  auto *ConstantMD = dyn_cast<ConstantAsMetadata>(MD);
+  assert(ConstantMD && "Metadata not constant");


dyn_cast -> cast

AlexeySachkov · 2022-12-08T10:49:52Z

sycl-fusion/passes/syclcp/SYCLCP.cpp

+  SmallVector<Value *> Indices;
+  Indices.push_back(Builder.getInt32(0));


ArrayRef is constructible from a single element: we should use that instead of performing an allocation on heap for a single pointer.

AlexeySachkov · 2022-12-08T10:51:00Z

sycl-fusion/passes/syclcp/SYCLCP.cpp

+  if (RetCode) {
+    return std::move(RetCode);
+  }


Braces around single-line ifs can be omitted

sommerlukas · 2022-12-09T11:46:03Z

@AlexeySachkov: Thank you for your feedback, I addressed your post-commit comments in #7719.

Fix tests for kernel fusion passes in static builds by using correct library, loadable by `opt`, for tests. Also fixes post-commit feedback from @AlexeySachkov in #7661. Signed-off-by: Lukas Sommer <[email protected]>

sommerlukas requested review from victor-eds and Naghasan as code owners December 6, 2022 16:48

sommerlukas self-assigned this Dec 6, 2022

sommerlukas force-pushed the kernel-fusion/fourth-patch branch from 5e1613a to 5451db0 Compare December 6, 2022 18:09

Naghasan approved these changes Dec 7, 2022

sycl-fusion/passes/kernel-fusion/SYCLKernelFusion.cpp

sommerlukas force-pushed the kernel-fusion/fourth-patch branch from 5451db0 to fae8208 Compare December 7, 2022 09:56

victor-eds approved these changes Dec 7, 2022

[SYCL][Fusion] JIT compiler kernel fusion passes

2021ae2

Co-authored-by: Lukas Sommer <[email protected]> Co-authored-by: Victor Perez <[email protected]> Signed-off-by: Lukas Sommer <[email protected]>

sommerlukas force-pushed the kernel-fusion/fourth-patch branch from fae8208 to 2021ae2 Compare December 7, 2022 17:33

steffenlarsen merged commit e1e6df5 into intel:sycl Dec 8, 2022

AlexeySachkov reviewed Dec 8, 2022

sommerlukas mentioned this pull request Dec 9, 2022

[SYCL][Fusion] Fix kernel fusion passes tests #7719

Merged
