[CUTLASS] Conv2d activation fusion, part 2: Sigmoid fp16, SiLU and HardSwish #9795
Conversation
```python
if mode == "constant":
    if not non_zero_found:
        return data
```
This is a minor optimization but it non-trivially helped performance on the DETR model. @comaniac
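For context, a minimal standalone sketch of the optimization (the `maybe_pad` helper and its arguments are hypothetical, not the actual PyTorch frontend code): when every pad width is zero, the pad is a no-op, so the input tensor is returned untouched instead of emitting a pad operator that only copies the data.

```python
import numpy as np

def maybe_pad(data, pad_widths, value=0.0):
    # Hypothetical helper mirroring the frontend logic: pad_widths is a list
    # of (before, after) pairs, one per axis.
    non_zero_found = any(w != 0 for before_after in pad_widths for w in before_after)
    if not non_zero_found:
        return data  # all-zero pad widths: skip the redundant pad entirely
    return np.pad(data, pad_widths, mode="constant", constant_values=value)

x = np.ones((1, 3, 8, 8), dtype="float32")
assert maybe_pad(x, [(0, 0)] * 4) is x  # no pad op, no copy
assert maybe_pad(x, [(0, 0), (0, 0), (1, 1), (1, 1)]).shape == (1, 3, 10, 10)
```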
Hmm interesting. I didn't notice that we may have pad ops that actually pad nothing.
```
@@ -467,6 +467,9 @@ bool StridedSliceRel(const Array<Type>& types, int num_inputs, const Attrs& attr
  int64_t num_axis = dshape.size();

  const auto* begin = types[1].as<TensorTypeNode>();
  if (begin == nullptr) {
    return false;
  }
```
This and the change below in `src/relay/op/tensor/transform.cc` are the fix for the type inference issue mentioned in the "Known issues" section of #9746: when `types[1]` is not yet a concrete `TensorType`, `as<TensorTypeNode>()` returns `nullptr`, so the relation now returns `false` instead of dereferencing a null pointer.
No test is added because the issue is hard to reproduce in a simple test case and the change is trivial.
Force-pushed from 56e0e95 to 18e0736
LGTM
…rdSwish (apache#9795)
* [Torch] do not pad if pad widths are all zero
* silu fusion supported
* adding hardswish support
* support fast_math sigmoid op
* fixed type inference for yolov5 + silu fusion
* use include_non_call_ops=False in AnnotateTarget
* update cutlass
* revert change in build.py
* simplify codegen
* lint
Now that the dependent PRs in the cutlass repo have been merged, we can enable more fusions. They were used in the benchmark in #9746.
@comaniac @Laurawly
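For reference, the new epilogues can be expressed with existing Relay ops; below is a rough sketch of the conv2d + activation graphs these fusions target (the shapes and variable names are illustrative, not the actual CUTLASS BYOC pattern definitions):

```python
import tvm
from tvm import relay

# Illustrative conv2d + epilogue graphs; SiLU and HardSwish are composed from
# existing element-wise ops, so they can be matched as fused patterns.
data = relay.var("data", shape=(1, 32, 56, 56), dtype="float16")
weight = relay.var("weight", shape=(32, 32, 3, 3), dtype="float16")
conv = relay.nn.conv2d(data, weight, padding=(1, 1))

sigmoid_out = relay.sigmoid(conv)                     # conv2d + sigmoid (fp16)
silu_out = relay.multiply(conv, relay.sigmoid(conv))  # conv2d + SiLU
hardswish_out = relay.divide(                         # conv2d + HardSwish
    relay.multiply(
        conv, relay.clip(relay.add(conv, relay.const(3.0, "float16")), 0.0, 6.0)
    ),
    relay.const(6.0, "float16"),
)

mod = tvm.IRModule.from_expr(relay.Function([data, weight], silu_out))
print(mod)
```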