[QNN][Relay][Topi] Add qnn.dense with weight layout #13854
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot
This commit improves performance of the qnn.mul operation without QNN canonicalization.
Force-pushed from 116a60d to f55a0c9
This is cool, thanks a lot. I went through this a little already earlier, but I'll go through the PR more closely.
Looks good to me apart from the one question about the schedule.
Also, one small thought, and I might be wrong here, but I see that the nn version of this op is called nn.contrib_dense_pack, so should we name this qnn.contrib_dense_pack?
    sch: Schedule
        The computation schedule for the op.
    """
    return default_schedule(outs)
Is there a plan to add a vrmpy tensorized schedule separately, or is that going to be dependent on MetaSchedule?
I have such plans, but with low priority. For now, the most interesting part is the MetaScheduler work.
Yes, I can rename this operator to be aligned with its analog.
# Shift kernel if necessary.
if kernel_dtype == "int8":
    # Compute (QA + 128) and (zp_a + 128)
    kernel, kernel_zero_point = _shift(kernel, kernel_zero_point, "uint8")
Why do we want to convert the kernel to uint8? This results in a non-zero zero point and introduces additional computation that could have been avoided if the kernel stayed in int8.
The main goal of the int8 --> uint8 conversion is to enable the use of the faster vrmpy u8u8i32 intrinsic instead of vrmpy u8i8i32 for qnn.dense/qnn.conv2d.
Yes, you're absolutely right, we have some overhead from such a conversion. That's why I have commented out this code and skip the i8 -> u8 conversion for weights; right now I do not see any performance improvement because of the overhead.
I will enable this code if I manage to get better performance.
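For context, a minimal NumPy sketch (illustrative only, not the actual _shift helper from this PR) of why adding 128 to both the quantized values and the zero point turns int8 data into uint8 without changing the represented real values:

```python
import numpy as np

# Illustrative int8 quantized kernel values and zero point.
kernel_i8 = np.array([-128, -3, 0, 42, 127], dtype=np.int8)
zp_i8 = -5

# Shift both by +128 so the data fits the uint8 range.
kernel_u8 = (kernel_i8.astype(np.int32) + 128).astype(np.uint8)
zp_u8 = zp_i8 + 128

# The dequantized values are unchanged: (q + 128) - (zp + 128) == q - zp.
assert np.array_equal(
    kernel_i8.astype(np.int32) - zp_i8,
    kernel_u8.astype(np.int32) - zp_u8,
)
```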
Force-pushed from f55a0c9 to 7e45dda
@@ -296,7 +297,98 @@ def _get_mod(data_dtype, kernel_dtype):
    assert "cast" in legalized_mod.astext() and "qnn" in legalized_mod.astext()


def test_qnn_legalize_qnn_conv2d_non_scalar_qnn_params():
    """
    Test QNN legalization for qnn.dense op for Hexagon target when kernel zero point and kernel
Typo?: qnn.dense -> qnn.conv2d
Done, fixed
def test_qnn_legalize_qnn_conv2d_non_scalar_qnn_params():
    """
    Test QNN legalization for qnn.dense op for Hexagon target when kernel zero point and kernel
    scale are not scalars.
Could you please elaborate on what you mean by the kernel zero point and scale not being scalars?
So, by "not scalar" I mean constant vector of scalars. For example:
relay.const([val1, val2 ... valN])
This is more interesting case for testing because on legalization we do padding of weights. And dimension of weights, vector of scale/zero point and 'channels' attribute should be aligned.
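To make that concrete, here is a small illustrative sketch (not code from this PR; shapes and values are invented) of qnn.dense with per-channel kernel quantization parameters, where kernel_zero_point and kernel_scale are vector constants whose length matches the number of output channels (the units attribute):

```python
from tvm import relay

units, in_features = 4, 16
data = relay.var("data", shape=(2, in_features), dtype="uint8")
kernel = relay.var("kernel", shape=(units, in_features), dtype="int8")

# Per-channel kernel params: one zero point and one scale per output channel.
# Their length must stay aligned with the kernel shape and the units attribute,
# which is what the weight padding done during legalization has to preserve.
kernel_zp = relay.const([0, 1, 2, 3], "int32")
kernel_scale = relay.const([0.1, 0.2, 0.3, 0.4], "float32")

dense = relay.qnn.op.dense(
    data,
    kernel,
    input_zero_point=relay.const(1, "int32"),
    kernel_zero_point=kernel_zp,
    input_scale=relay.const(0.5, "float32"),
    kernel_scale=kernel_scale,
    units=units,
)
```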
This commit adds a new Relay operation, "qnn.contrib_dense_pack", that supports different weight layouts (nn.dense and qnn.dense do not support this attribute). The new operation is a full analog of the "nn.contrib_dense_pack" operation, but in QNN space.
Force-pushed from 7e45dda to 15e85e5
* [Hexagon][QNN] Improve performance of qnn.mul: this commit improves performance of the qnn.mul operation without QNN canonicalization.
* [QNN][Relay][Topi] Add qnn.dense with weight layout: this commit adds a new Relay operation "qnn.contrib_dense_pack" that supports different weight layouts (nn.dense and qnn.dense do not support this attribute). The new operation is a full analog of the "nn.contrib_dense_pack" operation, but in QNN space.
This commit adds a new Relay operation, qnn.contrib_dense_pack, that supports different weight layouts (nn.dense and qnn.dense do not support this attribute). The new operation is a full analog of the nn.contrib_dense_pack operation, but in QNN space.

With this PR, the current QNN Dense can achieve a ~10x performance gain on the Hexagon target without QNN canonicalization (through the use of the vrmpy intrinsic).

Also, this PR includes a slight performance improvement for qnn.mul (without QNN canonicalization).
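For readers unfamiliar with packed weight layouts, here is a rough NumPy illustration of the idea (the block sizes below are invented for the example and are not necessarily the ones this PR's vrmpy schedule uses): the flat (N, C) weight matrix is reshaped into fixed-size tiles so the inner loops of the dense kernel read contiguous vectors.

```python
import numpy as np

n_out, n_in = 64, 16      # output channels N, input channels C
block_n, block_c = 32, 4  # illustrative tile sizes only

weight = np.random.randint(-128, 128, size=(n_out, n_in), dtype=np.int8)

# Pack (N, C) -> (N // block_n, C // block_c, block_n, block_c):
# the last two axes are the contiguous tiles consumed by the inner loop.
packed = (
    weight.reshape(n_out // block_n, block_n, n_in // block_c, block_c)
    .transpose(0, 2, 1, 3)
    .copy()
)
assert packed.shape == (2, 4, 32, 4)
```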