Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3/6] Arm(R) Ethos(TM)-U NPU TIR compiler with conv2d support #8806

Merged
merged 7 commits into from
Sep 15, 2021

Conversation

mbaret
Copy link
Contributor

@mbaret mbaret commented Aug 20, 2021

This commit adds the lowering passes necessary to lower an NPU Relay module down to a TIR module that can be compiled for the NPU. Conv2d is supported as the first NPU operator. An intermediate TE stage between Relay and TIR allows support for scheduling the operators.

This PR implements P3 on our tracking issue for initial Ethos-U support in TVM: #8482. Our RFC can be found here: apache/tvm-rfcs#11.

@mbaret mbaret changed the title Arm(R) Ethos(TM)-U NPU TIR passes and TE for Conv2D [3/6] Arm(R) Ethos(TM)-U NPU TIR passes and TE for Conv2D Aug 20, 2021
@jroesch
Copy link
Member

jroesch commented Aug 20, 2021

Thanks for the contribution guys!

I know how painful of a request this is, but is there any way you guys could logically split this up into smaller sets of changes? I know I am one of the worst offenders of very large PRs but an 8k PR is very hard/nearly impossible to review well in one shot.

My hope is for us to build wider understanding of these big changes so we can spread the future review and maintenance burden across many people. For example @areusch has the most context but is on vacation for the next couple of weeks.

@mbaret
Copy link
Contributor Author

mbaret commented Aug 20, 2021

I appreciate this doesn't exactly trivialize the patch, but it's actually 'only' 4k lines because this is on top of #8795. I've got this as draft until that PR gets merged to give visibility, but I'm not expecting this to necessarily get reviewed until the dependent PR gets merged.

The particular problem we have splitting this PR up is that we need to introduce an operator alongside some of the compiler support so that we can test it. That's not to say it's impossible though, and if you still think 4k is too much I'm happy to explore some options to reduce it further.

@jroesch
Copy link
Member

jroesch commented Aug 20, 2021

Sounds good let's focus on #8795 first and then we can revisit this one afterwards!

@jroesch jroesch self-assigned this Aug 20, 2021
@mbaret mbaret force-pushed the ethosu_upstreaming_p3 branch from e1a3b39 to b612d71 Compare August 25, 2021 09:13
@manupak manupak force-pushed the ethosu_upstreaming_p3 branch from b612d71 to 03c137c Compare August 25, 2021 11:31
@mbaret mbaret force-pushed the ethosu_upstreaming_p3 branch from 03c137c to 5e848b6 Compare August 26, 2021 09:36
@manupak manupak force-pushed the ethosu_upstreaming_p3 branch from 5e848b6 to f62c96e Compare August 26, 2021 16:34
@mbaret mbaret force-pushed the ethosu_upstreaming_p3 branch from f62c96e to c2895f6 Compare September 9, 2021 09:01
@mbaret mbaret changed the title [3/6] Arm(R) Ethos(TM)-U NPU TIR passes and TE for Conv2D [3/6] Arm(R) Ethos(TM)-U NPU TIR compiler with conv2d support Sep 9, 2021
@mbaret mbaret marked this pull request as ready for review September 9, 2021 09:23
mbaret and others added 7 commits September 13, 2021 09:25
This commit adds the lowering passes necessary to lower
an NPU Relay module down to a TIR module that can be
compiled for the NPU. Conv2d is supported as the first
NPU operator. An intermediate TE stage between Relay and
TIR allows support for scheduling the operators.

Co-authored-by: Manupa Karunaratne <[email protected]>
Change-Id: I3741f9dd8bb5952590ff8c586f6b96e5c3a03795
*fixing tests

Change-Id: Id4a4c80f72ce29b98fc8b3954a1413c1c7fda500
Change-Id: Iaee06017bd125d3040ce42182c4ccdb80d7fc946
Change-Id: I81513f112a42b93cfdd3bcaf8e8852dd60ffe9e9
Change-Id: I6596b62ab56e4ca8b31ef08293686f53f38454d2
Change-Id: I0aaf83fe0204c0db435692e9b92dee6e9d6997fe
@Mousius Mousius force-pushed the ethosu_upstreaming_p3 branch from 532bd2c to 7f0e9bc Compare September 13, 2021 08:31
@manupak
Copy link
Contributor

manupak commented Sep 13, 2021

a friendly ping @mbs-octoml @jroesch , this is ready for review now.

@mbs-octoml
Copy link
Contributor

I've left 4 comments above.

Copy link
Member

@jroesch jroesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great looks good to me, left a bunch of really minor doc nits, I am happy if you guys want to do them in a follow up pass. Feel free to click the merge button.

# specific language governing permissions and limitations
# under the License.
# pylint: disable=invalid-name, unused-argument
"""The integration of Arm(R) Ethos(TM)-U NPU TIR compiler"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all the below nits feel free to do in secondary PR, happy to land this first.

Nit: full sentence.

"""Lower a schedule to TIR for the Arm(R) Ethos(TM)-U NPU target.

The resulting TIR module will contain a single function
that comprises of a sequence of tir.extern_calls to NPU
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
that comprises of a sequence of tir.extern_calls to NPU
that consists of a sequence of tir.extern_calls to NPU

# specific language governing permissions and limitations
# under the License.
# pylint: disable=invalid-name, unused-argument
"""Extract information from the convolution operators in TIR."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain more "information" is vague.

# specific language governing permissions and limitations
# under the License.
# pylint: disable=invalid-name, unused-argument
"""Extract information from the DMA operators in TIR."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: same comment as above.

# specific language governing permissions and limitations
# under the License.
# pylint: disable=invalid-name, unused-argument
"""Different schedulers for Arm(R) Ethos(TM)-U NPU"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: same comment as above? "Different" wrt to what exactly?

mode,
):
"""This is a helper function to capture a list
of arguments to create Vela NpuResamplingMode object"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: punctuation in this file as well.

@@ -138,6 +139,12 @@ def round_up(a: int, b: int) -> int:
return ((a + b - 1) // b) * b


def get_accelerator_config():
"""Get the variant of the accelerator to compile for"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use some more clarification here as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes sense when you know there are multiple sizes of NPU but curious what the returned data type is and where i might go to read more about the things that could be present here.

@@ -0,0 +1,234 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for us to combine this with the current code we factored out as part of the te_compiler.cc refactor? Ok to do in follow up.

# under the License.
import pytest

pytest.importorskip("ethosu.vela")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lunderberg is this the right way to do this?

Copy link
Contributor

@manupak manupak Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only dependency to run the test in this test source. Therefore, any platform that have dependency satisfied should be able to run the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this works, and allows the test suite to run regardless of the platform. The main issue I've seen with pytest.importorskip is that the tests in the file are entirely removed from the test report, rather than showing up as explicitly being skipped.

To have the tests show up in the test report, an individual test can be marked with tvm.testing.requires_package('ethosu.vela'), and the from tvm.relay.backend.contrib.ethosu import util statement moved into the body of the test function. However, this isn't the cleanest solution and I have some ideas on how to improve the ergonomics of skipping tests this way, so pytest.importorskip is a good use in the meantime.

# specific language governing permissions and limitations
# under the License.
# pylint: disable=invalid-name, unused-argument
"""The TIR passes to be run on Arm(R) Ethos(TM)-U NPU TIR Compiler"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: punctuation.

@mbaret
Copy link
Contributor Author

mbaret commented Sep 14, 2021

I've left 4 comments above.

@mbs-octoml Not sure if this is a GitHub bug on my end, but I can't see your comments. Could you check that you've submitted the review?


/*!
* \file relay/backend/contrib/ethosu/to_te_graph.cc
* \brief Lower a Relay function to a TE graph.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I check this file (and it's dependence on lower_call in compile_engine.py) is temporary scaffolding pending a suitable hook mechanism?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this PR has overtaken the relay_to_tir hook mechanism train.

  • Let's make abundantly clear in the comment this is a copy of relay::tec::ScheduleBuilder for the purpose of intercepting the scheduling and this is a temporary workaround. It would be a shame if a new contributor assumed this is the pattern for controlling scheduling.
  • And let's file a GitHub issue or point to the existing one for the rely_to_tir hook mechanism.

auto graph_node = make_object<TEGraphNode>();
for (Var param : prim_func->params) {
Array<tvm::te::Tensor> inputs;
if (const auto* ttype = param->checked_type().as<TensorTypeNode>()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can first relay::FlattenTupleType then just itr over the resut.

}

Array<te::Tensor> VisitExpr_(const CallNode* call_node) final {
static auto flower_call = tvm::runtime::Registry::Get("relay.backend.lower_call");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment that this API is pending removal in #8775.

# specific language governing permissions and limitations
# under the License.
# pylint: disable=invalid-name, unused-argument
"""Different schedulers for Arm(R) Ethos(TM)-U NPU"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to see the forest for the trees but a few paragraphs explaining the scheduling strategy would really help (along with a pointer to the RFC). The main points I think are:

  • make the dma with its layout xforms explicit (which in an ideal world could be done without having to replicate the inner conv2d TE defns?)
  • hoist constants
  • prepare for cascading
    Is that about right?

I'm not sure what we should say about how this plays with the new 'schedulable' TIR. We are prepared to port at some point? TVM overall is on the hook for preserving the 'legacy' scheduling API? Perhaps a GitHub issue is in order.

@mbs-octoml
Copy link
Contributor

D'oh - forgot to hit submit.

import ethosu.vela.api as vapi # type: ignore

import tvm
from tvm.relay.backend.contrib.ethosu import vela_api
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be nicer to call this one vela_api_utils and the true VELA api vela_api

@@ -308,3 +356,17 @@ def _calculate_hw_bias_scales(
hw_bias_scales = [_quantize_scale(bs) for bs in bias_scales]

return hw_bias_scales


def get_target_accel_type():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i think earlier this was called variant--good to align terminology

@manupak
Copy link
Contributor

manupak commented Sep 15, 2021

Hi All,

In order to unblock reviews for the rest of the PRs (4/6 --> 6/6), we are committing to address the comments in P3a as mentioned here : #8482.

Having said that, can we get this in ? @mbaret

@leandron leandron merged commit 354019d into apache:main Sep 15, 2021
@leandron
Copy link
Contributor

Thanks @manupa-arm @mbaret @areusch @jroesch @mbs-octoml, this is merged now.

ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
…#8806)

* Arm(R) Ethos(TM)-U NPU TIR compiler with conv2d support

This commit adds the lowering passes necessary to lower
an NPU Relay module down to a TIR module that can be
compiled for the NPU. Conv2d is supported as the first
NPU operator. An intermediate TE stage between Relay and
TIR allows support for scheduling the operators.

Co-authored-by: Manupa Karunaratne <[email protected]>

* Fix Conv2D TIR type sensitivity

Change-Id: I3741f9dd8bb5952590ff8c586f6b96e5c3a03795

* Arm(R) Ethos(TM)-U NPU TIR passes and TE for Conv2D

*fixing tests

Change-Id: Id4a4c80f72ce29b98fc8b3954a1413c1c7fda500

* Fix import guards for tests

Change-Id: Iaee06017bd125d3040ce42182c4ccdb80d7fc946

* Fix typing failures with ignores

Change-Id: I81513f112a42b93cfdd3bcaf8e8852dd60ffe9e9

* Remove unused import

Change-Id: I6596b62ab56e4ca8b31ef08293686f53f38454d2

* Reintroduce get_target_accel_type

Change-Id: I0aaf83fe0204c0db435692e9b92dee6e9d6997fe

Co-authored-by: Manupa Karunaratne <[email protected]>
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
…#8806)

* Arm(R) Ethos(TM)-U NPU TIR compiler with conv2d support

This commit adds the lowering passes necessary to lower
an NPU Relay module down to a TIR module that can be
compiled for the NPU. Conv2d is supported as the first
NPU operator. An intermediate TE stage between Relay and
TIR allows support for scheduling the operators.

Co-authored-by: Manupa Karunaratne <[email protected]>

* Fix Conv2D TIR type sensitivity

Change-Id: I3741f9dd8bb5952590ff8c586f6b96e5c3a03795

* Arm(R) Ethos(TM)-U NPU TIR passes and TE for Conv2D

*fixing tests

Change-Id: Id4a4c80f72ce29b98fc8b3954a1413c1c7fda500

* Fix import guards for tests

Change-Id: Iaee06017bd125d3040ce42182c4ccdb80d7fc946

* Fix typing failures with ignores

Change-Id: I81513f112a42b93cfdd3bcaf8e8852dd60ffe9e9

* Remove unused import

Change-Id: I6596b62ab56e4ca8b31ef08293686f53f38454d2

* Reintroduce get_target_accel_type

Change-Id: I0aaf83fe0204c0db435692e9b92dee6e9d6997fe

Co-authored-by: Manupa Karunaratne <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants