-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[3/6] Arm(R) Ethos(TM)-U NPU TIR compiler with conv2d support #8806
Conversation
Thanks for the contribution guys! I know how painful of a request this is, but is there any way you guys could logically split this up into smaller sets of changes? I know I am one of the worst offenders of very large PRs but an 8k PR is very hard/nearly impossible to review well in one shot. My hope is for us to build wider understanding of these big changes so we can spread the future review and maintenance burden across many people. For example @areusch has the most context but is on vacation for the next couple of weeks. |
I appreciate this doesn't exactly trivialize the patch, but it's actually 'only' 4k lines because this is on top of #8795. I've got this as draft until that PR gets merged to give visibility, but I'm not expecting this to necessarily get reviewed until the dependent PR gets merged. The particular problem we have splitting this PR up is that we need to introduce an operator alongside some of the compiler support so that we can test it. That's not to say it's impossible though, and if you still think 4k is too much I'm happy to explore some options to reduce it further. |
Sounds good let's focus on #8795 first and then we can revisit this one afterwards! |
e1a3b39
to
b612d71
Compare
b612d71
to
03c137c
Compare
03c137c
to
5e848b6
Compare
5e848b6
to
f62c96e
Compare
f62c96e
to
c2895f6
Compare
c2895f6
to
9923074
Compare
This commit adds the lowering passes necessary to lower an NPU Relay module down to a TIR module that can be compiled for the NPU. Conv2d is supported as the first NPU operator. An intermediate TE stage between Relay and TIR allows support for scheduling the operators. Co-authored-by: Manupa Karunaratne <[email protected]>
Change-Id: I3741f9dd8bb5952590ff8c586f6b96e5c3a03795
*fixing tests Change-Id: Id4a4c80f72ce29b98fc8b3954a1413c1c7fda500
Change-Id: Iaee06017bd125d3040ce42182c4ccdb80d7fc946
Change-Id: I81513f112a42b93cfdd3bcaf8e8852dd60ffe9e9
Change-Id: I6596b62ab56e4ca8b31ef08293686f53f38454d2
Change-Id: I0aaf83fe0204c0db435692e9b92dee6e9d6997fe
532bd2c
to
7f0e9bc
Compare
a friendly ping @mbs-octoml @jroesch , this is ready for review now. |
I've left 4 comments above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great looks good to me, left a bunch of really minor doc nits, I am happy if you guys want to do them in a follow up pass. Feel free to click the merge button.
# specific language governing permissions and limitations | ||
# under the License. | ||
# pylint: disable=invalid-name, unused-argument | ||
"""The integration of Arm(R) Ethos(TM)-U NPU TIR compiler""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all the below nits feel free to do in secondary PR, happy to land this first.
Nit: full sentence.
"""Lower a schedule to TIR for the Arm(R) Ethos(TM)-U NPU target. | ||
|
||
The resulting TIR module will contain a single function | ||
that comprises of a sequence of tir.extern_calls to NPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that comprises of a sequence of tir.extern_calls to NPU | |
that consists of a sequence of tir.extern_calls to NPU |
# specific language governing permissions and limitations | ||
# under the License. | ||
# pylint: disable=invalid-name, unused-argument | ||
"""Extract information from the convolution operators in TIR.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain more "information" is vague.
# specific language governing permissions and limitations | ||
# under the License. | ||
# pylint: disable=invalid-name, unused-argument | ||
"""Extract information from the DMA operators in TIR.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: same comment as above.
# specific language governing permissions and limitations | ||
# under the License. | ||
# pylint: disable=invalid-name, unused-argument | ||
"""Different schedulers for Arm(R) Ethos(TM)-U NPU""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: same comment as above? "Different" wrt to what exactly?
mode, | ||
): | ||
"""This is a helper function to capture a list | ||
of arguments to create Vela NpuResamplingMode object""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: punctuation in this file as well.
@@ -138,6 +139,12 @@ def round_up(a: int, b: int) -> int: | |||
return ((a + b - 1) // b) * b | |||
|
|||
|
|||
def get_accelerator_config(): | |||
"""Get the variant of the accelerator to compile for""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use some more clarification here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes sense when you know there are multiple sizes of NPU but curious what the returned data type is and where i might go to read more about the things that could be present here.
@@ -0,0 +1,234 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for us to combine this with the current code we factored out as part of the te_compiler.cc
refactor? Ok to do in follow up.
# under the License. | ||
import pytest | ||
|
||
pytest.importorskip("ethosu.vela") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lunderberg is this the right way to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only dependency to run the test in this test source. Therefore, any platform that have dependency satisfied should be able to run the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, this works, and allows the test suite to run regardless of the platform. The main issue I've seen with pytest.importorskip
is that the tests in the file are entirely removed from the test report, rather than showing up as explicitly being skipped.
To have the tests show up in the test report, an individual test can be marked with tvm.testing.requires_package('ethosu.vela')
, and the from tvm.relay.backend.contrib.ethosu import util
statement moved into the body of the test function. However, this isn't the cleanest solution and I have some ideas on how to improve the ergonomics of skipping tests this way, so pytest.importorskip
is a good use in the meantime.
# specific language governing permissions and limitations | ||
# under the License. | ||
# pylint: disable=invalid-name, unused-argument | ||
"""The TIR passes to be run on Arm(R) Ethos(TM)-U NPU TIR Compiler""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: punctuation.
@mbs-octoml Not sure if this is a GitHub bug on my end, but I can't see your comments. Could you check that you've submitted the review? |
|
||
/*! | ||
* \file relay/backend/contrib/ethosu/to_te_graph.cc | ||
* \brief Lower a Relay function to a TE graph. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I check this file (and it's dependence on lower_call in compile_engine.py) is temporary scaffolding pending a suitable hook mechanism?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this PR has overtaken the relay_to_tir hook mechanism train.
- Let's make abundantly clear in the comment this is a copy of relay::tec::ScheduleBuilder for the purpose of intercepting the scheduling and this is a temporary workaround. It would be a shame if a new contributor assumed this is the pattern for controlling scheduling.
- And let's file a GitHub issue or point to the existing one for the rely_to_tir hook mechanism.
auto graph_node = make_object<TEGraphNode>(); | ||
for (Var param : prim_func->params) { | ||
Array<tvm::te::Tensor> inputs; | ||
if (const auto* ttype = param->checked_type().as<TensorTypeNode>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can first relay::FlattenTupleType then just itr over the resut.
} | ||
|
||
Array<te::Tensor> VisitExpr_(const CallNode* call_node) final { | ||
static auto flower_call = tvm::runtime::Registry::Get("relay.backend.lower_call"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment that this API is pending removal in #8775.
# specific language governing permissions and limitations | ||
# under the License. | ||
# pylint: disable=invalid-name, unused-argument | ||
"""Different schedulers for Arm(R) Ethos(TM)-U NPU""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm starting to see the forest for the trees but a few paragraphs explaining the scheduling strategy would really help (along with a pointer to the RFC). The main points I think are:
- make the dma with its layout xforms explicit (which in an ideal world could be done without having to replicate the inner conv2d TE defns?)
- hoist constants
- prepare for cascading
Is that about right?
I'm not sure what we should say about how this plays with the new 'schedulable' TIR. We are prepared to port at some point? TVM overall is on the hook for preserving the 'legacy' scheduling API? Perhaps a GitHub issue is in order.
D'oh - forgot to hit submit. |
import ethosu.vela.api as vapi # type: ignore | ||
|
||
import tvm | ||
from tvm.relay.backend.contrib.ethosu import vela_api |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be nicer to call this one vela_api_utils and the true VELA api vela_api
@@ -308,3 +356,17 @@ def _calculate_hw_bias_scales( | |||
hw_bias_scales = [_quantize_scale(bs) for bs in bias_scales] | |||
|
|||
return hw_bias_scales | |||
|
|||
|
|||
def get_target_accel_type(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i think earlier this was called variant--good to align terminology
Thanks @manupa-arm @mbaret @areusch @jroesch @mbs-octoml, this is merged now. |
…#8806) * Arm(R) Ethos(TM)-U NPU TIR compiler with conv2d support This commit adds the lowering passes necessary to lower an NPU Relay module down to a TIR module that can be compiled for the NPU. Conv2d is supported as the first NPU operator. An intermediate TE stage between Relay and TIR allows support for scheduling the operators. Co-authored-by: Manupa Karunaratne <[email protected]> * Fix Conv2D TIR type sensitivity Change-Id: I3741f9dd8bb5952590ff8c586f6b96e5c3a03795 * Arm(R) Ethos(TM)-U NPU TIR passes and TE for Conv2D *fixing tests Change-Id: Id4a4c80f72ce29b98fc8b3954a1413c1c7fda500 * Fix import guards for tests Change-Id: Iaee06017bd125d3040ce42182c4ccdb80d7fc946 * Fix typing failures with ignores Change-Id: I81513f112a42b93cfdd3bcaf8e8852dd60ffe9e9 * Remove unused import Change-Id: I6596b62ab56e4ca8b31ef08293686f53f38454d2 * Reintroduce get_target_accel_type Change-Id: I0aaf83fe0204c0db435692e9b92dee6e9d6997fe Co-authored-by: Manupa Karunaratne <[email protected]>
…#8806) * Arm(R) Ethos(TM)-U NPU TIR compiler with conv2d support This commit adds the lowering passes necessary to lower an NPU Relay module down to a TIR module that can be compiled for the NPU. Conv2d is supported as the first NPU operator. An intermediate TE stage between Relay and TIR allows support for scheduling the operators. Co-authored-by: Manupa Karunaratne <[email protected]> * Fix Conv2D TIR type sensitivity Change-Id: I3741f9dd8bb5952590ff8c586f6b96e5c3a03795 * Arm(R) Ethos(TM)-U NPU TIR passes and TE for Conv2D *fixing tests Change-Id: Id4a4c80f72ce29b98fc8b3954a1413c1c7fda500 * Fix import guards for tests Change-Id: Iaee06017bd125d3040ce42182c4ccdb80d7fc946 * Fix typing failures with ignores Change-Id: I81513f112a42b93cfdd3bcaf8e8852dd60ffe9e9 * Remove unused import Change-Id: I6596b62ab56e4ca8b31ef08293686f53f38454d2 * Reintroduce get_target_accel_type Change-Id: I0aaf83fe0204c0db435692e9b92dee6e9d6997fe Co-authored-by: Manupa Karunaratne <[email protected]>
This commit adds the lowering passes necessary to lower an NPU Relay module down to a TIR module that can be compiled for the NPU. Conv2d is supported as the first NPU operator. An intermediate TE stage between Relay and TIR allows support for scheduling the operators.
This PR implements P3 on our tracking issue for initial Ethos-U support in TVM: #8482. Our RFC can be found here: apache/tvm-rfcs#11.