[MetaSchedule][Hexagon] Add MultiLevelTilingHexagon to schedule async pipelines that utilize DMA #13721
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot
@tvm-bot rerun
Force-pushed from c408d94 to 29b17f6
* names of tensor intrinsics, must be registered via
* TensorIntrin.register(...) beforehand
* \param structure The tiling structure. Recommended:
* - 'SRSRS' on Hexagon
You might map SRSRS to the layout NCHWc in the comment.
This should be applicable outside of schedules that have NCHWc input, but are you suggesting adding that to help map this to something more concrete?
It just took me a minute to make the connection between the Hexagon layout (NCHWc) and the tiling structure (SRSRS), which led me to suggest a comment to clarify. The key thing for my understanding when I originally reviewed the code was to connect R = reduction to the C = channel axis.
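To make that mapping concrete, here is a minimal sketch (plain Python, not TVM code) of how a structure string is read: each letter adds one tile level, with 'S' tiling the spatial loops and 'R' tiling the reduction loops. For an NCHWc conv2d, the reduction loops are the input-channel and kernel axes, which is the R = reduction to C = channel connection.

```python
# Minimal sketch of interpreting a tiling-structure string.
# "S" = a tile level over spatial loops (n, oc_chunk, oh, ow, oc_block for NCHWc),
# "R" = a tile level over reduction loops (ic, kh, kw).
structure = "SRSRS"
s_levels = [i for i, c in enumerate(structure) if c == "S"]
r_levels = [i for i, c in enumerate(structure) if c == "R"]
print(s_levels)  # [0, 2, 4] -> three spatial tile levels
print(r_levels)  # [1, 3]    -> two reduction tile levels
```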
TensorIntrin.register(...) beforehand
structure : str
    The tiling structure. Recommended:
    - 'SRSRS' on Hexagon
You might map SRSRS to the layout NCHWc in the comment.
see above
* \param tile_binds For each level of tiles, which thread axis it is bound to. These are not
* supported on hexagon.
* \param max_innermost_factor The maximum size of the innermost factor. NullOpt means no limit
* \param vector_load_lens The length of vector lane in vectorized cooperative fetching.
Confused as to why we have both the vector load length and the max innermost factor. These seem redundant. Aren't we always going to vectorize over the innermost loop? And, if vectorization is enabled, won't we use vector_load_lens for the size of the innermost loop? Also, a "max" for max_innermost_factor seems strange: e.g. if the user says the max is 8, I imagine that trying 7 is a bad choice, whereas a list of [2, 4, 8] is likely all good choices. Anyway... I see that this API is inherited from MultiLevelTilingInitCommon, so I won't harp on it. Just an observation.
These have two separate uses: vector_load_lens is used for vector loads outside of the tiling loops, whereas max_innermost_factor is for loop tiling. As for the naming and usage, I am not 100% sure.
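To illustrate the distinction, here is a hedged sketch using the existing MultiLevelTilingWithIntrin Python binding (the intrin name and parameter values are made up): max_innermost_factor caps the extent sampled for the innermost tile of the tiling structure, while vector_load_lens lists candidate vector lanes for the vectorized cooperative-fetch (cache-read) loops outside it.

```python
from tvm import meta_schedule as ms

rule = ms.schedule_rule.MultiLevelTilingWithIntrin(
    intrin="dot_32x4_u8u8i32",      # hypothetical name; must be registered via TensorIntrin.register
    structure="SRSRS",
    tile_binds=None,                # thread binds are not supported on Hexagon
    max_innermost_factor=64,        # tiling knob: innermost tile extent <= 64
    vector_load_lens=[1, 2, 4, 8],  # cache-read knob: candidate vector lanes
    reuse_read=None,
    reuse_write=None,
)
```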
Like I said, just an observation given you are inheriting this API.
@@ -68,12 +69,12 @@ def sync_dma_load_impl(a: T.handle, c: T.handle) -> None:
     return sync_dma_load_desc, sync_dma_load_impl


-def generate_dot_product_32x4_u8u8i32(mem_scope="global"):
+def generate_dot_product_32x4_u8u8i32(read_mem_scope="global", write_mem_scope="global"):
Why are the defaults here "global" instead of "global.vtcm"?
Because typically these are used without VTCM.
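A hedged usage sketch of the new signature (the (desc, impl) return convention mirrors the sync_dma_load generator above; the VTCM scope string for the async-DMA case is an assumption):

```python
# Typical use: plain global memory, hence the "global" defaults.
desc, impl = generate_dot_product_32x4_u8u8i32()

# For the async-DMA pipelines in this PR, reads come from VTCM-staged buffers.
desc_vtcm, impl_vtcm = generate_dot_product_32x4_u8u8i32(
    read_mem_scope="global.vtcm",  # operands DMA-copied into VTCM
    write_mem_scope="global",      # result still written to global memory
)
```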
if (!use_software_pipeline) {
  return {state};
}
// The current config is not suitable for software pipelining.
Update the comment to indicate what r_indices_ represents (reduction axes), since it's not a member of the MultiLevelTilingHexagonNode class. And also why we need at least 2 of them, as I can't quite figure out why that's the case.
Added a comment on this!
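For context on why a reduction loop to pipeline over must exist at all, here is an illustrative plain-Python sketch of the general two-stage software-pipelining pattern (not the PR's code): stage 0 prefetches the operand tile for iteration k+1 while stage 1 computes on iteration k.

```python
# Stage 0 (the async DMA copies in the PR) runs one iteration ahead of
# stage 1 (the tensorized compute), overlapping load and compute.
def pipelined_reduction(tiles, load, compute):
    acc = 0
    buf = load(tiles[0])  # prologue: fill the first buffer
    for k in range(len(tiles)):
        nxt = load(tiles[k + 1]) if k + 1 < len(tiles) else None  # stage 0
        acc = compute(acc, buf)                                   # stage 1
        buf = nxt
    return acc

print(pipelined_reduction([1, 2, 3], load=lambda t: t * 10, compute=lambda a, b: a + b))  # 60
```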
      reduction_length *= extent->value;
    }
  }
  if (reduction_length <= 1) {
Curious use of <= 1 here as opposed to == 1. Do we support zero or negative extents?
It should not be possible, but since extents with value 0 can happen, the length could end up being 0.
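A tiny sketch of the guard as discussed, assuming the surrounding loop accumulates the product of the reduction-loop extents:

```python
from functools import reduce
from operator import mul

def worth_pipelining(extents):
    reduction_length = reduce(mul, extents, 1)
    # A zero extent makes the product 0, which <= 1 catches but == 1 would not.
    return reduction_length > 1

print(worth_pipelining([4, 2]))  # True  -> non-trivial reduction
print(worth_pipelining([1]))     # False -> trivial reduction
print(worth_pipelining([0, 3]))  # False -> degenerate zero-extent loop
```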
Array<Integer> software_pipeline_stage;
Array<Integer> software_pipeline_order;
Array<Integer> software_pipeline_async_stages;
if (cache_read_count == 2) {
This looks correct, but this notation is difficult to read for many folks. Some comments might help.
Tried to explain as best I could!
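For readers newer to this notation: each entry of software_pipeline_stage/order annotates one statement of the loop body with its pipeline stage and its position in the rotated body, and software_pipeline_async_stages marks which stages run asynchronously. Below is a self-contained, hedged sketch on a toy kernel (the kernel, the two cache reads, and the stage/order values are illustrative, not the PR's exact config):

```python
import tvm
from tvm import tir
from tvm.script import tir as T


@T.prim_func
def toy(a: T.handle, b: T.handle, c: T.handle) -> None:
    A = T.match_buffer(a, (128, 128), "int8")
    B = T.match_buffer(b, (128, 128), "int8")
    C = T.match_buffer(c, (128, 128), "int32")
    for i, j in T.grid(128, 128):
        with T.block("compute"):
            vi, vj = T.axis.remap("SS", [i, j])
            C[vi, vj] = T.cast(A[vi, vj], "int32") + T.cast(B[vi, vj], "int32")


sch = tir.Schedule(toy)
blk = sch.get_block("compute")
i, _ = sch.get_loops(blk)
# Two cache reads, analogous to the cache_read_count == 2 branch above.
a_cache = sch.cache_read(blk, 0, "global.vtcm")
b_cache = sch.cache_read(blk, 1, "global.vtcm")
sch.compute_at(a_cache, i)
sch.compute_at(b_cache, i)
# The body of loop i now holds three statements: [copy A, copy B, compute].
sch.annotate(i, "software_pipeline_stage", [0, 0, 1])   # both copies in stage 0, compute in stage 1
sch.annotate(i, "software_pipeline_order", [0, 1, 2])   # keep the original statement order
sch.annotate(i, "software_pipeline_async_stages", [0])  # stage 0 (the copies) runs async
```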
sch->Annotate(fused, tir::attr::software_pipeline_async_stages, software_pipeline_async_stages);

// TODO(nverke): Add support for nested async pipelines.
// TODO(nverke): Add support for async cache writes.
Is the lack of cache write support here due to the issue in the InjectSWPipeline pass where there is no "wait" on the cache write stage?
No, it's just from limiting the complexity of this to start.
executor = relay.backend.Executor("graph", {"link-params": True})
mod = mod.with_attr("executor", executor)

use_async = True
This seems strange, especially with no else case below.
removed this!
std::vector<State> MultiLevelTilingHexagonNode::ApplySubRules(std::vector<State> states) {
  states = MultiLevelTilingWithIntrinNode::ApplySubRules(states);
  states = SubRule(std::move(states), [&](State state) { return AddSoftwarePipeline(state); });
Seems that MultiLevelTilingHexagon could be its own schedule rule, as it adds the ability to AddSoftwarePipeline but otherwise defers to MultiLevelTilingWithIntrin. I understand the main reason for the inheritance here is to control the order of application of the schedule rules: tiling must precede software pipelining. Wondering if there is some other solution besides inheritance to solve this problem.
I am not sure I follow. If we don't inherit, we would have to copy over all of the needed logic from MLT with intrin. But we don't want to add software pipelines to all usages of MLT with intrin, as this one will try to optimize for Hexagon.
My comment may be naïve. My read of the code was that the only connection between the MultiLevelTilingWithIntrin and AddSoftwarePipeline schedule rules was the order in which they were applied, that we were using inheritance to ensure that order was maintained, and that there was no further value to the inheritance beyond maintaining that order. If any of that is wrong, let me know and we can scrap this comment. If I am correct, perhaps there is a way to create two separate passes... one for software pipelining and one for multi-level tiling.
This will create schedules that utilize async DMA pipelines on Hexagon. Currently this has some limitations due to the need for DMA lowering to be contiguous memory copies. This PR relies on #13720 and will not build until that is merged into main.
The best perf I found was around 5 GFLOPS for the conv2d in the test.
These changes were made in collaboration with @masahi
@adstraw @csullivan @Lunderberg