[Relay][Strategy] Use x86 dense schedules for arm_cpu #15470

Merged on Aug 7, 2023 (1 commit)

@lhutton1 lhutton1 commented Aug 3, 2023

Currently, the fallback used when compiling a dense operation with targets such as llvm -device=arm_cpu is dense.generic, which results in very poor performance. Although #13775 meant that x86 schedules are used in cases where no strategy is provided by arm_cpu, the dense strategy is still registered for arm_cpu because specialized schedules exist for it, e.g. a schedule for embedded devices. This commit ensures x86 schedules are used in place of the generic schedule, which yields much better performance.

The commit also handles the dense.generic schedule the same way as the x86 strategy: it is only used when the auto-scheduler is enabled.

A test has been added to check the intended schedules are picked when compiling with arm_cpu.
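
To illustrate the selection behaviour described above, here is a minimal, self-contained sketch. It is a toy model and not TVM code: `ToyStrategy`, `toy_dense_strategy_arm_cpu`, the priority values, and the names `dense_embedded.arm_cpu`, `dense_nopack.x86` and `dense_pack.x86` are assumptions made for illustration. Only `dense.generic` and the overall fallback order come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStrategy:
    """Toy stand-in for an op strategy: a list of (name, priority) pairs."""

    impls: list = field(default_factory=list)

    def add(self, name: str, plevel: int = 10) -> None:
        self.impls.append((name, plevel))

    def select(self) -> str:
        # The implementation with the highest priority level wins.
        return max(self.impls, key=lambda impl: impl[1])[0]


def toy_dense_strategy_arm_cpu(is_embedded: bool, use_auto_scheduler: bool) -> ToyStrategy:
    strategy = ToyStrategy()
    if is_embedded:
        # Specialized schedule for embedded devices (hypothetical name).
        strategy.add("dense_embedded.arm_cpu")
        return strategy
    # Otherwise reuse x86 schedules instead of falling back to the generic one
    # (names and priorities are assumed for this sketch).
    strategy.add("dense_nopack.x86", plevel=5)
    strategy.add("dense_pack.x86", plevel=10)
    if use_auto_scheduler:
        # The generic schedule is only registered when the auto-scheduler is on.
        strategy.add("dense.generic", plevel=11)
    return strategy
```

In this sketch, `toy_dense_strategy_arm_cpu(False, False).select()` picks an x86 schedule, and `dense.generic` is only chosen when `use_auto_scheduler` is true.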

cc @ekalda @neildhickey

@tvm-bot
Collaborator

tvm-bot commented Aug 3, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@github-actions github-actions bot requested a review from ekalda August 3, 2023 16:51
Contributor

@ekalda ekalda left a comment

Thanks @lhutton1, LGTM!

Currently the fallback used when compiling a dense operation with
targets such as `llvm -device=arm_cpu` is `dense.generic`. This results
in very poor performance. Although apache#13775
meant that x86 schedules are used in cases where no strategy is provided
by arm_cpu, the dense strategy is registered due to the existence of
specialized schedules for arm_cpu e.g. a schedule for embedded devices.
This commit ensures x86 schedules are used in place of a generic
schedule which yields much better performance.

The commit also follows the same approach for the `dense.generic`
schedule as the x86 strategy. This will only be used when autoscheduler
is enabled.

A test has been added to check the intended schedules are picked when
compiling with `arm_cpu`.

Change-Id: I8697f630d4acfab71a9626cf9e0dc3086987f163
Contributor

@leandron leandron left a comment

LGTM, thanks! Merging this now, thanks @ekalda @lhutton1!

@leandron leandron merged commit ae45b04 into apache:main Aug 7, 2023
@lhutton1 lhutton1 deleted the use-x86-dense branch August 7, 2023 14:42
lhutton1 added a commit to lhutton1/tvm that referenced this pull request Aug 8, 2023
Similar to apache#15470, x86 schedules are
used in place of generic schedules to improve performance.

Since the pooling strategy does not use `OpStrategy`, mocking is used
to ensure the relevant `schedule_pool` function is called when lowering a
Relay pooling operation with respect to a given target.

Change-Id: I782fe00e29f9c9cf41b3405d33a82a79cd85a99b
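
The mocking approach mentioned in this commit message can be sketched with Python's standard `unittest.mock`. This is only an illustration of the testing technique, not the test from the commit: the `schedules` namespace, `lower_pool`, `schedule_was_called`, and the device-to-schedule mapping are hypothetical stand-ins.

```python
import types
from unittest import mock

# Hypothetical namespace holding the candidate pooling schedules.
schedules = types.SimpleNamespace(
    schedule_pool_generic=lambda outs: ("generic", outs),
    schedule_pool_x86=lambda outs: ("x86", outs),
)


def lower_pool(device: str, outs):
    """Toy dispatcher: arm_cpu reuses the x86 pooling schedule."""
    if device in ("x86", "arm_cpu"):
        return schedules.schedule_pool_x86(outs)
    return schedules.schedule_pool_generic(outs)


def schedule_was_called(device: str, schedule_name: str) -> bool:
    # Replace the schedule with a mock for the duration of the call,
    # then report whether lowering actually invoked it.
    with mock.patch.object(schedules, schedule_name) as mocked:
        lower_pool(device, outs=["pool_out"])
    return mocked.called
```

With this setup, `schedule_was_called("arm_cpu", "schedule_pool_x86")` reports whether the x86 schedule was reached, without needing a strategy object that exposes the selected implementation's name.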
lhutton1 added a commit that referenced this pull request Aug 9, 2023
Similar to #15470, x86 schedules are
used in place of generic schedules to improve performance.

Since the pooling strategy does not use `OpStrategy`, mocking is used
to ensure the relevant `schedule_pool` function is called when lowering a
Relay pooling operation with respect to a given target.