[Codegen][Metal] Support metal warp-level primitive #15401

Merged

MasterJH5574
Contributor

This PR introduces the warp-level shuffle primitives used in Metal Shading Language, and uses them in the implementation of allreduce lowering.

The introduced primitives are:

  • simd_shuffle,
  • simd_shuffle_up,
  • simd_shuffle_down.

See section 6.9.2 of https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf for details.
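For readers without the spec at hand, here is a rough host-side simulation of the three primitives, written in Python rather than Metal Shading Language. It models one SIMD-group as a list with one entry per lane. The behavior for out-of-range lanes (the lane keeps its own value, with no wrap-around) follows my reading of the spec; the function bodies are illustrative only and are not code from this PR.

```python
from typing import List, Sequence


def simd_shuffle(data: Sequence[int], src_lane: Sequence[int]) -> List[int]:
    """Lane i reads the value held by lane src_lane[i]."""
    return [data[src_lane[i]] for i in range(len(data))]


def simd_shuffle_up(data: Sequence[int], delta: int) -> List[int]:
    """Lane i reads lane i - delta; the lowest `delta` lanes keep their own value."""
    return [data[i - delta] if i >= delta else data[i] for i in range(len(data))]


def simd_shuffle_down(data: Sequence[int], delta: int) -> List[int]:
    """Lane i reads lane i + delta; the highest `delta` lanes keep their own value."""
    n = len(data)
    return [data[i + delta] if i + delta < n else data[i] for i in range(n)]


# An 8-lane group for readability (the group size here is an assumption
# made for the example, not a property taken from the PR).
lanes = [10, 11, 12, 13, 14, 15, 16, 17]
down = simd_shuffle_down(lanes, 2)
up = simd_shuffle_up(lanes, 2)
bcast = simd_shuffle(lanes, [0] * len(lanes))
```

Note that none of the three functions takes a mask argument, which is the difference from the CUDA `__shfl_*_sync` family that the lowering pass has to account for.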

Correctness is validated by running test_allreduce_cuda with the backend changed to Metal. Since we do not have Metal CI tests, this was checked only locally.

Because the Metal shuffle primitives neither support nor need masking, the LowerThreadAllreduce pass is updated to handle backends that have no masks. A unit test for Metal is added to ensure that no mask is used.
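To illustrate the mask-free reduction pattern, here is a minimal Python sketch of a warp-level allreduce built on a shuffle-down step followed by a broadcast from lane 0. It is a simulation of the general technique, not the code that LowerThreadAllreduce emits; the helper names `shuffle_down` and `warp_allreduce_sum` are hypothetical, and the power-of-two group width is an assumption of the sketch.

```python
from typing import List, Sequence


def shuffle_down(values: Sequence[int], delta: int) -> List[int]:
    """Simulated shuffle-down: lane i reads lane i + delta, no mask argument.

    Lanes whose source would fall outside the group keep their own value.
    """
    n = len(values)
    return [values[i + delta] if i + delta < n else values[i] for i in range(n)]


def warp_allreduce_sum(values: Sequence[int]) -> List[int]:
    """Tree-reduce all lanes into lane 0, then broadcast the result to every lane."""
    width = len(values)
    # Assumption of this sketch: the group width is a power of two.
    assert width > 0 and width & (width - 1) == 0
    acc = list(values)
    offset = width // 2
    while offset > 0:
        # Every lane adds the value held `offset` lanes above it.
        shifted = shuffle_down(acc, offset)
        acc = [a + b for a, b in zip(acc, shifted)]
        offset //= 2
    # Only lane 0 is guaranteed to hold the full sum; broadcasting it
    # plays the role of a shuffle that reads from lane 0 in every lane.
    return [acc[0]] * width


result = warp_allreduce_sum(list(range(32)))
```

On CUDA the equivalent loop would pass an active-lane mask to each `__shfl_down_sync` call; in this sketch there is no such parameter anywhere, which mirrors what the added Metal unit test checks for.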

@tvm-bot
Collaborator

tvm-bot commented Jul 25, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@MasterJH5574
Contributor Author

MasterJH5574 commented Jul 25, 2023

This PR depends on #15399, which should land first to fix the allreduce correctness issues.

@MasterJH5574 MasterJH5574 force-pushed the tvm-dev/2023-07-25-metal-warp-level branch from 7b1a687 to fb416ef Compare July 25, 2023 16:50
@echuraev echuraev merged commit 22ec541 into apache:main Jul 26, 2023