Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x64: Improve codegen for splats #6025

Merged
merged 1 commit into from
Mar 15, 2023

Conversation

alexcrichton
Copy link
Member

This commit goes through the lowerings for the CLIF splat instruction and improves the support for each operator. Many of these lowerings are mirrored from v8/SpiderMonkey and there are a number of improvements:

  • AVX2 v{p,}broadcast* instructions are added and used when available.
  • Float-based splats are much simpler and always a single-instruction
  • Integer-based splats don't insert into an uninit xmm value and instead start out with a movd to move into an xmm register. This thoeretically breaks dependencies with prior instructions since movd creates a fresh new value in the destination register.
  • Loads are now sunk into all of the instructions. A new extractor, sinkable_load_exact, was added to sink the i8/i16 loads.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen isle Related to the ISLE domain-specific language labels Mar 15, 2023
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:x64", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@fitzgen fitzgen added this pull request to the merge queue Mar 15, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Mar 15, 2023
This commit goes through the lowerings for the CLIF `splat` instruction
and improves the support for each operator. Many of these lowerings are
mirrored from v8/SpiderMonkey and there are a number of improvements:

* AVX2 `v{p,}broadcast*` instructions are added and used when available.
* Float-based splats are much simpler and always a single-instruction
* Integer-based splats don't insert into an uninit xmm value and instead
  start out with a `movd` to move into an `xmm` register. This
  thoeretically breaks dependencies with prior instructions since `movd`
  creates a fresh new value in the destination register.
* Loads are now sunk into all of the instructions. A new extractor,
  `sinkable_load_exact`, was added to sink the i8/i16 loads.
@alexcrichton alexcrichton enabled auto-merge March 15, 2023 21:01
@alexcrichton alexcrichton added this pull request to the merge queue Mar 15, 2023
Merged via the queue into bytecodealliance:main with commit d76f7ee Mar 15, 2023
@alexcrichton alexcrichton deleted the splat branch March 15, 2023 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants