On top of #3451.
There are two ways to vectorize `Ifloatarithmem`. For example, for the `add` operation, we can either load the vector into a register and then apply a register-to-register SIMD add, or emit a single SIMD add with a memory operand.

This PR supports both options, although the more efficient single-instruction form cannot currently be emitted because we don't have the alignment guarantees. This can be enabled in the future if we propagate sufficient information about bigarrays, for example, but may need runtime changes to ensure alignment.
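As a rough illustration of the trade-off, here is a minimal sketch of how the choice between the two forms could depend on alignment. It assumes the two options are a separate vector load followed by a register-only operation versus a single instruction with a memory operand. All names (`lowering`, `choose_lowering`, `known_alignment`, `vector_bytes`) are hypothetical and are not part of this PR or the compiler.

```ocaml
(* Hypothetical model of the two lowering strategies; illustrative names only,
   not the compiler's actual types. *)
type lowering =
  | Load_then_op  (* separate vector load, then a register-to-register op *)
  | Op_with_mem   (* single instruction that takes a memory operand *)

(* [known_alignment] is the alignment in bytes that can be proven for the
   address, if any. [vector_bytes] is the vector width in bytes.
   Assumption: the memory-operand form is only safe when the address is
   known to be aligned to the vector width. *)
let choose_lowering ~known_alignment ~vector_bytes =
  match known_alignment with
  | Some a when a >= vector_bytes && a mod vector_bytes = 0 -> Op_with_mem
  | Some _ | None -> Load_then_op
```

With no alignment information, as is the case today, this sketch always falls back to `Load_then_op`.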
To emit SIMD instructions with memory arguments, this PR adds a bunch of machinery to SIMD selection and handling to keep track of the addressing mode. In particular, it adds `ISimd_mem` to specific operations. The rest is trivial propagation of the new variant to all the right places. This is in a separate commit, which I am happy to split out into a separate PR if the design choices don't seem obvious. The new instructions are lightly tested manually (except for the binary emitter), but are never emitted even when the vectorizer is enabled.
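For readers unfamiliar with the shape of such a change, here is a simplified sketch of a specific-operation variant that carries an addressing mode. Only the constructor name `ISimd_mem` comes from this description. The other types (`addressing_mode`, `simd_op`) and the helper functions are hypothetical stand-ins, not the actual definitions in the compiler.

```ocaml
(* Simplified, hypothetical stand-ins for the backend types. *)
type addressing_mode =
  | Ibased of string * int  (* symbol + displacement: no register arguments *)
  | Iindexed of int         (* register + displacement *)
  | Iindexed2 of int        (* register + register + displacement *)

type simd_op = Add_f64 | Mul_f64

type specific_operation =
  | Isimd of simd_op                         (* all operands in registers *)
  | ISimd_mem of simd_op * addressing_mode   (* second operand read from memory *)

(* Number of registers needed to compute the address. *)
let num_address_args = function
  | Ibased _ -> 0
  | Iindexed _ -> 1
  | Iindexed2 _ -> 2

(* Number of register arguments of a binary SIMD operation: the memory form
   replaces one vector register with the registers of the address. *)
let num_args = function
  | Isimd _ -> 2
  | ISimd_mem (_, addr) -> 1 + num_address_args addr
```

Every pass that pattern-matches on specific operations then needs a case for the new constructor, which is the kind of propagation described above.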