Optimize Min/Max paths with AVX10.2 intrinsics #112535
For some reason, I can't see any of the linked images in the PR description.
Can you refresh and try again? GitHub sometimes does this. Whenever it happens to me, I refresh the page and can then open the images.
Doesn't help. For example, if I click on the first "image" link, under section 3, it is: https://github.com/user-attachments/assets/993fa944-06fd-4590-8fea-7abd5a8f3888 and I get a 404 "Page not found" error. I think this has been true for all of your PRs in the past, as well. I notice that your profile (https://github.com/khushal1996) doesn't show you as a member of the ".NET Platform" organization, like, for example, Deepak (https://github.com/DeepakRajendrakumaran) and Anthony (https://github.com/anthonycanino). Maybe there's a permissions issue due to that.
It's the same for me (the image link above leads to a 404).
@BruceForstall @En3Tho
Thanks; I can see them now.
@tannergooding can you help review this PR? This PR uses the AVX10.2 instructions for min/max computations in the JIT.
CC. @dotnet/jit-contrib for secondary review
@tannergooding @BruceForstall Can you help move this review forward? It looks like this has been stuck in the approved state for a week.
CC. @EgorBo, this is ready for secondary review
Thanks @EgorBo
Overview
This PR tracks optimizing x64 floating-point min/max operations using the new saturating instructions introduced in AVX10.2. We are following the spec doc to add the new instructions and optimize the x64/x86 conversions.
Addresses #109081
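For readers unfamiliar with why these paths need more than a single legacy min/max instruction: .NET documents `Math.Min` and `Math.Max` as matching the IEEE 754:2019 `minimum` and `maximum` operations, which propagate NaN and order -0.0 below +0.0. The sketch below is a scalar reference model of those semantics only. The helper names `ieee_minimum` and `ieee_maximum` are hypothetical and are not code from this PR or from the JIT.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical reference helpers (assumption: not taken from the PR).
// Model of IEEE 754:2019 `minimum`: NaN propagates, -0.0 orders below +0.0.
double ieee_minimum(double a, double b) {
    if (std::isnan(a)) return a;                  // NaN in the first operand wins
    if (std::isnan(b)) return b;                  // NaN in the second operand wins
    if (a == b) return std::signbit(a) ? a : b;   // equal values: prefer -0.0 over +0.0
    return a < b ? a : b;
}

// Model of IEEE 754:2019 `maximum`: NaN propagates, +0.0 orders above -0.0.
double ieee_maximum(double a, double b) {
    if (std::isnan(a)) return a;
    if (std::isnan(b)) return b;
    if (a == b) return std::signbit(a) ? b : a;   // equal values: prefer +0.0 over -0.0
    return a > b ? a : b;
}

// Edge cases that a plain `a < b ? a : b` would get wrong.
inline void check_min_max_edge_cases() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(std::isnan(ieee_minimum(1.0, nan)));
    assert(std::signbit(ieee_minimum(0.0, -0.0)));
    assert(!std::signbit(ieee_maximum(-0.0, 0.0)));
}
```

The NaN and signed-zero branches are the parts that typically cost extra instructions when the hardware min/max does not handle them directly.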
Testing
Step 1: Run superpmi.exe on the library mch files with JITLateDisasm to check whether any errors occur. JITLateDisasm is used to confirm that the emitted byte stream decodes validly through the LLVM disassembler.
For this step, a new coredistools built from the LLVM repo was used. After running superpmi with JITLateDisasm, no decoding failures were detected. Please contact me for access to the superpmi logs.
Step 2: Run superpmi and check for asmdiffs and assert errors.
Below is the summary of the superpmi run.
Since these diffs are expected, we can conclude that the superpmi run is successful.
Step 3: Run the JIT test suite, using a stable subset of tests, on SDE.
Results

Optimized ASM
Note: Below is a case-by-case comparison of the asm generated for Avx512 vs Avx10.2. The Avx10v2 asm was collected in SDE.
Case: Math.Min
**Test code**
Left Side is base (main, AVX512F) vs Right Side is diff (this PR, AVX10.2)

Case: Vector128.Min
**Test code**
Left Side is base (main, AVX512F) vs Right Side is diff (this PR, AVX10.2)

Case: Math.Max
**Test code**
Left Side is base (main, AVX512F) vs Right Side is diff (this PR, AVX10.2)

Case: Vector512.Max
**Test code**
Left Side is base (main, AVX512F) vs Right Side is diff (this PR, AVX10.2)

Case: Math.MinMagnitude
**Test code**
Left Side is base (main, AVX512F) vs Right Side is diff (this PR, AVX10.2)
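As context for the magnitude cases, here is a scalar sketch of what `Math.MinMagnitude` and `Math.MaxMagnitude` are documented to compute (the IEEE 754:2019 `minimumMagnitude` and `maximumMagnitude` operations): compare by absolute value, propagate NaN, and break ties between equal magnitudes by sign. The function names are hypothetical, and this is a reference model rather than the JIT's implementation.

```cpp
#include <cmath>
#include <limits>

// Hypothetical reference helpers (assumption: not taken from the PR).
// Returns the input with the smaller absolute value; NaN propagates;
// on equal magnitudes the negative input is preferred.
double ieee_minimum_magnitude(double x, double y) {
    const double ax = std::fabs(x);
    const double ay = std::fabs(y);
    if (ax < ay || std::isnan(ax)) return x;       // x is smaller in magnitude, or x is NaN
    if (ax == ay) return std::signbit(x) ? x : y;  // tie: prefer the negative input
    return y;                                      // y is smaller in magnitude, or y is NaN
}

// Returns the input with the larger absolute value; NaN propagates;
// on equal magnitudes the positive input is preferred.
double ieee_maximum_magnitude(double x, double y) {
    const double ax = std::fabs(x);
    const double ay = std::fabs(y);
    if (ax > ay || std::isnan(ax)) return x;       // x is larger in magnitude, or x is NaN
    if (ax == ay) return std::signbit(x) ? y : x;  // tie: prefer the positive input
    return y;                                      // y is larger in magnitude, or y is NaN
}
```

For example, `ieee_minimum_magnitude(-3.0, 2.0)` yields `2.0`, while `ieee_maximum_magnitude(-3.0, 2.0)` yields `-3.0`.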

Case: MathF.MinMagnitude
**Test code**
Left Side is base (main, AVX512F) vs Right Side is diff (this PR, AVX10.2)

Case: Math.MaxMagnitude
**Test code**
Left Side is base (main, AVX512F) vs Right Side is diff (this PR, AVX10.2)

Case: MathF.MaxMagnitude
**Test code**
Left Side is base (main, AVX512F) vs Right Side is diff (this PR, AVX10.2)
