Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding CPUID for AVX10.2 #109302

Merged
merged 5 commits into from
Nov 21, 2024
Merged

Conversation

khushal1996
Copy link
Contributor

@khushal1996 khushal1996 commented Oct 28, 2024

Overview

This PR tracks addition of CPUID for AVX10.2 to runtime. We are following the spec doc to add the CPUID for AVX10.2. Following things have been modified as a part of this PR

  1. Adding a new ISA InstructionSet_Avx10v2 and InstructionSet_Avx10v2_V512
  2. Adding a new environment variable EnableAvx10v2 to JIT.
  3. Handle enabling/disabling of AVX10 ISA in a careful manner. EnableAvx10v1 will take priority over EnableAvx10v2 i.e. if Avx10v1 is disabled, Avx10v2 will be disabled too.
  4. Remove checks for CPUID for AVX10v1_V128 since bit 17 will imply V128 and V256 support i.e. minimum support on AVX10.
    image

Testing Correctness of Logic

Since we don't have the DMR hardware available right now, the testing for this PR was done on internal SDE. Following scenarios were tested on internal SDE

image

  1. Add debug prints to check the value of cpufeatures after setting all the ISA flags.
    image
    image
    image

  2. Enable AVX10v1 and AVX10v2 by default and do the same.
    image

  3. Disable Avx10v1 and check if it disables Avx10v2. (it should disable AVX10v2)
    image

  4. Disable Avx10v2 and verify that Avx10v1 stays enabled.
    image

Testing JIT test suite


The testing was done using the Rex2 unit testing framework. All the changes made for testing CPUID changes are present in this repo

A brief overview -->

  • Unit tests ran to make sure no existing test is breaking
  • SDE embedded in test framework
  • Test ran with Intel internal SDE using dmr machine

Step 1: Run all tests without SDE
image

Step 2: Running all tests on the current branch with sde -dmr and DOTNET_EnableAVX10v2=0
image

Step 3: Running all tests on the current branch with sde -dmr
image

Testing ASMDIFFS


Using jit-diff method, this testing step compares assembly between the main and the current branch and makes sure that we are not changing any existing scenarios.

Note - we cannot use superpmi here since we are adding a new CPUID and hence the jiteeversionguid has changed which would lead to wrong results from superpmi.

image

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 28, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Oct 28, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@khushal1996 khushal1996 marked this pull request as ready for review October 29, 2024 16:15
@khushal1996
Copy link
Contributor Author

@tannergooding Can you please help review this PR?

@tannergooding
Copy link
Member

This has conflicts that need resolving now that #104637 is merged

@khushal1996
Copy link
Contributor Author

@tannergooding PR has been rebased and updated.

@tannergooding
Copy link
Member

I think we're missing some handling in:

  • src/coreclr/tools/Common/Compiler/InstructionSetSupport.cs
  • src/coreclr/tools/Common/InstructionSetHelpers.cs
  • src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/HardwareIntrinsicHelpers.Aot.cs

Since we don't have managed ISAs yet we don't need to handle:

  • src/libraries/System.Private.CoreLib/src/ILLink/ILLink.Substitutions.NoX86Intrinsics.xml
  • src/mono/System.Private.CoreLib/src/ILLink/ILLink.Substitutions.Intrinsics.x86.xml
  • src/mono/mono/mini/simd-intrinsics.c

@khushal1996
Copy link
Contributor Author

I think we're missing some handling in:

  • src/coreclr/tools/Common/Compiler/InstructionSetSupport.cs
  • src/coreclr/tools/Common/InstructionSetHelpers.cs
  • src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/HardwareIntrinsicHelpers.Aot.cs

Since we don't have managed ISAs yet we don't need to handle:

  • src/libraries/System.Private.CoreLib/src/ILLink/ILLink.Substitutions.NoX86Intrinsics.xml
  • src/mono/System.Private.CoreLib/src/ILLink/ILLink.Substitutions.Intrinsics.x86.xml
  • src/mono/mono/mini/simd-intrinsics.c

Thanks for pointing me to this. Looks like this handling was added later to handle AVX10v1 support. I have updated the handling for AVX10v2 accordingly.

@tannergooding
Copy link
Member

CC, @dotnet/jit-contrib for secondary review

@@ -375,6 +375,10 @@ public bool ComputeInstructionSetFlags(int maxVectorTBitWidth,

if (_supportedInstructionSets.Contains("avx10v1"))
_supportedInstructionSets.Add("avx10v1_v512");

// Having AVX10V1-V512 enabled, means having AVX10V2-V512 enabled as well.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this backwards? It seems that if avx10v2_v512 is supported then avx10v1_v512 is also supported, but not vice-versa.

Copy link
Member

@tannergooding tannergooding Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one does look incorrect. I believe it should be if (_supportedInstructionSets.Contains("avx10v2")) so it matches the AVX10v1 handling just above.

Which is to say that if Avx10v2 is supported and AVX512 is supported, then Avx10v2_V512 is supported.

Copy link
Contributor Author

@khushal1996 khushal1996 Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had deliberately kept this as is. By implication

implication ,X86 ,AVX10v2 ,AVX10v1
implication ,X86 ,AVX10v2_V512 ,AVX10v1_V512

And what bruce is saying is correct. If we have AVX10v2_V512, then we would have AVX10v1_V512. Please correct me if I am understanding it wrong. From the design, we cannot have AVX10v1 and have AVX10v2. But to maintain a standard code, I will just change it to what we have for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed this change.

Copy link
Member

@tannergooding tannergooding Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is doing functionally implication ,X86 ,AVX10v1_V512, AVX10v2_V512 where-as the actual implication is implication ,X86 ,AVX10v2_V512, AVX10v1_V512. This would result in a machine/target with only AVX10v1 enabling AVX10v2 support.

The way the logic in this function is meant to work is to cover implications that cannot be tracked trivially by the hierarchy, such as enabling AVX10v2.V512 when AVX10v2 + AVX512 is supported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. thanks for pointing this out @BruceForstall @tannergooding

Comment on lines +1247 to +1270
if (resultflags.HasInstructionSet(InstructionSet.X86_AVX10v1))
resultflags.AddInstructionSet(InstructionSet.X86_AVX10v2);
if (resultflags.HasInstructionSet(InstructionSet.X86_AVX10v1_V512))
resultflags.AddInstructionSet(InstructionSet.X86_AVX10v2_V512);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems odd: it says that if AVX10v1 is supported, then AVX10v2 is supported. Is that correct? I assume the OPPOSITE is true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of ExpandInstructionSetByReverseImplicationHelper, which exists to allow NAOT to enable any dependent ISAs if the user just specifies avx10v2 on the command line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The opposite is true. @khushal1996 please correct this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anthonycanino this part is correct (and is notably autogenerated by the tool). Its part of ExpandInstructionSetByReverseImplicationHelper

@BruceForstall
Copy link
Member

@khushal1996 A change that added another ISA just went in, meaning you need to resolve the merge conflicts in this PR.

@BruceForstall BruceForstall merged commit 5345dc5 into dotnet:main Nov 21, 2024
159 of 162 checks passed
@khushal1996 khushal1996 deleted the kcm-avx102-cpuid branch November 21, 2024 15:09
@khushal1996 khushal1996 restored the kcm-avx102-cpuid branch November 21, 2024 15:09
mikelle-rogers pushed a commit to mikelle-rogers/runtime that referenced this pull request Dec 10, 2024
* Add CPUID for AVX10.2

* Handle AVX10.2 ISAs in missing places

* Remove APX_X64 since it is unused

* remove log file and change handling for AVX10v2_V512

* resolve merge errors
@github-actions github-actions bot locked and limited conversation to collaborators Dec 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants