-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Intel HW Intrinsics classes should use inheritance #25953
Comments
Copying my response from the mentioned thread... The original reasons for splitting this, as I recall, were:
There were several discussions around how to make these APIs the "most usable"/"friendly". In the end, it was originally decided that some things (like software fallbacks) could be provided separately. I believe a shared wrapper API could be provided in the same way (or even as part of the same software fallback library). I do think that merging the APIs may make it "easier to use", but so would merging "AVX" and "AVX2". However, we can't merge "AVX" and "AVX2" because there are still plenty of CPUs and operating systems that only support one of them. Additionally, users can trivially "merge" these APIs themselves with a |
I prefer to merge SSE and SSE2 into one class. From the hardware product perspective, all the 64-bit x86 processors support SSE2 as the SIMD ISA bottom-line. I do not think any existing MSIL platform will consistently support the hardware released before 2001 (right?) From the user experience perspective, merging SSE and SSE2 makes a lot of 128-bit APIs generic and more convenient to use. One of my concern of merging SSE and SSE2 is that finding a good name of the class may be hard, |
Actually, I do not think this is a problem. The codegen of helper intrinsic does not need to be constrained to ISAs. Merging SSE and SSE2 can avoid this kind of "duplicated" APIs, such as |
It may be worth noting that other languages/runtimes that are also bringing up HWIntrinsic support are also keeping the ISAs separated. For example: |
It was main argument behind design of HW APIs. We have done split based on hardware support detection to give fine grained control over implementation.
I do not think it is true as after some additional data checks I have verified that latest Celeron N, Pentium N and Atom C series processors like: https://ark.intel.com/products/95592/Intel-Pentium-Processor-N4200-2M-Cache-up-to-2_5-GHz or do not support any type of SIMD instructions except for AES NI. |
No, I am pretty sure that Goldmont architecture has SSE -SSE4.2 instructions. |
Can you point us into any docs confirming SIMD support in Atom. Another problem is with Celeron N and Pentium N which are also known for lack of SIMD support. |
Please look for the docs of micro-architecture instead of processor product. For example, the newest ATOM/Celeron/Pentium should base on the architecture code-named Goldmont or Goldmont Plus. |
For usage it is way more convenient to have these merged to one class. |
No thanks 😀 To keep the both worlds happy, can we have |
Made a PR dotnet/corefx#29447 to show the effect of merging SSE and SSE2 classes, which may help the discussion here. |
@CarolEidt @tannergooding @terrajobst - I've updated the original comment to follow the API proposal process, and to capture the proposal to use inheritance in the X86 classes. All - Please take a look at the proposal and provide feedback. |
I like the proposal and think it works well. It allows easy access to the various methods from a given class while still keeping the IsSupported checks logical/etc. It also becomes easy to discover the relationship via source control (i.e. Is using Sse okay if Sse3.IsSupported returns true). Props to @terrajobst, who I believe was the original proposer. |
For the 3 may be slightly better as it is also applicable to ARM (cc. @sdmaclea) and may simplify some things (such as |
For the SetZeroVector128 issue, I prefer 2, but there is 4th solution.
Note: we (and C++ compilers) already use the similar approach for optimizing some other helpers (e.g., Avx.SetAllVector256). |
For |
Closing this as resolved, since we are now using inheritance to expose the intrinsics. Any other issues tracked here should be moved to their own new issue. |
A good portion of X86 ISAs all follow a hierarchy of support. If
SSE2
is supported, then implicitlySSE
is supported. Similarly, ifAVX2
is supported, thenAVX
is supported as well.We should design our HW Intrinsics classes in a similar fashion. This allows for simplicity in coding calling functions from 2 (or more) instruction sets:
This means we should have our classes inherit from each other to model this inheritance in the ISAs:
The
IsSupported
hierarchy checks are doc'd in https://software.intel.com/en-us/articles/intel-sdm, for example:The hierarchy is also checked in the .NET runtime: https://github.com/dotnet/coreclr/blob/6bf04a47badd74646e21e70f4e9267c71b7bfd08/src/vm/codeman.cpp#L1307.
There is a bit of an open question: What should
AVX
inherit from? Should it inherit fromSSE/SSE2/SSE4.1/SSE4.2
? The Intel docs don't seem to indicate any SSE is necessary to support AVX, but the .NET runtime will only support AVX if SSE2 is present.NOTE Taking this proposal would mean we will need to drop
static class
from all our classes, sincestatic class
cannot form an inheritance hierarchy.For the original problem of
SetZeroVector128
, we have the following options:SetZeroVector128Float
SetZeroVector128Double
,SetZeroVector128Int32
, etc.Sse2.SetZeroVector128<float>()
, which would match the behavior ofSse.SetZeroVector128()
.SetAllVector128
,SetVector128
, etc.Vector128<T>.Zero;
,new Vector128<T>();
, ordefault
.Original proposal:
This was discussed during the API design (dotnet/corefx#23489 (comment)) but has come under discussion again in the context of dotnet/coreclr#17691.
A related issue is, if the APIs remain separate, how to deal with an API, e.g.
SetZeroVector128
that provides support for float vectors in SSE, but supports the full range of generic types in SSE2.The text was updated successfully, but these errors were encountered: