-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposals for Intel HW intrinsic API changes for vNext #25389
Comments
Current experience with writing very simple My proposal is to change API as follows: // Instead of:
public static Vector128<U> StaticCast<T, U>(Vector128<T> value) where T : struct where U : struct
// Use the following definitions:
public static Vector128<T> Cast<T>(Vector128<double> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<float> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<long> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<ulong> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<int> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<uint> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<short> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<ushort> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<byte> value) where T : struct
public static Vector128<T> Cast<T>(Vector128<sbyte> value) where T : struct The change allows for generic type argument inference and as a result it is not necessary to use any type arguments in most situation. For example instead of writing: /// <summary>
/// __m128i _mm_set1_epi8 (char a)
/// HELPER
/// </summary>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector128<byte> SetAllVector128(byte value)
{
Vector128<byte> vector1 = Sse.StaticCast<uint, byte>(ConvertScalarToVector128UInt32(value));
Vector128<ushort> tmpVector1 = Sse.StaticCast<byte, ushort>(UnpackLow(vector1, vector1));
Vector128<uint> tmpVector2 = Sse.StaticCast<ushort, uint>(UnpackLow(tmpVector1, tmpVector1));
return Sse.StaticCast<uint, byte>(Shuffle(tmpVector2, 0));
} we could write: /// <summary>
/// __m128i _mm_set1_epi8 (char a)
/// HELPER
/// </summary>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector128<byte> SetAllVector128(byte value)
{
Vector128<byte> vector1 = Cast(ConvertScalarToVector128UInt32(value));
Vector128<ushort> tmpVector1 = Cast(UnpackLow(vector1, vector1));
Vector128<uint> tmpVector2 = Cast(UnpackLow(tmpVector1, tmpVector1));
return Cast(Shuffle(tmpVector2, 0));
} And it is possible to use /// <summary>
/// __m128i _mm_set1_epi8 (char a)
/// HELPER
/// </summary>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector128<byte> SetAllVector128(byte value)
{
var vector1 = Cast<byte>(ConvertScalarToVector128UInt32(value));
var tmpVector1 = Cast<ushort>(UnpackLow(vector1, vector1));
var tmpVector2 = Cast<uint>(UnpackLow(tmpVector1, tmpVector1));
return Cast(Shuffle(tmpVector2, 0));
} This change is very easy to implement and perhaps we could do this before we ship v2.1 @CarolEidt @eerhardt @fiigii @jkotas @sdmaclea @tannergooding |
@CarolEidt @eerhardt @fiigii @tannergooding Ping. It is quite serious issue for devs who would be writing code using HW intrinsics and I would strongly recommend changing API before v2.1 ships. Fix is easy and fast. |
Personally, I prefer to always write explicit |
But it is not the way C# code is written by majority of coders. Most of the C# language features introduced over last years are aimed at reducing amount of code written and not at increasing it. |
It is also not technically impossible to support I would also like an opinion from @CarolEidt and/or @eerhardt since this is one of the next API to add to arm64. |
This seems like a reasonable proposal - i.e. the notion of having a cast with a single generic argument. However, we are already well past the point of finalizing 2.1 API changes, and really this kind of feedback is what we are looking to get before finalizing the API. |
I understand it but while writing just couple of functions it was just striking how often that operation is used and how tedious it is to write it. That kind of experience is difficult or impossible to predict without writing real world code what became possible in just last couple of days. If it is too late to change whole API perhaps we can implement |
@jkotas had suggested that |
Yes, I like this |
That's unfortunate. Reinterpret is a much more suitable name for this kind of stuff. No idea why would anyone veto that. |
We may have both: |
With the latter implemented as software wrapper of the |
It is due to different naming conventions and history of C++ vs. C# |
Naming conventions and history have nothing to do with this, they don't impose how to name a feature that's practically new to .NET/C#. |
There is an interesting C# language proposal dotnet/csharplang#1349 |
Agreed with this. We have to stop churning the API at some point so it can be shipped. We can change the API post-2.1, which is why we made it experimental for 2.1. This suggestion can go on that list. |
Does this need to be converted to a proper API proposal issue to move forward, or is it already on a review list somewhere? |
I have most of these on a review list for the next x86 HWIntrinsics review session. |
Looks like the specific issues here were addressed with the |
@saucecontrol Yes, it's true. Closing. |
Writing real code with already implemented HW intrinsics provides very good tests for their usability and quality of design. We start to create implementations which immediately provide direct test of quality of available API surface. I would like to propose to discuss issues and possible change proposals using this issue as an umbrella.
The text was updated successfully, but these errors were encountered: