[mono] Pass Vector64, Vector128, and Vector256 in SIMD registers whenever possible #60068

Open
4 of 14 tasks
imhameed opened this issue Oct 6, 2021 · 9 comments
Labels
area-Codegen-meta-mono runtime-mono specific to the Mono runtime

@imhameed
Contributor

imhameed commented Oct 6, 2021

In particular this should work for FFI calls, so we should respect the target platform's common calling convention(s).

A similar example is 128-bit vectors on amd64.

Handles input arguments:

  • LLVM
    • Arm64
    • Amd64
  • mini
    • Arm64
    • Amd64
  • interpreter

Handles return argument:

  • LLVM
    • Arm64
    • Amd64
  • mini
    • Arm64
    • Amd64
  • interpreter
@imhameed imhameed added this to the 7.0.0 milestone Oct 6, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Oct 6, 2021
@dotnet dotnet deleted a comment from dotnet-issue-labeler bot Oct 6, 2021
@imhameed imhameed added area-Codegen-LLVM-mono runtime-mono specific to the Mono runtime and removed untriaged New issue has not been triaged by the area owner labels Oct 6, 2021
@lambdageek
Member

This would allow us to run these tests https://github.com/dotnet/runtime/tree/main/src/tests/Interop/PInvoke/Generics and to implement mono/mono#17868

A related issue #9578 is tracking the CoreCLR work to unblock passing Vector128/Vector256 by value to P/Invokes. It's currently blocked (with a bad error message about the types not being blittable) due to CoreCLR incorrectly marshalling vector return values on Windows x64.

A related issue is for CoreCLR to implement support for __vectorcall on Windows #8300

@lambdageek
Member

A specific use-case is to be able to call Apple platform APIs that have some arguments that have types with a __ext_vector_type attribute. (see the Mono issue linked above)

@lambdageek
Member

We would need this across the board: LLVM, mini and interp should support it.

@fanyang-mono
Member

fanyang-mono commented Apr 20, 2022

When LLVM is used as the codegen backend, it already does the work of moving the data into SIMD registers before the operation. Here is an example from x64.

C# code:

using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.CompilerServices;
using System.Numerics;

namespace HelloWorld
{
    internal class Program
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        private static Vector128<int> test(Vector128<int> a, Vector128<int> b, Vector128<int> c)
        {
            return Vector128.Min(a,b);
        }

        private static void Main(string[] args)
        {
            Vector128<int> A = Vector128.Create((int)3.1);
            Vector128<int> B = Vector128.Create((int)5.7);
            Vector128<int> C = Vector128.Create((int)50);

            var result = test(A, B, C);
            Console.WriteLine(result);
        }
    }
}

When LLVM is enabled:

*** ASM for HelloWorld.Program:test (System.Runtime.Intrinsics.Vector128`1<int>,System.Runtime.Intrinsics.Vector128`1<int>,System.Runtime.Intrinsics.Vector128`1<int>) ***
/var/folders/9q/30znkg553fb1vt7_qnx2v0040000gn/T/.44S0Ml:
(__TEXT,__text) section
loWorld_Program_test__System_Runtime_Intrinsics_Vector128_1_int__System_Runtime_Intrinsics_Vector128_1_int__System_Runtime_Intrinsics_Vector128_1_int__:
0000000000000000	vmovq	%rdx, %xmm0
0000000000000005	vmovq	%rsi, %xmm1
000000000000000a	vpunpcklqdq	%xmm0, %xmm1, %xmm0
000000000000000e	vmovq	%r8, %xmm1
0000000000000013	vmovq	%rcx, %xmm2
0000000000000018	vpunpcklqdq	%xmm1, %xmm2, %xmm1
000000000000001c	vpminsd	%xmm1, %xmm0, %xmm0
0000000000000021	vmovdqu	%xmm0, (%rdi)
0000000000000025	retq

Thus, this work actually only applies to mini and interp.

@lambdageek
Member

lambdageek commented Apr 21, 2022

@fanyang-mono
In your example see how the arguments are actually passed in %rdx and %rsi and then moved into %xmm0 and %xmm1:

0000000000000000	vmovq	%rdx, %xmm0
0000000000000005	vmovq	%rsi, %xmm1

What this issue is about is two things:

  1. Making managed-to-managed calls use a calling convention where those Vector128<int> arguments could be passed in SIMD registers directly (so those movs aren't needed: the arguments would already be in %xmm0 and %xmm1).
  2. The same thing, but for P/Invokes (i.e. managed code calling unmanaged code). This is also what [Xamarin.iOS] Add runtime support for SIMD types mono/mono#17868 is about.

These two problems could be tackled separately: we could just do the managed-to-managed calls first. But we would still need to do something for LLVM (as your example shows).
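
To make the two cases concrete, here is a minimal sketch of the two call shapes side by side. It is an illustration only, not code from the runtime: the library name "nativevec", the entry point "native_min", and the class and method names are all hypothetical. On runtimes that do not yet support passing Vector128<T> by value to P/Invokes, the native call is expected to fail with a marshalling error (or simply with a missing-library error, since the library does not exist), so it is guarded.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

internal static class SimdCallShapes
{
    // Case 1: managed-to-managed. NoInlining forces a real call, so the
    // calling convention decides where `a` and `b` live on entry.
    [MethodImpl(MethodImplOptions.NoInlining)]
    internal static Vector128<int> ManagedMin(Vector128<int> a, Vector128<int> b)
    {
        return Vector128.Min(a, b);
    }

    // Case 2: managed-to-unmanaged. HYPOTHETICAL: "nativevec" and
    // "native_min" do not exist; they stand in for a C function that
    // takes two 128-bit vectors by value and returns one.
    [DllImport("nativevec", EntryPoint = "native_min")]
    private static extern Vector128<int> NativeMin(Vector128<int> a, Vector128<int> b);

    // Kept in its own non-inlined method so that any failure while
    // preparing the P/Invoke surfaces inside the caller's try block.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static Vector128<int> CallNative(Vector128<int> a, Vector128<int> b)
    {
        return NativeMin(a, b);
    }

    private static void Main()
    {
        Vector128<int> a = Vector128.Create(3);
        Vector128<int> b = Vector128.Create(5);

        Console.WriteLine(ManagedMin(a, b));

        try
        {
            Console.WriteLine(CallNative(a, b));
        }
        catch (Exception e) when (e is DllNotFoundException
                                    or EntryPointNotFoundException
                                    or MarshalDirectiveException)
        {
            // Expected here: the library is made up, and the runtime may
            // also reject the vector-typed signature outright.
            Console.WriteLine($"P/Invoke not available: {e.GetType().Name}");
        }
    }
}
```

For case 1 the runtime is free to pick its own convention; for case 2 it has to match whatever the platform's C ABI does for vector types, which is why the two can be tackled separately.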

@fanyang-mono
Member

Gotcha!

@fanyang-mono
Member

Nice to have, moving to 8.0.0

@fanyang-mono
Member

This is a tracking issue. Moving it to .NET9.

@lewing
Member

lewing commented Feb 9, 2024

cc @steveisok @vargaz

@steveisok steveisok modified the milestones: 9.0.0, Future Aug 9, 2024