Rewrite how Matrix3x2 and Matrix4x4 are implemented #80091

tannergooding · 2023-01-02T05:26:12Z

This rewrites Matrix3x2 and Matrix4x4 so that they can better take advantage of SIMD acceleration and JIT features like field promotion.

In both cases, the entire implementation was moved to a private nested type, Impl, and relies on the fact that bitcasting between these same-sized value types is a "no-op" to the JIT. The new types have 3x Vector2 or 4x Vector4 fields, respectively, in contrast to the 6x float and 16x float fields the main types have. This switch allows the JIT to perform field promotion and therefore also enregistration of the underlying data. Ideally we would have done this directly on the main types, but since they expose their fields publicly, that would have been a breaking change.
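As a rough illustration of the bitcasting idea, the sketch below reinterprets a six-float struct as three Vector2 rows using Unsafe.As. The type names Matrix6, RowsImpl, and Matrix6Helpers are hypothetical stand-ins invented for this example; they are not the types used in the PR, and the real Impl may be wired up differently.

```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;

// Demo: view a 6-float struct as 3x Vector2 and read the rows back.
var m = new Matrix6 { M11 = 1, M12 = 2, M21 = 3, M22 = 4, M31 = 5, M32 = 6 };
RowsImpl rows = Matrix6Helpers.AsImpl(ref m); // copies out of the reinterpreted reference
Console.WriteLine(Unsafe.SizeOf<Matrix6>() == Unsafe.SizeOf<RowsImpl>());
Console.WriteLine(rows.Z.X);
Console.WriteLine(rows.Z.Y);

// Hypothetical stand-in for the public type: six scalar float fields.
public struct Matrix6
{
    public float M11, M12;
    public float M21, M22;
    public float M31, M32;
}

// Hypothetical stand-in for a nested Impl type: the same 24 bytes as three Vector2 rows.
public struct RowsImpl
{
    public Vector2 X, Y, Z;
}

public static class Matrix6Helpers
{
    // Reinterprets the storage in place; no data is copied by the cast itself.
    public static ref RowsImpl AsImpl(ref Matrix6 value)
        => ref Unsafe.As<Matrix6, RowsImpl>(ref value);
}
```

The cast only works because both structs have the same size and a compatible sequential layout, which is the property the description relies on.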

Since we don't currently have "vectorcall" on Windows, and since we can't compose the underlying types using Vector128<float> without it being an ABI break for interop scenarios, Impl takes the large matrix types via in to help reduce copying. However, results are still returned by value since field promotion allows things to get properly optimized. When inlining occurs, the JIT is able to see through the value being passed as in and avoid the value being "address taken".
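The calling pattern described here can be sketched as follows, assuming a hypothetical 64-byte struct named Rows4 in place of the real nested Impl type. Operands are taken via in (a readonly reference, so the caller's value is not copied), while the result is returned by value.

```csharp
using System;
using System.Numerics;

var a = new Rows4 { X = new Vector4(1), Y = new Vector4(2), Z = new Vector4(3), W = new Vector4(4) };
var b = new Rows4 { X = new Vector4(10), Y = new Vector4(20), Z = new Vector4(30), W = new Vector4(40) };
Rows4 sum = Rows4.Add(in a, in b);
Console.WriteLine(sum.W.X);

// Hypothetical 64-byte struct standing in for a nested Impl type with four Vector4 rows.
public struct Rows4
{
    public Vector4 X, Y, Z, W;

    // Each operand is passed by readonly reference; the result is returned by value.
    public static Rows4 Add(in Rows4 left, in Rows4 right)
    {
        Rows4 result;
        result.X = left.X + right.X;
        result.Y = left.Y + right.Y;
        result.Z = left.Z + right.Z;
        result.W = left.W + right.W;
        return result;
    }
}
```

Whether the JIT actually keeps these rows in registers depends on inlining and the runtime version, so this shows only the shape of the signature, not a guaranteed codegen outcome.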

Where trivially recognizable improvements were possible, such as using one vectorized operation in place of multiple scalar operations, they were made. Other methods, which would require more complex changes, were left "as is" as a future exercise.
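To make "a vectorized operation rather than multiple scalar operations" concrete, here is a minimal sketch of an identity check written both ways against the public Matrix3x2 fields. The IdentityChecks class is hypothetical and is not code from the PR; building each Vector2 from the fields is done only for illustration, since a type that already stores Vector2 rows could compare them directly.

```csharp
using System;
using System.Numerics;

Console.WriteLine(IdentityChecks.Scalar(Matrix3x2.Identity));
Console.WriteLine(IdentityChecks.Vectorized(Matrix3x2.Identity));
Console.WriteLine(IdentityChecks.Vectorized(Matrix3x2.CreateScale(2f)));

public static class IdentityChecks
{
    // Scalar form: six independent float comparisons.
    public static bool Scalar(Matrix3x2 m)
        => m.M11 == 1f && m.M22 == 1f
        && m.M12 == 0f && m.M21 == 0f
        && m.M31 == 0f && m.M32 == 0f;

    // Vectorized form: three row-wise Vector2 comparisons.
    public static bool Vectorized(Matrix3x2 m)
        => new Vector2(m.M11, m.M12) == Vector2.UnitX
        && new Vector2(m.M21, m.M22) == Vector2.UnitY
        && new Vector2(m.M31, m.M32) == Vector2.Zero;
}
```

Both methods return the same answer for any input; the difference is only in how many comparisons the JIT has to emit.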

ghost · 2023-01-02T05:26:25Z

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Author:	tannergooding
Assignees:	tannergooding
Labels:	`area-System.Numerics`
Milestone:	-

tannergooding · 2023-01-02T05:27:28Z

Perf_Matrix3x2

Several methods perform basically the same as before, but many are 8-10x faster.


BenchmarkDotNet=v0.13.2.1940-nightly, OS=Windows 11 (10.0.22621.963)
AMD Ryzen 9 7950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=8.0.100-alpha.1.22622.3
  [Host]     : .NET 7.0.1 (7.0.122.56804), X64 RyuJIT AVX2
  Job-GBVKRB : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-KKUZSC : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2

PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250.0000 ms  MaxIterationCount=20  
MinIterationCount=15  WarmupCount=1

Method	Job	Toolchain	Mean	Error	StdDev	Median	Min	Max	Ratio	RatioSD	Allocated	Alloc Ratio
CreateFromScalars	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	4.9651 ns	0.0188 ns	0.0147 ns	4.9587 ns	4.9465 ns	4.9910 ns	0.98	0.00	-	NA
CreateFromScalars	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.0683 ns	0.0318 ns	0.0265 ns	5.0598 ns	5.0346 ns	5.1345 ns	1.00	0.00	-	NA

IdentityBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.2891 ns	0.0284 ns	0.0316 ns	0.2983 ns	0.2343 ns	0.3497 ns	0.94	0.09	-	NA
IdentityBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3011 ns	0.0121 ns	0.0113 ns	0.3028 ns	0.2747 ns	0.3192 ns	1.00	0.00	-	NA

IsIdentityBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3741 ns	0.0266 ns	0.0307 ns	0.3825 ns	0.3133 ns	0.4104 ns	0.09	0.01	-	NA
IsIdentityBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	3.9968 ns	0.0137 ns	0.0129 ns	3.9961 ns	3.9788 ns	4.0166 ns	1.00	0.00	-	NA

AddOperatorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.6576 ns	0.0365 ns	0.0420 ns	0.6685 ns	0.5977 ns	0.7150 ns	0.11	0.01	-	NA
AddOperatorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.0585 ns	0.0506 ns	0.0423 ns	6.0605 ns	5.9802 ns	6.1341 ns	1.00	0.00	-	NA

EqualityOperatorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.7158 ns	0.0330 ns	0.0324 ns	0.7049 ns	0.6855 ns	0.7670 ns	0.20	0.01	-	NA
EqualityOperatorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	3.6199 ns	0.0207 ns	0.0194 ns	3.6185 ns	3.5913 ns	3.6543 ns	1.00	0.00	-	NA

InequalityOperatorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.7187 ns	0.0266 ns	0.0235 ns	0.7167 ns	0.6893 ns	0.7617 ns	0.20	0.01	-	NA
InequalityOperatorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	3.6450 ns	0.0096 ns	0.0085 ns	3.6445 ns	3.6324 ns	3.6631 ns	1.00	0.00	-	NA

MultiplyByMatrixOperatorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	2.3747 ns	0.0120 ns	0.0106 ns	2.3724 ns	2.3640 ns	2.4006 ns	0.34	0.00	-	NA
MultiplyByMatrixOperatorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.9667 ns	0.0521 ns	0.0487 ns	6.9563 ns	6.9077 ns	7.0599 ns	1.00	0.00	-	NA

MultiplyByScalarOperatorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.2555 ns	0.0040 ns	0.0031 ns	0.2546 ns	0.2523 ns	0.2633 ns	0.05	0.00	-	NA
MultiplyByScalarOperatorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.5317 ns	0.0486 ns	0.0455 ns	5.5529 ns	5.4740 ns	5.6181 ns	1.00	0.00	-	NA

SubtractOperatorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.5917 ns	0.0072 ns	0.0056 ns	0.5906 ns	0.5852 ns	0.6020 ns	0.10	0.00	-	NA
SubtractOperatorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.9587 ns	0.0128 ns	0.0120 ns	5.9574 ns	5.9373 ns	5.9763 ns	1.00	0.00	-	NA

NegationOperatorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.4079 ns	0.0263 ns	0.0246 ns	0.4105 ns	0.3618 ns	0.4388 ns	0.07	0.00	-	NA
NegationOperatorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.5423 ns	0.0276 ns	0.0258 ns	5.5319 ns	5.4976 ns	5.5907 ns	1.00	0.00	-	NA

AddBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.5821 ns	0.0197 ns	0.0175 ns	0.5724 ns	0.5637 ns	0.6102 ns	0.09	0.00	-	NA
AddBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.2009 ns	0.0324 ns	0.0288 ns	6.1908 ns	6.1627 ns	6.2701 ns	1.00	0.00	-	NA

CreateRotationBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.9322 ns	0.1702 ns	0.1748 ns	7.8408 ns	7.7737 ns	8.2705 ns	0.71	0.02	-	NA
CreateRotationBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	11.2423 ns	0.0156 ns	0.0138 ns	11.2432 ns	11.2043 ns	11.2617 ns	1.00	0.00	-	NA

CreateRotationWithCenterBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.9738 ns	0.0046 ns	0.0036 ns	7.9753 ns	7.9659 ns	7.9777 ns	0.76	0.00	-	NA
CreateRotationWithCenterBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	10.4790 ns	0.0317 ns	0.0281 ns	10.4836 ns	10.4146 ns	10.5255 ns	1.00	0.00	-	NA

CreateScaleFromScalarXYBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.0963 ns	0.0138 ns	0.0123 ns	0.0966 ns	0.0716 ns	0.1197 ns	0.02	0.00	-	NA
CreateScaleFromScalarXYBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.3071 ns	0.0248 ns	0.0220 ns	5.3034 ns	5.2809 ns	5.3510 ns	1.00	0.00	-	NA

CreateScaleFromScalarWithCenterBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.2611 ns	0.0144 ns	0.0135 ns	0.2645 ns	0.2281 ns	0.2732 ns	0.05	0.00	-	NA
CreateScaleFromScalarWithCenterBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.6407 ns	0.0177 ns	0.0157 ns	5.6417 ns	5.6099 ns	5.6649 ns	1.00	0.00	-	NA

CreateScaleFromScalarXYWithCenterBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3034 ns	0.0177 ns	0.0166 ns	0.3031 ns	0.2741 ns	0.3322 ns	0.05	0.00	-	NA
CreateScaleFromScalarXYWithCenterBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.8240 ns	0.0416 ns	0.0347 ns	5.8347 ns	5.7714 ns	5.8838 ns	1.00	0.00	-	NA

CreateScaleFromScalarBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.0934 ns	0.0131 ns	0.0116 ns	0.0968 ns	0.0738 ns	0.1107 ns	0.02	0.00	-	NA
CreateScaleFromScalarBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.6067 ns	0.0112 ns	0.0099 ns	5.6042 ns	5.5965 ns	5.6319 ns	1.00	0.00	-	NA

CreateScaleFromVectorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.1487 ns	0.0065 ns	0.0060 ns	0.1504 ns	0.1276 ns	0.1527 ns	0.03	0.00	-	NA
CreateScaleFromVectorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.3182 ns	0.0216 ns	0.0202 ns	5.3114 ns	5.2835 ns	5.3544 ns	1.00	0.00	-	NA

CreateScaleFromVectorWithCenterBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3713 ns	0.0227 ns	0.0212 ns	0.3646 ns	0.3441 ns	0.4132 ns	0.06	0.00	-	NA
CreateScaleFromVectorWithCenterBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.7516 ns	0.0482 ns	0.0451 ns	5.7571 ns	5.6696 ns	5.8235 ns	1.00	0.00	-	NA

CreateSkewFromScalarXYBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.1449 ns	0.0136 ns	0.0127 ns	0.1491 ns	0.1222 ns	0.1617 ns	0.03	0.00	-	NA
CreateSkewFromScalarXYBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.5708 ns	0.0890 ns	0.0789 ns	5.6035 ns	5.4177 ns	5.6340 ns	1.00	0.00	-	NA

CreateSkewFromScalarXYWithCenterBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.5619 ns	0.0368 ns	0.0362 ns	0.5638 ns	0.5055 ns	0.6408 ns	0.10	0.01	-	NA
CreateSkewFromScalarXYWithCenterBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.5219 ns	0.1283 ns	0.1318 ns	5.4495 ns	5.3906 ns	5.7763 ns	1.00	0.00	-	NA

CreateTranslationFromVectorBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.0680 ns	0.0070 ns	0.0062 ns	0.0667 ns	0.0520 ns	0.0784 ns	0.08	0.01	-	NA
CreateTranslationFromVectorBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.8681 ns	0.0058 ns	0.0045 ns	0.8702 ns	0.8589 ns	0.8724 ns	1.00	0.00	-	NA

CreateTranslationFromScalarXY	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.0633 ns	0.0124 ns	0.0116 ns	0.0683 ns	0.0407 ns	0.0731 ns	0.02	0.00	-	NA
CreateTranslationFromScalarXY	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	3.5658 ns	0.0451 ns	0.0400 ns	3.5594 ns	3.5153 ns	3.6444 ns	1.00	0.00	-	NA

EqualsBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.9536 ns	0.0051 ns	0.0045 ns	1.9544 ns	1.9469 ns	1.9604 ns	0.29	0.00	-	NA
EqualsBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.8393 ns	0.0417 ns	0.0369 ns	6.8332 ns	6.7947 ns	6.9234 ns	1.00	0.00	-	NA

GetDeterminantBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.4608 ns	0.0152 ns	0.0135 ns	0.4528 ns	0.4494 ns	0.4835 ns	0.47	0.01	-	NA
GetDeterminantBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.9748 ns	0.0251 ns	0.0235 ns	0.9705 ns	0.9475 ns	1.0232 ns	1.00	0.00	-	NA

InvertBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.6303 ns	0.0202 ns	0.0189 ns	0.6205 ns	0.6015 ns	0.6647 ns	0.14	0.00	-	NA
InvertBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	4.3285 ns	0.0488 ns	0.0407 ns	4.3244 ns	4.2588 ns	4.4085 ns	1.00	0.00	-	NA

LerpBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.6910 ns	0.0207 ns	0.0194 ns	0.6787 ns	0.6764 ns	0.7248 ns	0.10	0.00	-	NA
LerpBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.6542 ns	0.0332 ns	0.0259 ns	6.6561 ns	6.5989 ns	6.6922 ns	1.00	0.00	-	NA

MultiplyByMatrixBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	2.3594 ns	0.0069 ns	0.0057 ns	2.3565 ns	2.3529 ns	2.3682 ns	0.34	0.00	-	NA
MultiplyByMatrixBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.9973 ns	0.0787 ns	0.0737 ns	6.9724 ns	6.9030 ns	7.1282 ns	1.00	0.00	-	NA

MultiplyByScalarBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.2768 ns	0.0128 ns	0.0107 ns	0.2741 ns	0.2672 ns	0.2984 ns	0.05	0.00	-	NA
MultiplyByScalarBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.6565 ns	0.0540 ns	0.0505 ns	5.6290 ns	5.5987 ns	5.7278 ns	1.00	0.00	-	NA

NegateBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3687 ns	0.0223 ns	0.0209 ns	0.3787 ns	0.3385 ns	0.4110 ns	0.07	0.00	-	NA
NegateBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.5659 ns	0.0241 ns	0.0201 ns	5.5643 ns	5.5339 ns	5.6090 ns	1.00	0.00	-	NA

SubtractBenchmark	Job-BFZBNC	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.5683 ns	0.0057 ns	0.0045 ns	0.5684 ns	0.5622 ns	0.5763 ns	0.10	0.00	-	NA
SubtractBenchmark	Job-RPNVZF	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.9662 ns	0.0187 ns	0.0175 ns	5.9652 ns	5.9386 ns	6.0038 ns	1.00	0.00	-	NA
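
For reading the Ratio column, the speedup is roughly the baseline mean (the \runtime_base\ row, Ratio 1.00) divided by the mean of the \runtime\ row. The sketch below applies that to the AddOperatorBenchmark means from the table above; the BenchmarkMath helper is a hypothetical name used only for this illustration.

```csharp
using System;

// Means taken from the AddOperatorBenchmark rows above (nanoseconds).
double baselineMeanNs = 6.0585; // \runtime_base\ build
double newMeanNs = 0.6576;      // \runtime\ build
Console.WriteLine(BenchmarkMath.Speedup(baselineMeanNs, newMeanNs));

public static class BenchmarkMath
{
    // Speedup factor: how many times faster the new build is than the baseline.
    public static double Speedup(double baselineMeanNs, double newMeanNs)
        => baselineMeanNs / newMeanNs;
}
```

For these two rows the quotient is a little over 9, which is consistent with the reported Ratio of 0.11.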

tannergooding · 2023-01-02T05:38:05Z

Perf_Matrix4x4

Similarly to Matrix3x2, several methods perform basically the same as before, but many are 8-10x faster.

BenchmarkDotNet=v0.13.2.1940-nightly, OS=Windows 11 (10.0.22621.963)
AMD Ryzen 9 7950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=8.0.100-alpha.1.22622.3
  [Host]     : .NET 7.0.1 (7.0.122.56804), X64 RyuJIT AVX2
  Job-RTZTKA : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-SYBZSC : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2

PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250.0000 ms  MaxIterationCount=20  
MinIterationCount=15  WarmupCount=1

Method	Job	Toolchain	Mean	Error	StdDev	Median	Min	Max	Ratio	RatioSD	Allocated	Alloc Ratio
CreateFromMatrix3x2	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.7186 ns	0.1231 ns	0.1091 ns	5.7254 ns	5.5578 ns	5.9677 ns	0.77	0.02	-	NA
CreateFromMatrix3x2	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.4013 ns	0.1556 ns	0.1598 ns	7.4386 ns	7.1388 ns	7.7430 ns	1.00	0.00	-	NA

CreateFromScalars	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.4273 ns	0.1417 ns	0.1325 ns	7.4040 ns	7.2543 ns	7.6568 ns	1.22	0.03	-	NA
CreateFromScalars	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.0820 ns	0.1304 ns	0.1220 ns	6.0254 ns	5.9401 ns	6.2634 ns	1.00	0.00	-	NA

IdentityBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3045 ns	0.0118 ns	0.0104 ns	0.2998 ns	0.2959 ns	0.3265 ns	0.77	0.03	-	NA
IdentityBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3947 ns	0.0093 ns	0.0078 ns	0.3972 ns	0.3768 ns	0.4038 ns	1.00	0.00	-	NA

IsIdentityBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.5253 ns	0.0289 ns	0.0297 ns	0.5226 ns	0.4875 ns	0.5953 ns	0.10	0.01	-	NA
IsIdentityBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.0435 ns	0.0478 ns	0.0399 ns	5.0448 ns	4.9797 ns	5.1387 ns	1.00	0.00	-	NA

TranslationBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.5664 ns	0.0379 ns	0.0355 ns	1.5690 ns	1.5266 ns	1.6379 ns	1.11	0.03	-	NA
TranslationBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.4127 ns	0.0271 ns	0.0240 ns	1.4006 ns	1.3928 ns	1.4701 ns	1.00	0.00	-	NA

AddOperatorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.3684 ns	0.0170 ns	0.0151 ns	1.3648 ns	1.3495 ns	1.4050 ns	0.16	0.00	-	NA
AddOperatorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	8.6257 ns	0.0755 ns	0.0706 ns	8.6375 ns	8.5019 ns	8.7320 ns	1.00	0.00	-	NA

EqualityOperatorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.0463 ns	0.0154 ns	0.0137 ns	1.0482 ns	1.0278 ns	1.0698 ns	0.26	0.00	-	NA
EqualityOperatorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	3.9929 ns	0.0114 ns	0.0089 ns	3.9951 ns	3.9766 ns	4.0038 ns	1.00	0.00	-	NA

InequalityOperatorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.0747 ns	0.0339 ns	0.0317 ns	1.0647 ns	1.0322 ns	1.1227 ns	0.28	0.01	-	NA
InequalityOperatorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	3.8736 ns	0.0210 ns	0.0197 ns	3.8790 ns	3.8392 ns	3.9153 ns	1.00	0.00	-	NA

MultiplyByMatrixOperatorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.8015 ns	0.1133 ns	0.1060 ns	7.7771 ns	7.6583 ns	7.9914 ns	0.99	0.01	-	NA
MultiplyByMatrixOperatorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.8457 ns	0.0591 ns	0.0553 ns	7.8291 ns	7.7719 ns	7.9635 ns	1.00	0.00	-	NA

MultiplyByScalarOperatorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.5161 ns	0.0145 ns	0.0129 ns	0.5169 ns	0.4887 ns	0.5312 ns	0.08	0.00	-	NA
MultiplyByScalarOperatorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.3403 ns	0.0669 ns	0.0626 ns	6.3263 ns	6.2635 ns	6.4720 ns	1.00	0.00	-	NA

SubtractOperatorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.3854 ns	0.0180 ns	0.0169 ns	1.3858 ns	1.3638 ns	1.4254 ns	0.16	0.00	-	NA
SubtractOperatorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	8.4184 ns	0.0333 ns	0.0312 ns	8.4219 ns	8.3506 ns	8.4695 ns	1.00	0.00	-	NA

NegationOperatorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3624 ns	0.0186 ns	0.0174 ns	0.3672 ns	0.3383 ns	0.3953 ns	0.05	0.00	-	NA
NegationOperatorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.7008 ns	0.1207 ns	0.1129 ns	6.7055 ns	6.5304 ns	6.9153 ns	1.00	0.00	-	NA

AddBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.3825 ns	0.0165 ns	0.0155 ns	1.3837 ns	1.3611 ns	1.4089 ns	0.19	0.00	-	NA
AddBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.2156 ns	0.0417 ns	0.0348 ns	7.2143 ns	7.1652 ns	7.2991 ns	1.00	0.00	-	NA

CreateBillboardBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	8.9895 ns	0.0349 ns	0.0291 ns	8.9829 ns	8.9512 ns	9.0441 ns	0.89	0.00	-	NA
CreateBillboardBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	10.0997 ns	0.0518 ns	0.0459 ns	10.1030 ns	9.9851 ns	10.1490 ns	1.00	0.00	-	NA

CreateConstrainedBillboardBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	11.3686 ns	0.0691 ns	0.0540 ns	11.3543 ns	11.3251 ns	11.5260 ns	0.87	0.01	-	NA
CreateConstrainedBillboardBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	13.0647 ns	0.1123 ns	0.1050 ns	13.0303 ns	12.9043 ns	13.2049 ns	1.00	0.00	-	NA

CreateFromAxisAngleBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.8911 ns	0.0111 ns	0.0098 ns	1.8895 ns	1.8715 ns	1.9084 ns	0.16	0.00	-	NA
CreateFromAxisAngleBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	11.6160 ns	0.0395 ns	0.0350 ns	11.6142 ns	11.5368 ns	11.6829 ns	1.00	0.00	-	NA

CreateFromQuaternionBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.4393 ns	0.0080 ns	0.0075 ns	0.4392 ns	0.4236 ns	0.4511 ns	0.06	0.00	-	NA
CreateFromQuaternionBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.4897 ns	0.0657 ns	0.0582 ns	7.4866 ns	7.3981 ns	7.6055 ns	1.00	0.00	-	NA

CreateFromYawPitchRollBenchmarkBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	22.3193 ns	0.0852 ns	0.0666 ns	22.2985 ns	22.2481 ns	22.4866 ns	0.79	0.01	-	NA
CreateFromYawPitchRollBenchmarkBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	28.2116 ns	0.3037 ns	0.2692 ns	28.2456 ns	27.7884 ns	28.7217 ns	1.00	0.00	-	NA

CreateLookAtBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	11.4790 ns	0.1363 ns	0.1275 ns	11.4770 ns	11.2995 ns	11.7111 ns	0.90	0.01	-	NA
CreateLookAtBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	12.7395 ns	0.0854 ns	0.0667 ns	12.7488 ns	12.6146 ns	12.8663 ns	1.00	0.00	-	NA

CreateOrthographicBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3336 ns	0.0164 ns	0.0154 ns	0.3326 ns	0.3053 ns	0.3566 ns	0.06	0.00	-	NA
CreateOrthographicBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.8071 ns	0.0845 ns	0.0706 ns	5.8197 ns	5.6732 ns	5.8823 ns	1.00	0.00	-	NA

CreateOrthographicOffCenterBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3123 ns	0.0096 ns	0.0085 ns	0.3159 ns	0.2992 ns	0.3217 ns	0.05	0.00	-	NA
CreateOrthographicOffCenterBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.8799 ns	0.0811 ns	0.0758 ns	5.8671 ns	5.7781 ns	6.0336 ns	1.00	0.00	-	NA

CreatePerspectiveBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3263 ns	0.0156 ns	0.0146 ns	0.3268 ns	0.3066 ns	0.3488 ns	0.05	0.00	-	NA
CreatePerspectiveBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.0034 ns	0.1592 ns	0.1490 ns	6.9492 ns	6.8534 ns	7.3128 ns	1.00	0.00	-	NA

CreatePerspectiveFieldOfViewBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3243 ns	0.0120 ns	0.0112 ns	0.3234 ns	0.3108 ns	0.3455 ns	0.03	0.00	-	NA
CreatePerspectiveFieldOfViewBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	10.7003 ns	0.1630 ns	0.1361 ns	10.6520 ns	10.5363 ns	11.0060 ns	1.00	0.00	-	NA

CreatePerspectiveOffCenterBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3202 ns	0.0165 ns	0.0154 ns	0.3232 ns	0.2986 ns	0.3486 ns	0.05	0.00	-	NA
CreatePerspectiveOffCenterBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.1086 ns	0.0973 ns	0.0910 ns	7.1155 ns	6.9250 ns	7.2359 ns	1.00	0.00	-	NA

CreateReflectionBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	4.8193 ns	0.0894 ns	0.0836 ns	4.8361 ns	4.6652 ns	4.9516 ns	0.59	0.02	-	NA
CreateReflectionBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	8.2192 ns	0.1659 ns	0.1552 ns	8.1902 ns	7.8945 ns	8.5008 ns	1.00	0.00	-	NA

CreateRotationXBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3146 ns	0.0158 ns	0.0148 ns	0.3147 ns	0.2937 ns	0.3409 ns	0.05	0.00	-	NA
CreateRotationXBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.8153 ns	0.0736 ns	0.0688 ns	5.8154 ns	5.7112 ns	5.9499 ns	1.00	0.00	-	NA

CreateRotationXWithCenterBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.6149 ns	0.0175 ns	0.0163 ns	0.6152 ns	0.5953 ns	0.6425 ns	0.10	0.00	-	NA
CreateRotationXWithCenterBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.1406 ns	0.0642 ns	0.0600 ns	6.1365 ns	6.0105 ns	6.2329 ns	1.00	0.00	-	NA

CreateRotationYBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3107 ns	0.0158 ns	0.0140 ns	0.3123 ns	0.2970 ns	0.3412 ns	0.05	0.00	-	NA
CreateRotationYBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.8733 ns	0.0612 ns	0.0573 ns	5.8874 ns	5.7797 ns	5.9729 ns	1.00	0.00	-	NA

CreateRotationYWithCenterBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.4479 ns	0.0225 ns	0.0199 ns	0.4489 ns	0.4208 ns	0.4829 ns	0.07	0.00	-	NA
CreateRotationYWithCenterBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.9741 ns	0.1260 ns	0.1179 ns	5.9466 ns	5.8418 ns	6.1852 ns	1.00	0.00	-	NA

CreateRotationZBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3078 ns	0.0150 ns	0.0140 ns	0.3164 ns	0.2901 ns	0.3261 ns	0.05	0.00	-	NA
CreateRotationZBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.7642 ns	0.0743 ns	0.0621 ns	5.7805 ns	5.6699 ns	5.8536 ns	1.00	0.00	-	NA

CreateRotationZWithCenterBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.4319 ns	0.0182 ns	0.0161 ns	0.4247 ns	0.4191 ns	0.4686 ns	0.07	0.00	-	NA
CreateRotationZWithCenterBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.8833 ns	0.0946 ns	0.0885 ns	5.9124 ns	5.7486 ns	6.0152 ns	1.00	0.00	-	NA

CreateScaleFromVectorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.4475 ns	0.0206 ns	0.0193 ns	0.4520 ns	0.4169 ns	0.4739 ns	0.07	0.00	-	NA
CreateScaleFromVectorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.6259 ns	0.0676 ns	0.0633 ns	6.6188 ns	6.5377 ns	6.7354 ns	1.00	0.00	-	NA

CreateScaleFromVectorWithCenterBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.7149 ns	0.0316 ns	0.0311 ns	0.7092 ns	0.6183 ns	0.7498 ns	0.11	0.01	-	NA
CreateScaleFromVectorWithCenterBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.5183 ns	0.3281 ns	0.3778 ns	6.6314 ns	5.3836 ns	6.7184 ns	1.00	0.00	-	NA

CreateScaleFromScalarBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3274 ns	0.0132 ns	0.0123 ns	0.3285 ns	0.3035 ns	0.3503 ns	0.06	0.00	-	NA
CreateScaleFromScalarBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.3300 ns	0.0372 ns	0.0330 ns	5.3229 ns	5.2858 ns	5.4090 ns	1.00	0.00	-	NA

CreateScaleFromScalarWithCenterBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.4449 ns	0.0174 ns	0.0162 ns	0.4409 ns	0.4233 ns	0.4781 ns	0.07	0.00	-	NA
CreateScaleFromScalarWithCenterBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.7398 ns	0.0379 ns	0.0354 ns	6.7354 ns	6.6776 ns	6.8134 ns	1.00	0.00	-	NA

CreateScaleFromScalarXYZBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3607 ns	0.0029 ns	0.0024 ns	0.3607 ns	0.3571 ns	0.3650 ns	0.06	0.00	-	NA
CreateScaleFromScalarXYZBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.2929 ns	0.0138 ns	0.0115 ns	6.2922 ns	6.2760 ns	6.3118 ns	1.00	0.00	-	NA

CreateScaleFromScalarXYZWithCenterBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3391 ns	0.0046 ns	0.0038 ns	0.3381 ns	0.3344 ns	0.3481 ns	0.05	0.00	-	NA
CreateScaleFromScalarXYZWithCenterBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.3082 ns	0.0305 ns	0.0270 ns	6.3145 ns	6.2594 ns	6.3466 ns	1.00	0.00	-	NA

CreateShadowBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.9657 ns	0.0848 ns	0.0793 ns	5.9953 ns	5.8548 ns	6.0817 ns	0.67	0.02	-	NA
CreateShadowBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	8.9197 ns	0.1424 ns	0.1332 ns	8.9224 ns	8.6955 ns	9.2007 ns	1.00	0.00	-	NA

CreateTranslationFromVectorBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.5029 ns	0.0102 ns	0.0091 ns	0.5011 ns	0.4890 ns	0.5197 ns	0.12	0.00	-	NA
CreateTranslationFromVectorBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	4.2677 ns	0.0993 ns	0.0928 ns	4.2776 ns	4.1042 ns	4.4482 ns	1.00	0.00	-	NA

CreateTranslationFromScalarXYZ	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3198 ns	0.0060 ns	0.0050 ns	0.3196 ns	0.3089 ns	0.3272 ns	0.07	0.00	-	NA
CreateTranslationFromScalarXYZ	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	4.2804 ns	0.0330 ns	0.0293 ns	4.2698 ns	4.2474 ns	4.3335 ns	1.00	0.00	-	NA

CreateWorldBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	9.0783 ns	0.0204 ns	0.0190 ns	9.0804 ns	9.0384 ns	9.1063 ns	0.91	0.01	-	NA
CreateWorldBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	9.9490 ns	0.1196 ns	0.1118 ns	9.9088 ns	9.8051 ns	10.1024 ns	1.00	0.00	-	NA

DecomposeBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	11.280 ns	0.0250 ns	0.0220 ns	11.2800 ns	11.2400 ns	11.3100 ns	0.85		-	NA
DecomposeBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	13.300 ns	0.1020 ns	0.0950 ns	13.3000 ns	13.1700 ns	13.4800 ns	1.00		-	NA

EqualsBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	2.0177 ns	0.0177 ns	0.0157 ns	2.0103 ns	2.0036 ns	2.0489 ns	0.59	0.01	-	NA
EqualsBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	3.4028 ns	0.0330 ns	0.0276 ns	3.3997 ns	3.3657 ns	3.4665 ns	1.00	0.00	-	NA

GetDeterminantBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	4.6025 ns	0.0447 ns	0.0396 ns	4.5815 ns	4.5720 ns	4.6981 ns	0.82	0.01	-	NA
GetDeterminantBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	5.6268 ns	0.0515 ns	0.0482 ns	5.6173 ns	5.5592 ns	5.7081 ns	1.00	0.00	-	NA

InvertBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.7855 ns	0.0412 ns	0.0366 ns	7.7730 ns	7.7456 ns	7.8545 ns	0.87	0.01	-	NA
InvertBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	8.9029 ns	0.0909 ns	0.0850 ns	8.8822 ns	8.7785 ns	9.0647 ns	1.00	0.00	-	NA

LerpBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.5310 ns	0.0173 ns	0.0144 ns	1.5334 ns	1.5047 ns	1.5544 ns	0.18	0.00	-	NA
LerpBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	8.5167 ns	0.0811 ns	0.0719 ns	8.4969 ns	8.4284 ns	8.6710 ns	1.00	0.00	-	NA

MultiplyByMatrixBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.9887 ns	0.1083 ns	0.1013 ns	8.0464 ns	7.7904 ns	8.0950 ns	1.13	0.02	-	NA
MultiplyByMatrixBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.0450 ns	0.0554 ns	0.0518 ns	7.0302 ns	6.9720 ns	7.1450 ns	1.00	0.00	-	NA

MultiplyByScalarBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.5069 ns	0.0169 ns	0.0150 ns	0.5029 ns	0.4923 ns	0.5445 ns	0.08	0.00	-	NA
MultiplyByScalarBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.5937 ns	0.0859 ns	0.0803 ns	6.5834 ns	6.4279 ns	6.7496 ns	1.00	0.00	-	NA

NegateBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	0.3783 ns	0.0179 ns	0.0167 ns	0.3738 ns	0.3494 ns	0.4042 ns	0.06	0.00	-	NA
NegateBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.7202 ns	0.1430 ns	0.1338 ns	6.7269 ns	6.5364 ns	6.9876 ns	1.00	0.00	-	NA

SubtractBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	1.3814 ns	0.0294 ns	0.0275 ns	1.3656 ns	1.3528 ns	1.4471 ns	0.19	0.00	-	NA
SubtractBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.1556 ns	0.0450 ns	0.0421 ns	7.1742 ns	7.0941 ns	7.2032 ns	1.00	0.00	-	NA

TransformBenchmark	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	8.6916 ns	0.0941 ns	0.0880 ns	8.6864 ns	8.5753 ns	8.8363 ns	1.10	0.01	-	NA
TransformBenchmark	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	7.8839 ns	0.0683 ns	0.0638 ns	7.8759 ns	7.7661 ns	7.9791 ns	1.00	0.00	-	NA

Transpose	Job-PGRVDA	\runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.0994 ns	0.0870 ns	0.0771 ns	6.0866 ns	6.0013 ns	6.2512 ns	0.96	0.02	-	NA
Transpose	Job-FNYOZA	\runtime_base\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\corerun.exe	6.3162 ns	0.1108 ns	0.1037 ns	6.3172 ns	6.1311 ns	6.4566 ns	1.00	0.00	-	NA
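One way to read the rows above is to divide the mean of the PR build (the `\runtime\` job) by the mean of the baseline build (the `\runtime_base\` job). The sketch below does that for two rows. It assumes the first numeric column is the mean and that the column showing 1.00 for the baseline is that quotient; `RatioDemo` and `Ratio` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Globalization;

public static class RatioDemo
{
    // Quotient of the PR build's mean over the baseline's mean.
    // Values below 1.00 mean the PR build is faster; values above 1.00 mean it is slower.
    public static double Ratio(double prMeanNs, double baseMeanNs) => prMeanNs / baseMeanNs;

    // Format with two decimals, independent of the machine's culture settings.
    public static string Format(double ratio) => ratio.ToString("F2", CultureInfo.InvariantCulture);

    public static void Main()
    {
        // Means copied from the CreateRotationYBenchmark rows (an improvement).
        Console.WriteLine(Format(Ratio(0.3107, 5.8733))); // prints 0.05

        // Means copied from the MultiplyByMatrixBenchmark rows (a regression).
        Console.WriteLine(Format(Ratio(7.9887, 7.0450))); // prints 1.13
    }
}
```

Both results agree with the values listed in the corresponding rows.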

tannergooding · 2023-01-02T05:41:27Z

Codegen for many of these will improve even more once #80083 goes in, since it improves the loading/storing of Vector3 (TYP_SIMD12).

tannergooding · 2023-01-02T05:50:28Z

An example diff for Matrix4x4:get_IsIdentity() is below.

As you can see, we previously did 16 separate comparisons and as many as 32 branches. Now we instead do 4 direct comparisons against the identity constant and no more than 4 branches.

Before:

; Assembly listing for method System.Numerics.Matrix4x4:get_IsIdentity():bool:this
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 this         [V00,T00] ( 18, 10.50)   byref  ->  rcx         this single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;  V02 cse0         [V02,T01] (  5,  3.50)   float  ->  mm1         "CSE - aggressive"
;
; Lcl frame size = 0

G_M52050_IG01:
       vzeroupper 
						;; size=3 bbWeight=1    PerfScore 1.00
G_M52050_IG02:
       vmovss   xmm0, dword ptr [rcx]
       vmovss   xmm1, dword ptr [reloc @RWD00]
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
						;; size=28 bbWeight=1    PerfScore 11.00
G_M52050_IG03:
       vmovss   xmm0, dword ptr [rcx+14H]
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+28H]
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+3CH]
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+04H]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+08H]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+0CH]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+10H]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+18H]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       G_M52050_IG06
       jne      G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+1CH]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       SHORT G_M52050_IG06
						;; size=203 bbWeight=0.50 PerfScore 36.50
G_M52050_IG04:
       jne      SHORT G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+20H]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       SHORT G_M52050_IG06
       jne      SHORT G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+24H]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       SHORT G_M52050_IG06
       jne      SHORT G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+2CH]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       SHORT G_M52050_IG06
       jne      SHORT G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+30H]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       SHORT G_M52050_IG06
       jne      SHORT G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+34H]
       vxorps   xmm1, xmm1
       vucomiss xmm0, xmm1
       jp       SHORT G_M52050_IG06
       jne      SHORT G_M52050_IG06
       vmovss   xmm0, dword ptr [rcx+38H]
       vxorps   xmm1, xmm1
       xor      eax, eax
       vucomiss xmm0, xmm1
       setnp    al
       jp       SHORT G_M52050_IG05
       sete     al
						;; size=110 bbWeight=0.50 PerfScore 26.13
G_M52050_IG05:
       ret      
						;; size=1 bbWeight=0.50 PerfScore 0.50
G_M52050_IG06:
       xor      eax, eax
						;; size=2 bbWeight=0.50 PerfScore 0.12
G_M52050_IG07:
       ret      
						;; size=1 bbWeight=0.50 PerfScore 0.50
RWD00  	dd	3F800000h		;         1


; Total bytes of code 348, prolog size 3, PerfScore 110.55, instruction count 83, allocated bytes for code 348 (MethodHash=dc9134ad) for method System.Numerics.Matrix4x4:get_IsIdentity():bool:this

After:

; Assembly listing for method System.Numerics.Matrix4x4:get_IsIdentity():bool:this
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; 0 inlinees with PGO data; 2 single block inlinees; 1 inlinees without PGO data
; Final local variable assignments
;
;  V00 this         [V00,T02] (  3,  3   )   byref  ->  rcx         this single-def
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+00H]   "OutgoingArgSpace"
;  V02 tmp1         [V02,T03] (  2,  4   )  struct (64) [rsp+48H]   do-not-enreg[SF] "impAppendStmt"
;* V03 tmp2         [V03    ] (  0,  0   )  struct (64) zero-ref    do-not-enreg[S] "struct address for call/obj"
;* V04 tmp3         [V04,T00] (  0,  0   )  struct (64) zero-ref    do-not-enreg[SF] ld-addr-op "NewObj constructor temp"
;* V05 tmp4         [V05,T10] (  0,  0   )  simd16  ->  zero-ref    ld-addr-op "NewObj constructor temp"
;* V06 tmp5         [V06,T11] (  0,  0   )  simd16  ->  zero-ref    ld-addr-op "NewObj constructor temp"
;* V07 tmp6         [V07,T12] (  0,  0   )  simd16  ->  zero-ref    ld-addr-op "NewObj constructor temp"
;* V08 tmp7         [V08,T13] (  0,  0   )  simd16  ->  zero-ref    ld-addr-op "NewObj constructor temp"
;  V09 tmp8         [V09,T04] (  3,  2   )    bool  ->  rax         "Inline return value spill temp"
;  V10 tmp9         [V10,T01] (  5,  7   )  struct (64) [rsp+08H]   do-not-enreg[SF] "Inlining Arg"
;* V11 tmp10        [V11,T05] (  0,  0   )  struct (64) zero-ref    do-not-enreg[SF] "Inlining Arg"
;  V12 cse0         [V12,T06] (  2,  2   )  simd16  ->  mm0         "CSE - aggressive"
;  V13 cse1         [V13,T07] (  2,  1.50)  simd16  ->  mm1         "CSE - aggressive"
;  V14 cse2         [V14,T08] (  2,  1.50)  simd16  ->  mm2         "CSE - aggressive"
;  V15 cse3         [V15,T09] (  2,  1.50)  simd16  ->  mm3         "CSE - aggressive"
;
; Lcl frame size = 136

G_M52050_IG01:
       sub      rsp, 136
       vzeroupper 
						;; size=10 bbWeight=1    PerfScore 1.25
G_M52050_IG02:
       vmovdqu  ymm0, ymmword ptr[rcx]
       vmovdqu  ymmword ptr[rsp+48H], ymm0
       vmovdqu  ymm0, ymmword ptr[rcx+20H]
       vmovdqu  ymmword ptr[rsp+68H], ymm0
       vmovupd  xmm0, xmmword ptr [reloc @RWD00]
       vmovupd  xmm1, xmmword ptr [reloc @RWD16]
       vmovupd  xmm2, xmmword ptr [reloc @RWD32]
       vmovupd  xmm3, xmmword ptr [reloc @RWD48]
       vmovdqu  ymm4, ymmword ptr[rsp+48H]
       vmovdqu  ymmword ptr[rsp+08H], ymm4
       vmovdqu  ymm4, ymmword ptr[rsp+68H]
       vmovdqu  ymmword ptr[rsp+28H], ymm4
       vcmpps   xmm0, xmm0, xmmword ptr [rsp+08H], 0
       vmovmskps xrax, xmm0
       cmp      eax, 15
       jne      SHORT G_M52050_IG04
						;; size=93 bbWeight=1    PerfScore 40.25
G_M52050_IG03:
       vcmpps   xmm0, xmm1, xmmword ptr [rsp+18H], 0
       vmovmskps xrax, xmm0
       cmp      eax, 15
       jne      SHORT G_M52050_IG04
       vcmpps   xmm0, xmm2, xmmword ptr [rsp+28H], 0
       vmovmskps xrax, xmm0
       cmp      eax, 15
       jne      SHORT G_M52050_IG04
       vcmpps   xmm0, xmm3, xmmword ptr [rsp+38H], 0
       vmovmskps xrax, xmm0
       cmp      eax, 15
       sete     al
       movzx    rax, al
       jmp      SHORT G_M52050_IG05
						;; size=54 bbWeight=0.50 PerfScore 10.50
G_M52050_IG04:
       xor      eax, eax
						;; size=2 bbWeight=0.50 PerfScore 0.12
G_M52050_IG05:
       add      rsp, 136
       ret      
						;; size=8 bbWeight=1    PerfScore 1.25
RWD00  	dq	000000003F800000h, 0000000000000000h
RWD16  	dq	3F80000000000000h, 0000000000000000h
RWD32  	dq	0000000000000000h, 000000003F800000h
RWD48  	dq	0000000000000000h, 3F80000000000000h


; Total bytes of code 167, prolog size 10, PerfScore 70.08, instruction count 35, allocated bytes for code 167 (MethodHash=dc9134ad) for method System.Numerics.Matrix4x4:get_IsIdentity():bool:this
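As a rough source-level illustration of the two listings above, the sketch below contrasts an element-by-element check with a row-wise check that compares four Vector4 rows against the rows of the identity matrix. `RowMatrix` and `IdentityCheck` are hypothetical names made up for this example, and the `Unsafe.As` reinterpretation is only an assumption about how a same-sized bitcast might be written; it is not the actual `Impl` code from this PR.

```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;

// Hypothetical helper type: four 16-byte rows, 64 bytes in total,
// the same size as the 16 float fields of Matrix4x4.
public struct RowMatrix
{
    public Vector4 X, Y, Z, W;
}

public static class IdentityCheck
{
    // Shape of the "Before" listing: one scalar comparison per element.
    public static bool IsIdentityScalar(Matrix4x4 m) =>
        m.M11 == 1f && m.M22 == 1f && m.M33 == 1f && m.M44 == 1f &&
        m.M12 == 0f && m.M13 == 0f && m.M14 == 0f &&
        m.M21 == 0f && m.M23 == 0f && m.M24 == 0f &&
        m.M31 == 0f && m.M32 == 0f && m.M34 == 0f &&
        m.M41 == 0f && m.M42 == 0f && m.M43 == 0f;

    // Shape of the "After" listing: one Vector4 comparison per row.
    public static bool IsIdentityRows(Matrix4x4 m)
    {
        // Reinterpret the local copy of the matrix as four rows (assumed layout match).
        RowMatrix r = Unsafe.As<Matrix4x4, RowMatrix>(ref m);

        return r.X == Vector4.UnitX   // (1, 0, 0, 0)
            && r.Y == Vector4.UnitY   // (0, 1, 0, 0)
            && r.Z == Vector4.UnitZ   // (0, 0, 1, 0)
            && r.W == Vector4.UnitW;  // (0, 0, 0, 1)
    }

    public static void Main()
    {
        Console.WriteLine(IsIdentityRows(Matrix4x4.Identity));        // prints True
        Console.WriteLine(IsIdentityRows(Matrix4x4.CreateScale(2f))); // prints False
    }
}
```

Both helpers return the same answer for any input. The row-wise form simply gives the JIT four whole-vector equality tests to work with instead of sixteen scalar ones.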

tannergooding · 2023-01-02T05:59:43Z

Still need to look into a couple of edge cases where perf regressed slightly. The disassembly suggests the regressions come from extra copies where we'd ideally have none.

tannergooding · 2023-01-03T20:14:46Z

@radekdoulik would you be able to assist (or ping someone from the WASM team who can) with the failure here?

Seeing the following in the logs, but don't see an existing issue for it.

/datadisks/disk1/work/A3510907/p/build/wasm/WasmApp.Native.targets(296,5): warning WASM0001: Could not get pinvoke, or callbacks for method 'System.Tests.TimeZoneInfoTests+WindowsUILanguageHelper::EnumUILanguages' because 'Parsing function pointer types in signatures is not supported.' [/datadisks/disk1/work/A3510907/w/AE2B09C8/e/publish/ProxyProjectForAOTOnHelix.proj]
emcc : error : '/datadisks/disk1/work/A3510907/p/build/emsdk/upstream/bin/wasm-emscripten-finalize -g --dyncalls-i64 --dwarf /datadisks/disk1/work/A3510907/w/AE2B09C8/e/wasm_build/obj/wasm/for-build/dotnet.wasm -o /datadisks/disk1/work/A3510907/w/AE2B09C8/e/wasm_build/obj/wasm/for-build/dotnet.wasm --detect-features' failed (received SIGKILL (-9)) [/datadisks/disk1/work/A3510907/w/AE2B09C8/e/publish/ProxyProjectForAOTOnHelix.proj]
/datadisks/disk1/work/A3510907/p/build/wasm/WasmApp.Native.targets(431,5): error MSB3073: The command "emcc "@/datadisks/disk1/work/A3510907/p/build/microsoft.netcore.app.runtime.browser-wasm/runtimes/browser-wasm/native/src/emcc-default.rsp" "@/datadisks/disk1/work/A3510907/p/build/microsoft.netcore.app.runtime.browser-wasm/runtimes/browser-wasm/native/src/emcc-link.rsp" "@/datadisks/disk1/work/A3510907/w/AE2B09C8/e/wasm_build/obj/wasm/for-build/emcc-link.rsp"" exited with code 1. [/datadisks/disk1/work/A3510907/w/AE2B09C8/e/publish/ProxyProjectForAOTOnHelix.proj]

It's unexpected that changing matrix would impact the System.Runtime tests and not others.

dakersnar

LGTM at a high level, as most of this seems like a direct copy paste to the new class. Is there anything besides the operator + change that requires a closer look?

tannergooding · 2023-01-06T20:24:17Z

Is there anything besides the operator + change that requires a closer look?

Not particularly, this is all pretty straightforward refactoring + simplification.

radekdoulik · 2023-01-09T14:52:15Z

@radekdoulik would you be able to assist (or ping someone from the WASM team who can) with the failure here?

Seeing the following in the logs, but don't see an existing issue for it.

/datadisks/disk1/work/A3510907/p/build/wasm/WasmApp.Native.targets(296,5): warning WASM0001: Could not get pinvoke, or callbacks for method 'System.Tests.TimeZoneInfoTests+WindowsUILanguageHelper::EnumUILanguages' because 'Parsing function pointer types in signatures is not supported.' [/datadisks/disk1/work/A3510907/w/AE2B09C8/e/publish/ProxyProjectForAOTOnHelix.proj]
emcc : error : '/datadisks/disk1/work/A3510907/p/build/emsdk/upstream/bin/wasm-emscripten-finalize -g --dyncalls-i64 --dwarf /datadisks/disk1/work/A3510907/w/AE2B09C8/e/wasm_build/obj/wasm/for-build/dotnet.wasm -o /datadisks/disk1/work/A3510907/w/AE2B09C8/e/wasm_build/obj/wasm/for-build/dotnet.wasm --detect-features' failed (received SIGKILL (-9)) [/datadisks/disk1/work/A3510907/w/AE2B09C8/e/publish/ProxyProjectForAOTOnHelix.proj]
/datadisks/disk1/work/A3510907/p/build/wasm/WasmApp.Native.targets(431,5): error MSB3073: The command "emcc "@/datadisks/disk1/work/A3510907/p/build/microsoft.netcore.app.runtime.browser-wasm/runtimes/browser-wasm/native/src/emcc-default.rsp" "@/datadisks/disk1/work/A3510907/p/build/microsoft.netcore.app.runtime.browser-wasm/runtimes/browser-wasm/native/src/emcc-link.rsp" "@/datadisks/disk1/work/A3510907/w/AE2B09C8/e/wasm_build/obj/wasm/for-build/emcc-link.rsp"" exited with code 1. [/datadisks/disk1/work/A3510907/w/AE2B09C8/e/publish/ProxyProjectForAOTOnHelix.proj]

It's unexpected that changing matrix would impact the System.Runtime tests and not others.

You can ignore that; it is a known issue: #79874

EgorBo · 2023-01-10T17:52:13Z

Improvements
on Linux-x64:

[Perf] Linux/x64: 78 Improvements on 1/7/2023 12:45:06 AM perf-autofiling-issues#11442

on Windows-x64:

[Perf] Windows/x64: 81 Improvements on 1/7/2023 12:45:06 AM perf-autofiling-issues#11467
[Perf] Windows/x64: 18 Improvements on 1/7/2023 12:45:06 AM perf-autofiling-issues#11480

ghost assigned tannergooding Jan 2, 2023

dotnet-issue-labeler bot added the area-System.Numerics label Jan 2, 2023

tannergooding force-pushed the better-matrix branch 2 times, most recently from 27fb5dc to b8b97a6 Compare January 2, 2023 21:37

tannergooding added 2 commits January 2, 2023 18:58

Rewrite how Matrix3x2 and Matrix4x4 are implemented

b85156c

Fix a bug in lowerxarch related to merging Sse41.Insert chains

51bb6c3

tannergooding force-pushed the better-matrix branch from b8b97a6 to 51bb6c3 Compare January 3, 2023 02:58

This was referenced Jan 3, 2023

emcc received SIGKILL #79874

Closed

Test failure Loader\\classloader\\DictionaryExpansion\\DictionaryExpansion\\DictionaryExpansion.cmd #75244

Closed

Tracking issue for CI build timeouts #76454

Closed

tannergooding marked this pull request as ready for review January 3, 2023 15:37

tannergooding requested a review from dakersnar January 3, 2023 15:37

Merge remote-tracking branch 'dotnet/main' into better-matrix

3432608

Merge remote-tracking branch 'dotnet/main' into better-matrix

b705823

dakersnar approved these changes Jan 6, 2023

tannergooding merged commit f8218f9 into dotnet:main Jan 7, 2023

tannergooding deleted the better-matrix branch January 7, 2023 00:46

This was referenced Jan 10, 2023

[Perf] Windows/x64: 1 Regression on 1/7/2023 12:45:06 AM dotnet/perf-autofiling-issues#11446

Closed

[Perf] Windows/x64: 1 Regression on 1/7/2023 12:45:06 AM dotnet/perf-autofiling-issues#11473

Closed

tannergooding mentioned this pull request Jan 10, 2023

Performance optimization opportunities in common pixel formats. SixLabors/ImageSharp#2232

Closed

4 tasks

lewing mentioned this pull request Jan 17, 2023

[Perf] Linux/x64: 242 Regressions on 1/12/2023 10:41:19 PM dotnet/perf-autofiling-issues#11829

Open

tannergooding mentioned this pull request Jan 31, 2023

Improve the codegen of the vector accelerated System.Numerics.* types #81335

Merged

tannergooding mentioned this pull request Feb 8, 2023

What's new in .NET 8 Preview 1 dotnet/core#8133

Closed

3 tasks

ghost locked as resolved and limited conversation to collaborators Feb 9, 2023

jeffhandley added the blog-candidate Completed PRs that are candidate topics for blog post coverage label Mar 14, 2023
