-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Conversation
Wasn't |
Yes, but it is not used for the HW intrinsic vectors, and even for the other SIMD types, it is not set if the vector is only copied or passed/returned. |
Hmm, are the |
Ah, I missed that. Yes, I think they will now be unnecessary, though one could consider that an optimization of sorts, since it will bypass the second check in the struct promotion test. |
714c47e
to
0197694
Compare
Might it be worthwhile to enforce we set the flag and |
I'm not sure what flag or "other code path" you're referring to? |
I meant keeping But, that wouldn't work for the cases where someone uses |
@dotnet-bot test Windows_NT x86 min_opt ryujit CoreCLR Perf Tests Correctness please |
The hardware vector types: `Vector64<T>`, `Vector128<T>` and `Vector256<T>` are declared as having one or more fields of `ulong`. However, the JIT shouldn't be promoting these fields to local variables. It is almost never the right type, and the intrinsics in any case are not designed to cooperate with promoted fields (i.e. an index of a `Vector<ulong>` won't map to the promoted lclVar). Most importantly, it causes all copies of the vector to be done as 64-bit integer loads and stores. Finally, it will be important, as we support vector ABIs, to distinguish the handling of the fixed-size vectors (`Vector2`, `Vector3` and `Vector4`) which *are* considered to be normal structs of N floats, from the opaque types which will be passed in vector registers.
0197694
to
f06134e
Compare
@dotnet/jit-contrib PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this fix a GitHub issue?
No, I don't believe so. I discovered this when I was seeing unexpected (poor) codegen while working on #21304. I should probably check some of the struct-related issues to see if any of them are impacted by this. I have had some system issues so haven't been able to run full diffs. On linux, I get 113 bytes (-0.002%) reduction in SPC.dll. There are 3 methods (actually 3 instantiations of the same method) that regressed:
Clearly there's room for improved heuristics such that we could treat these "as if" they are promoted to individual fields of the base type if merited. But the current approach is quite pessimizing. |
OK, I finally had a chance to run diffs on Windows (full frameworks and tests this time), and spot-check the results:
Overall, the reductions are places where we eliminated struct copies of arguments, and replaced pairs of 16-byte moves from memory with 32-byte moves from memory or register. The regressions are variations on:
|
- Enable CSE of struct values when handle is available (and add code to get the handle of HW SIMD types) - Don't require block nodes for SIMD assignments - Don't set `GTF_GLOB_REF` on `GT_OBJ` if it is local - Set `lvRegStruct` on promoted SIMD fields - Add tests for #19910 (fixed with this PR) and #3539 & #19910 (fixed with dotnet#21314) - Additional cleanup
- Enable CSE of struct values when handle is available (and add code to get the handle of HW SIMD types) - Don't require block nodes for SIMD assignments - Don't set `GTF_GLOB_REF` on `GT_OBJ` if it is local - Set `lvRegStruct` on promoted SIMD fields - Add tests for #19910 (fixed with this PR) and #3539 & #19438 (fixed with dotnet#21314) - Additional cleanup Fix #19910
- Enable CSE of struct values when handle is available (and add code to get the handle of HW SIMD types) - Don't require block nodes for SIMD assignments - Don't set `GTF_GLOB_REF` on `GT_OBJ` if it is local - Set `lvRegStruct` on promoted SIMD fields - Add tests for #19910 (fixed with this PR) and #3539 & #19438 (fixed with dotnet#21314) - Additional cleanup Fix #19910
- Enable CSE of struct values when handle is available (and add code to get the handle of HW SIMD types) - Don't require block nodes for SIMD assignments - Don't set `GTF_GLOB_REF` on `GT_OBJ` if it is local - Set `lvRegStruct` on promoted SIMD fields - Add tests for #19910 (fixed with this PR) and #3539 & #19438 (fixed with dotnet#21314) - Additional cleanup Fix #19910
* [WIP] Struct & SIMD improvements - Enable CSE of struct values when handle is available (and add code to get the handle of HW SIMD types) - Don't require block nodes for SIMD assignments - Don't set `GTF_GLOB_REF` on `GT_OBJ` if it is local - Set `lvRegStruct` on promoted SIMD fields - Add tests for #19910 (fixed with this PR) and #3539 & #19438 (fixed with #21314) - Additional cleanup Fix #19910
…ector Don't struct-promote opaque vectors Commit migrated from dotnet/coreclr@61da68e
* [WIP] Struct & SIMD improvements - Enable CSE of struct values when handle is available (and add code to get the handle of HW SIMD types) - Don't require block nodes for SIMD assignments - Don't set `GTF_GLOB_REF` on `GT_OBJ` if it is local - Set `lvRegStruct` on promoted SIMD fields - Add tests for dotnet/coreclr#19910 (fixed with this PR) and dotnet/coreclr#3539 & dotnet/coreclr#19438 (fixed with dotnet/coreclr#21314) - Additional cleanup Fix dotnet/coreclr#19910 Commit migrated from dotnet/coreclr@3d4a1d5
The hardware vector types:
Vector64<T>
,Vector128<T>
andVector256<T>
are declaredas having one or more fields of
ulong
. However, the JIT shouldn't be promoting thesefields to local variables. It is almost never the right type, and the intrinsics in any
case are not designed to cooperate with promoted fields (i.e. an index of a
Vector<ulong>
won't map to the promoted lclVar).
Most importantly, it causes all copies of the vector to be done as 64-bit integer loads
and stores.
Finally, it will be important, as we support vector ABIs, to distinguish the handling of
the fixed-size vectors (
Vector2
,Vector3
andVector4
) which are considered to benormal structs of N floats, from the opaque types which will be passed in vector registers.