Move arithmetic DivideByZero and Overflow exceptions to JIT on 32-bit platforms #110226

Open
am11 opened this issue Nov 27, 2024 · 14 comments
Labels
arch-arm32 arch-x86 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

@am11
Member

am11 commented Nov 27, 2024

From @jkotas #109087 (comment)

> The extra call frame introduced by the changes in this PR is where the overhead is.
>
> main: managed method -> FCall (with argument checks that require HMF) -> internal C-runtime div helper
>
> PR (notice the extra frame): managed method -> managed helper call (argument checks) -> FCall (without argument checks) -> internal C-runtime div helper
>
> Inlined checks: managed method (with argument checks) -> FCall (without argument checks) -> internal C-runtime div helper

On 32-bit platforms such as x86 and arm32, the JIT uses the software fallback helpers CORINFO_HELP_{,U,L,UL}{DIV,MOD} for div/mod arithmetic ops. These helpers raise the DivideByZero and Overflow exceptions via FCThrow in jithelpers.cpp. FCThrow uses HELPER_METHOD_FRAME, which we are trying to remove from the runtime (#95695).

The JIT can instead insert these checks itself on 32-bit platforms, using the existing CORINFO_HELP_OVERFLOW and CORINFO_HELP_THROWDIVZERO helpers. This will remove 8 of the 9 remaining FCThrow calls from jithelpers.cpp.
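
As a rough illustration of the proposal for a signed 64-bit division, the sketch below is written in C++ rather than JIT IR. `throw_div_zero` and `throw_overflow` are hypothetical stand-ins for CORINFO_HELP_THROWDIVZERO and CORINFO_HELP_OVERFLOW, and `raw_ldiv` is a hypothetical check-free version of the LDIV helper. None of these names exist in the runtime, and the real JIT would emit the checks as control flow around the helper call.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical stand-ins for CORINFO_HELP_THROWDIVZERO and CORINFO_HELP_OVERFLOW.
[[noreturn]] inline void throw_div_zero() { throw std::domain_error("DivideByZeroException"); }
[[noreturn]] inline void throw_overflow() { throw std::overflow_error("OverflowException"); }

// Hypothetical check-free helper: what the LDIV helper could shrink to
// once the argument checks live in the caller.
inline int64_t raw_ldiv(int64_t dividend, int64_t divisor) { return dividend / divisor; }

// Sketch of the guards the JIT would conceptually place before the helper call.
inline int64_t checked_ldiv(int64_t dividend, int64_t divisor) {
    if (divisor == 0) {
        throw_div_zero();
    }
    // INT64_MIN / -1 is the only signed division whose result does not fit.
    if (divisor == -1 && dividend == std::numeric_limits<int64_t>::min()) {
        throw_overflow();
    }
    return raw_ldiv(dividend, divisor);
}
```

With the guards in the caller, the helper itself no longer needs a throwing path, which is what would make the FCThrow calls removable.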

@am11 am11 added arch-arm32 arch-x86 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Nov 27, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Nov 27, 2024
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jkotas
Member

jkotas commented Nov 27, 2024

@EgorBo Any chance you can take a look at this?

This is contributing to our larger runtime simplification and unification efforts.

@jkotas jkotas added this to the 10.0.0 milestone Nov 27, 2024
@EgorBo EgorBo self-assigned this Nov 27, 2024
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Nov 27, 2024
@EgorBo
Member

EgorBo commented Nov 27, 2024

> @EgorBo Any chance you can take a look at this?
>
> This is contributing to our larger runtime simplification and unification efforts.

Can we just introduce managed helpers? (with qcalls into native helpers for actual division, or maybe even fully managed).

@MichalPetryka
Contributor

> > @EgorBo Any chance you can take a look at this?
> > This is contributing to our larger runtime simplification and unification efforts.
>
> Can we just introduce managed helpers? (with qcalls into native helpers for actual division, or maybe even fully managed).

That's what #109087 tried.

@am11
Member Author

am11 commented Nov 27, 2024

@EgorBo I was wondering if we can handle it in JIT itself, like we are checking the conditions here:

https://github.com/dotnet/runtime/blob/afdd68bf7cbd3128e1a86def2d5f9789591a1d2d/src/coreclr/jit/gentree.cpp#L16104-L16110

@EgorBo
Member

EgorBo commented Nov 27, 2024

> That's what #109087 tried.

Ah, short memory

> @EgorBo I was wondering if we can handle it in JIT itself, like we are checking the conditions here:

We can definitely do this in the JIT; it just turns out not to be the single-line change I had hoped for 🙂

@EgorBo
Member

EgorBo commented Nov 28, 2024

@jkotas @am11 it looks like we need to emit quite a bit of control-flow to re-create what these conditions do https://github.com/am11/runtime/blob/0e61d9d6ca5c312443edf175e7d7b6be72077c00/src/libraries/System.Private.CoreLib/src/System/Math.DivModInt.cs#L55-L78 (4 different helpers + extra helpers on arm32) and there is little we can re-use from existing arm64 impl.

After discussing this with @jakobbotsch, we wonder if we can:

  1. Inline managed helpers in JIT (thus, some conditions could be folded, or the fast path (32-bit division) could be performed without call overhead)
  2. Replace FCall with a direct pinvoke (without gc transition) to libc/whatever native helper we end up using?

Also, I've kicked off a benchmark on 32-bit Windows to see the performance impact of am11's implementation: #109087 (comment)
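
To make the "fast path (32-bit division)" in option 1 concrete, here is a hedged C++ sketch of how a 64-bit signed division can fall back to a 32-bit divide when both operands fit. The function names are hypothetical, and the exact conditions in the linked Math.DivModInt.cs may differ; this only shows why the checks involve several branches.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical predicate: true when the value is representable as int32_t.
inline bool fits_in_int32(int64_t v) {
    return v >= std::numeric_limits<int32_t>::min() &&
           v <= std::numeric_limits<int32_t>::max();
}

inline int64_t ldiv_with_fast_path(int64_t dividend, int64_t divisor) {
    if (fits_in_int32(divisor)) {
        const int32_t d = static_cast<int32_t>(divisor);
        if (d == 0) {
            throw std::domain_error("DivideByZeroException");
        }
        if (d == -1) {
            if (dividend == std::numeric_limits<int64_t>::min()) {
                throw std::overflow_error("OverflowException");
            }
            // Handled in 64 bits so that INT32_MIN / -1 yields 2147483648.
            return -dividend;
        }
        if (fits_in_int32(dividend)) {
            // Both operands fit and d is neither 0 nor -1, so a plain
            // 32-bit divide is safe and needs no helper call.
            return static_cast<int32_t>(dividend) / d;
        }
    }
    // Slow path: stands in for the call to the native 64-bit division helper.
    return dividend / divisor;
}
```

The divisor is tested first because the zero and -1 cases must be rejected before any divide is issued. When the JIT can see that an operand is a constant or already 32-bit, some of these branches could fold away.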

@jkotas
Member

jkotas commented Nov 28, 2024

> Inline managed helpers in JIT (thus, some conditions could be folded, or the fast path (32-bit division) could be performed without call overhead)

Do you mean to build a general-purpose feature to inline JIT helpers that happen to be written in C#?

> Replace FCall with a direct pinvoke (without gc transition) to libc/whatever native helper we end up using?

It would require depending on undocumented C runtime details, which is problematic. I do not think we need it to avoid regressions. We are paying for the FCall wrapper today. We do not need to be doing extra work to improve x86.

> Also, I've kicked off a benchmark on 32-bit Windows to see the performance impact of am11's implementation

It shows a 1.7x regression. That is more than the earlier measurements, which showed about a 30% regression.

@EgorBo
Member

EgorBo commented Nov 28, 2024

> It shows a 1.7x regression. It is more than the #109087 (comment) that showed about 30% regression.

It looks like the codegen for the C# version is not perfect (e.g. Is32bitSigned is not inlined). Not sure we want to invest in peepholes for 32-bit, though.

> Do you mean to build a general-purpose feature to inline JIT helpers that happen to be written in C#?

Yep, I think we wanted that elsewhere too

Overall, I don't have a strong opinion on which path to choose, but it looks like all of them involve quite a bit of work:

  1. Emit jumps/conditions in the JIT emitter. Unlike arm64, these involve a bit more code, e.g. here is the codegen for x86 in C++: https://godbolt.org/z/zEzGKjfEr The downside of this approach is that we need to repeat the logic for arm32 as well. Also, not sure, perhaps RISC-V/LA64 use these helpers too?
  2. Emit conditions in some cross-platform manner in the JIT, e.g. similar to static-ctor-expansion (this actually involves even more effort)
  3. Optimize managed implementations + 32bit-specific peepholes.

Probably the 1st one is indeed the simplest (if we also avoid smart optimizations along the lines of "is it actually a 32-bit value").

@am11
Member Author

am11 commented Nov 28, 2024

> perhaps RISC-V/LA64 use these helpers too?

Div/Mod helpers in jithelpers.cpp are only called on 32-bit platforms. RV64 and LA64 seem to have slightly different morphing:

#if defined(TARGET_ARM64) || defined(TARGET_LOONGARCH64) || defined(TARGET_RISCV64)

@jkotas
Member

jkotas commented Nov 29, 2024

Looks like you settled on inlining the helpers as IL. Sounds good to me!

@EgorBo
Member

EgorBo commented Nov 29, 2024

> Looks like you settled on inlining the helpers as IL. Sounds good to me!

To be fair, that is an unrelated effort 🙂 It's just that we wanted it for several things in the past.

In this case, all 3 options have pros/cons. Helpers will inline control-flow for all DIV operations early and we won't be able to hoist it, etc.

@jkotas
Member

jkotas commented Nov 30, 2024

> Helpers will inline control-flow for all DIV operations early and we won't be able to hoist it/etc.

I do not see it as a problem for the DIV/MOD helpers on 32-bit platforms. These operations have side effects (they can throw exceptions). I would expect the chances of them being hoisted in real-world code to be fairly low.

It may not be desirable for other types of helpers.
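
One way to see why a throwing division is hard to hoist is the C++ sketch below. The functions are hypothetical and only model the semantics; they are not JIT code. Moving the loop-invariant division out of the loop changes behavior when the loop runs zero times.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical model of a division that can throw, as DIV does in managed code.
inline int64_t throwing_div(int64_t a, int64_t b) {
    if (b == 0) {
        throw std::domain_error("DivideByZeroException");
    }
    if (b == -1 && a == std::numeric_limits<int64_t>::min()) {
        throw std::overflow_error("OverflowException");
    }
    return a / b;
}

// Division stays inside the loop: an empty input never divides, so never throws.
inline int64_t sum_scaled(const std::vector<int64_t>& values, int64_t total, int64_t parts) {
    int64_t sum = 0;
    for (int64_t v : values) {
        sum += v * throwing_div(total, parts);
    }
    return sum;
}

// Naively hoisted variant: the quotient is computed up front, so this throws
// for parts == 0 even when the loop body would never have executed.
inline int64_t sum_scaled_hoisted(const std::vector<int64_t>& values, int64_t total, int64_t parts) {
    const int64_t quotient = throwing_div(total, parts);
    int64_t sum = 0;
    for (int64_t v : values) {
        sum += v * quotient;
    }
    return sum;
}
```

For an empty `values` and `parts == 0`, the first function returns 0 and the second throws. A compiler can only hoist such a division when it can prove the loop executes at least once or that the divisor is safe.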

@am11
Member Author

am11 commented Dec 2, 2024

> 2. Emit conditions in some cross-platform manner in JIT, e.g. similar to static-ctor-expansion (involves even more efforts actually)

@EgorBo, I think this is the simplest option. The JIT is already injecting DivByZero and Overflow exceptions in some cases (CORINFO_HELP_THROWDIVZERO, CORINFO_HELP_OVERFLOW). After the checks, it can call the existing CORINFO_HELP_{,U,L,UL}{DIV,MOD} helpers, and we can remove the (now-redundant) FCThrow() calls from jithelpers.cpp, keeping only the `a / b` and `a % b` lines.
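
A hedged C++ sketch of how this split could look for the unsigned and remainder variants follows. The `raw_*` functions are hypothetical models of helpers reduced to a single `a / b` or `a % b` line, and the `guarded_*` functions model the checks the JIT would inject before calling them. Treating `INT64_MIN % -1` as an overflow is an assumption of this sketch, not something stated in this thread.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical check-free helpers: only the a / b and a % b lines remain.
inline uint64_t raw_uldiv(uint64_t a, uint64_t b) { return a / b; }
inline uint64_t raw_ulmod(uint64_t a, uint64_t b) { return a % b; }
inline int64_t raw_lmod(int64_t a, int64_t b) { return a % b; }

// Unsigned operations cannot overflow, so a zero check is the only guard.
inline uint64_t guarded_uldiv(uint64_t a, uint64_t b) {
    if (b == 0) throw std::domain_error("DivideByZeroException");
    return raw_uldiv(a, b);
}

inline uint64_t guarded_ulmod(uint64_t a, uint64_t b) {
    if (b == 0) throw std::domain_error("DivideByZeroException");
    return raw_ulmod(a, b);
}

// Signed remainder needs the zero check plus a guard for INT64_MIN % -1,
// which is undefined behavior in C++.
// ASSUMPTION: this sketch reports that case as an overflow.
inline int64_t guarded_lmod(int64_t a, int64_t b) {
    if (b == 0) throw std::domain_error("DivideByZeroException");
    if (b == -1 && a == std::numeric_limits<int64_t>::min()) {
        throw std::overflow_error("OverflowException");
    }
    return raw_lmod(a, b);
}
```

The unsigned variants need one branch each, while the signed ones need two.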
