Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Use successor edges instead of block targets for BBJ_EHFINALLYRET #98563

Merged
merged 6 commits into from
Feb 19, 2024

Conversation

amanasifkhalid
Copy link
Member

Part of #93020. This is a first step in a broader effort to replace BasicBlock's jump targets (bbTarget, bbFalseTarget, etc.) with successor flow edges. This PR also adds overloaded implementations for commonly used edge maintenance methods (fgReplacePred, fgRemoveRefPred, etc.) that can take advantage of the cheap access to successor edges (i.e. without calling fgGetPredForBlock -- hopefully we'll be able to get rid of that at some point).

@ghost ghost assigned amanasifkhalid Feb 16, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 16, 2024
@ghost
Copy link

ghost commented Feb 16, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Part of #93020. This is a first step in a broader effort to replace BasicBlock's jump targets (bbTarget, bbFalseTarget, etc.) with successor flow edges. This PR also adds overloaded implementations for commonly used edge maintenance methods (fgReplacePred, fgRemoveRefPred, etc.) that can take advantage of the cheap access to successor edges (i.e. without calling fgGetPredForBlock -- hopefully we'll be able to get rid of that at some point).

Author: amanasifkhalid
Assignees: amanasifkhalid
Labels:

area-CodeGen-coreclr

Milestone: -

@ryujit-bot
Copy link

Diff results for #98563

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

Overall (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch +0.03%
benchmarks.run_pgo.linux.arm64.checked.mch +0.03%
benchmarks.run_tiered.linux.arm64.checked.mch +0.03%
coreclr_tests.run.linux.arm64.checked.mch +0.02%
libraries.crossgen2.linux.arm64.checked.mch +0.03%
libraries.pmi.linux.arm64.checked.mch +0.03%
libraries_tests.run.linux.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch +0.03%
realworld.run.linux.arm64.checked.mch +0.03%
smoke_tests.nativeaot.linux.arm64.checked.mch +0.03%
MinOpts (+0.02% to +0.06%)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch +0.03%
benchmarks.run_pgo.linux.arm64.checked.mch +0.03%
benchmarks.run_tiered.linux.arm64.checked.mch +0.03%
coreclr_tests.run.linux.arm64.checked.mch +0.02%
libraries.crossgen2.linux.arm64.checked.mch +0.04%
libraries.pmi.linux.arm64.checked.mch +0.03%
libraries_tests.run.linux.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch +0.03%
realworld.run.linux.arm64.checked.mch +0.06%
smoke_tests.nativeaot.linux.arm64.checked.mch +0.03%
FullOpts (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch +0.03%
benchmarks.run_pgo.linux.arm64.checked.mch +0.03%
benchmarks.run_tiered.linux.arm64.checked.mch +0.03%
coreclr_tests.run.linux.arm64.checked.mch +0.03%
libraries.crossgen2.linux.arm64.checked.mch +0.03%
libraries.pmi.linux.arm64.checked.mch +0.03%
libraries_tests.run.linux.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch +0.03%
realworld.run.linux.arm64.checked.mch +0.03%
smoke_tests.nativeaot.linux.arm64.checked.mch +0.03%

Throughput diffs for linux/x64 ran on windows/x64

Overall (+0.03% to +0.04%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch +0.03%
benchmarks.run_pgo.linux.x64.checked.mch +0.03%
benchmarks.run_tiered.linux.x64.checked.mch +0.03%
coreclr_tests.run.linux.x64.checked.mch +0.03%
libraries.crossgen2.linux.x64.checked.mch +0.03%
libraries.pmi.linux.x64.checked.mch +0.03%
libraries_tests.run.linux.x64.Release.mch +0.03%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch +0.03%
realworld.run.linux.x64.checked.mch +0.03%
smoke_tests.nativeaot.linux.x64.checked.mch +0.04%
MinOpts (+0.02% to +0.09%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch +0.03%
benchmarks.run_pgo.linux.x64.checked.mch +0.03%
benchmarks.run_tiered.linux.x64.checked.mch +0.03%
coreclr_tests.run.linux.x64.checked.mch +0.02%
libraries.crossgen2.linux.x64.checked.mch +0.05%
libraries.pmi.linux.x64.checked.mch +0.04%
libraries_tests.run.linux.x64.Release.mch +0.03%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch +0.03%
realworld.run.linux.x64.checked.mch +0.09%
smoke_tests.nativeaot.linux.x64.checked.mch +0.04%
FullOpts (+0.03% to +0.04%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch +0.03%
benchmarks.run_pgo.linux.x64.checked.mch +0.04%
benchmarks.run_tiered.linux.x64.checked.mch +0.04%
coreclr_tests.run.linux.x64.checked.mch +0.03%
libraries.crossgen2.linux.x64.checked.mch +0.03%
libraries.pmi.linux.x64.checked.mch +0.03%
libraries_tests.run.linux.x64.Release.mch +0.03%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch +0.03%
realworld.run.linux.x64.checked.mch +0.03%
smoke_tests.nativeaot.linux.x64.checked.mch +0.04%

Throughput diffs for osx/arm64 ran on windows/x64

Overall (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.osx.arm64.checked.mch +0.03%
benchmarks.run_pgo.osx.arm64.checked.mch +0.03%
benchmarks.run_tiered.osx.arm64.checked.mch +0.03%
coreclr_tests.run.osx.arm64.checked.mch +0.02%
libraries.crossgen2.osx.arm64.checked.mch +0.03%
libraries.pmi.osx.arm64.checked.mch +0.03%
libraries_tests.run.osx.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch +0.03%
realworld.run.osx.arm64.checked.mch +0.03%
MinOpts (+0.02% to +0.07%)
Collection PDIFF
benchmarks.run.osx.arm64.checked.mch +0.03%
benchmarks.run_pgo.osx.arm64.checked.mch +0.03%
benchmarks.run_tiered.osx.arm64.checked.mch +0.03%
coreclr_tests.run.osx.arm64.checked.mch +0.02%
libraries.crossgen2.osx.arm64.checked.mch +0.04%
libraries.pmi.osx.arm64.checked.mch +0.03%
libraries_tests.run.osx.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch +0.03%
realworld.run.osx.arm64.checked.mch +0.07%
FullOpts (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.osx.arm64.checked.mch +0.03%
benchmarks.run_pgo.osx.arm64.checked.mch +0.03%
benchmarks.run_tiered.osx.arm64.checked.mch +0.03%
coreclr_tests.run.osx.arm64.checked.mch +0.03%
libraries.crossgen2.osx.arm64.checked.mch +0.03%
libraries.pmi.osx.arm64.checked.mch +0.03%
libraries_tests.run.osx.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch +0.03%
realworld.run.osx.arm64.checked.mch +0.03%

Throughput diffs for windows/arm64 ran on windows/x64

Overall (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch +0.03%
benchmarks.run_pgo.windows.arm64.checked.mch +0.03%
benchmarks.run_tiered.windows.arm64.checked.mch +0.03%
coreclr_tests.run.windows.arm64.checked.mch +0.02%
libraries.crossgen2.windows.arm64.checked.mch +0.03%
libraries.pmi.windows.arm64.checked.mch +0.03%
libraries_tests.run.windows.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch +0.03%
realworld.run.windows.arm64.checked.mch +0.03%
smoke_tests.nativeaot.windows.arm64.checked.mch +0.03%
MinOpts (+0.02% to +0.08%)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch +0.03%
benchmarks.run_pgo.windows.arm64.checked.mch +0.03%
benchmarks.run_tiered.windows.arm64.checked.mch +0.03%
coreclr_tests.run.windows.arm64.checked.mch +0.02%
libraries.crossgen2.windows.arm64.checked.mch +0.04%
libraries.pmi.windows.arm64.checked.mch +0.03%
libraries_tests.run.windows.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch +0.03%
realworld.run.windows.arm64.checked.mch +0.08%
smoke_tests.nativeaot.windows.arm64.checked.mch +0.03%
FullOpts (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch +0.03%
benchmarks.run_pgo.windows.arm64.checked.mch +0.03%
benchmarks.run_tiered.windows.arm64.checked.mch +0.03%
coreclr_tests.run.windows.arm64.checked.mch +0.02%
libraries.crossgen2.windows.arm64.checked.mch +0.03%
libraries.pmi.windows.arm64.checked.mch +0.03%
libraries_tests.run.windows.arm64.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch +0.03%
realworld.run.windows.arm64.checked.mch +0.03%
smoke_tests.nativeaot.windows.arm64.checked.mch +0.03%

Throughput diffs for windows/x64 ran on windows/x64

Overall (+0.03% to +0.04%)
Collection PDIFF
benchmarks.run.windows.x64.checked.mch +0.03%
benchmarks.run_pgo.windows.x64.checked.mch +0.03%
benchmarks.run_tiered.windows.x64.checked.mch +0.04%
coreclr_tests.run.windows.x64.checked.mch +0.03%
libraries.crossgen2.windows.x64.checked.mch +0.03%
libraries.pmi.windows.x64.checked.mch +0.03%
libraries_tests.run.windows.x64.Release.mch +0.03%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch +0.03%
realworld.run.windows.x64.checked.mch +0.03%
smoke_tests.nativeaot.windows.x64.checked.mch +0.04%
MinOpts (+0.02% to +0.09%)
Collection PDIFF
benchmarks.run.windows.x64.checked.mch +0.04%
benchmarks.run_pgo.windows.x64.checked.mch +0.04%
benchmarks.run_tiered.windows.x64.checked.mch +0.04%
coreclr_tests.run.windows.x64.checked.mch +0.02%
libraries.crossgen2.windows.x64.checked.mch +0.05%
libraries.pmi.windows.x64.checked.mch +0.04%
libraries_tests.run.windows.x64.Release.mch +0.03%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch +0.04%
realworld.run.windows.x64.checked.mch +0.09%
smoke_tests.nativeaot.windows.x64.checked.mch +0.04%
FullOpts (+0.03% to +0.04%)
Collection PDIFF
benchmarks.run.windows.x64.checked.mch +0.03%
benchmarks.run_pgo.windows.x64.checked.mch +0.03%
benchmarks.run_tiered.windows.x64.checked.mch +0.04%
coreclr_tests.run.windows.x64.checked.mch +0.03%
libraries.crossgen2.windows.x64.checked.mch +0.03%
libraries.pmi.windows.x64.checked.mch +0.03%
libraries_tests.run.windows.x64.Release.mch +0.03%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch +0.03%
realworld.run.windows.x64.checked.mch +0.03%
smoke_tests.nativeaot.windows.x64.checked.mch +0.04%

Details here


Throughput diffs for windows/x86 ran on linux/x86

Overall (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch +0.03%
benchmarks.run_pgo.windows.x86.checked.mch +0.03%
benchmarks.run_tiered.windows.x86.checked.mch +0.03%
coreclr_tests.run.windows.x86.checked.mch +0.02%
libraries.crossgen2.windows.x86.checked.mch +0.02%
libraries.pmi.windows.x86.checked.mch +0.03%
libraries_tests.run.windows.x86.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch +0.03%
realworld.run.windows.x86.checked.mch +0.02%
MinOpts (+0.02% to +0.09%)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch +0.03%
benchmarks.run_pgo.windows.x86.checked.mch +0.03%
benchmarks.run_tiered.windows.x86.checked.mch +0.03%
coreclr_tests.run.windows.x86.checked.mch +0.02%
libraries.crossgen2.windows.x86.checked.mch +0.03%
libraries.pmi.windows.x86.checked.mch +0.04%
libraries_tests.run.windows.x86.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch +0.03%
realworld.run.windows.x86.checked.mch +0.09%
FullOpts (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch +0.03%
benchmarks.run_pgo.windows.x86.checked.mch +0.03%
benchmarks.run_tiered.windows.x86.checked.mch +0.03%
coreclr_tests.run.windows.x86.checked.mch +0.02%
libraries.crossgen2.windows.x86.checked.mch +0.02%
libraries.pmi.windows.x86.checked.mch +0.03%
libraries_tests.run.windows.x86.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch +0.03%
realworld.run.windows.x86.checked.mch +0.02%

Details here


@amanasifkhalid
Copy link
Member Author

amanasifkhalid commented Feb 16, 2024

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. No diffs, though there is some TP impact -- BBJ_EHFINALLYRET isn't the most common block kind, so I'm guessing most of this cost is coming from the changes to the block successor iterators.

CI failure is #98500.

@ryujit-bot
Copy link

Diff results for #98563

Throughput diffs

Throughput diffs for linux/arm ran on windows/x86

Overall (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch +0.02%
benchmarks.run_pgo.linux.arm.checked.mch +0.03%
benchmarks.run_tiered.linux.arm.checked.mch +0.02%
coreclr_tests.run.linux.arm.checked.mch +0.02%
libraries.crossgen2.linux.arm.checked.mch +0.02%
libraries.pmi.linux.arm.checked.mch +0.02%
libraries_tests.run.linux.arm.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch +0.03%
realworld.run.linux.arm.checked.mch +0.03%
MinOpts (+0.02% to +0.06%)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch +0.02%
benchmarks.run_pgo.linux.arm.checked.mch +0.02%
benchmarks.run_tiered.linux.arm.checked.mch +0.02%
coreclr_tests.run.linux.arm.checked.mch +0.02%
libraries.crossgen2.linux.arm.checked.mch +0.03%
libraries.pmi.linux.arm.checked.mch +0.03%
libraries_tests.run.linux.arm.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch +0.02%
realworld.run.linux.arm.checked.mch +0.06%
FullOpts (+0.02% to +0.03%)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch +0.02%
benchmarks.run_pgo.linux.arm.checked.mch +0.03%
benchmarks.run_tiered.linux.arm.checked.mch +0.02%
coreclr_tests.run.linux.arm.checked.mch +0.02%
libraries.crossgen2.linux.arm.checked.mch +0.02%
libraries.pmi.linux.arm.checked.mch +0.02%
libraries_tests.run.linux.arm.Release.mch +0.02%
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch +0.03%
realworld.run.linux.arm.checked.mch +0.03%

Details here


Throughput diffs for linux/arm64 ran on linux/x64

MinOpts (+0.00% to +0.01%)
Collection PDIFF
realworld.run.linux.arm64.checked.mch +0.01%

Throughput diffs for linux/x64 ran on linux/x64

MinOpts (+0.00% to +0.01%)
Collection PDIFF
realworld.run.linux.x64.checked.mch +0.01%

Details here


Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the right direction to go in. Will be interesting to see what complications ensue when you get to more "popular" block kinds.

bool found = false;

// Walk the successor table looking for the specified successor block.
// Search succTab for succEdge so we can splice it out of the table.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know how frequently we remove a successor like this?

If it happens somewhat frequently, I wonder if we'd be better off allowing there to be voids in the array and just skipping over them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how often we encounter this, though I imagine this bit of logic isn't cheap TP-wise. There are quite a few places where we, for example, convert a BBJ_ALWAYS to a BBJ_COND (and vice versa), and we have to remove the current successor edge(s) from the successor block's pred list (and then add in new edges to the pred list -- it would be nice to see just how much #93636 helps with this). I can take a look at this locally and see how helpful it would be to skip this maintenance.

//
BasicBlock* succBlock = edge->getDestinationBlock();
assert(succBlock != nullptr);
succBlock->ensurePredListOrder(this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking of pred list order, we might want to revisit #93636 sometime.

@amanasifkhalid
Copy link
Member Author

CI failure is #98500.

@amanasifkhalid amanasifkhalid merged commit 5e1caf1 into dotnet:main Feb 19, 2024
127 of 129 checks passed
@amanasifkhalid amanasifkhalid deleted the ehf-succ branch February 19, 2024 15:28
@github-actions github-actions bot locked and limited conversation to collaborators Mar 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants