-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: profile updates for return merges and tail calls #48773
Conversation
Stop trying to update the common return block profile data during return merging, as it is not yet known which return blocks will become tail calls. Start updating constant return block profile data during return merging as this is when we transform the flow. Update the common return block profile data during return merging in morph (adding more countss) and when creating tail calls (removing counts). Update profile consistency checker to handle switches properly and to use tolerant compares. Add extra dumping when solving for edge weights or adjusting flow edge weights to help track down where errors are coming from. Add new FMT_WT formatting string for profile weights, and start using it in fgprofile. handle constant return merges too
@briansull ptal On my local Tiered-PGO SPMI collection this reduces the number of consistency check failures from 247 to 182 (out of ~2600 compiles). Will post some info on diffs shortly. |
Update FMT_WT to %g so we don't see huge digit strings.
Second commit fixed some spurious diffs w/o pgo. Only see a handful of diffs now in regular collections, all in roslyn:
My local TieredPGO collection has more diffs, but still not extensive...
Most common diff is a return block being placed earlier in the flow as it now has a more representative profile weight. |
Arm64 failure is a CI issue:
|
@@ -2628,6 +2671,9 @@ void flowList::setEdgeWeights(BasicBlock::weight_t theMinWeight, BasicBlock::wei | |||
{ | |||
assert(theMinWeight <= theMaxWeight); | |||
|
|||
JITDUMP("Setting edge weights for BB?? -> " FMT_BB " to [" FMT_WT " .. " FMT_WT "]\n", getBlock()->bbNum, | |||
theMinWeight, theMaxWeight); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is backwards; I'll fix in a subsequent PR (and pass in the actual dest block to remove the BB??
), if that's ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe just merge in my next round of changes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed with latest commit.
Also fix dump output from `setEdgeWeights` and pass in the destination of the edge.
Latest commit fixed another 20 or so checker failures, so number of cases failing with profile checks enabled and asserting is now down to 159. Getting to zero is not realistic as there are some things we won't be able to get right or even get consistent for a while. One example is inlining an unprofiled method with flow into a profiled one. |
Similar Arm64 CI issue as before:
|
Breakdown of when we detect inconsistencies in the remaining 159 failures:
The "other" entry is an assert I need to address here. |
… have their counts updated.
Failures seen above are apparently tracked by #48820 |
@dotnet/jit-contrib anyone up for a review...? |
src/coreclr/jit/morph.cpp
Outdated
|
||
nextBlock->setBBProfileWeight(newNextWeight); | ||
|
||
if (newNextWeight == 0.0f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should all these 0.0f
constants be some ZERO_WEIGHT
define?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the relevant define is BB_ZERO_WEIGHT
. A quick regex suggests it's almost exactly as popular as the 0
literal when used for comparisons with bbWeight
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat ambivalent about the value of having BB_ZERO_WEIGHT
and BB_UNITY_WEIGHT
since going forward we're quite unlikely to ever change them from 0.0f and 1.0f.
But I suppose we should be consistent and perhaps the symbolic form calls more attention to itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed what I could easily find over to BB_ZERO_WEIGHT
.
// converting this tail call to a branch, we'll add appropriate | ||
// successor information then. | ||
fgRemoveBlockAsPred(compCurBB); | ||
// If this block has a flow successor, make suitable updates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if all this new code (or some of it?) should be extracted to a fgUpdateTailcallProfileWeights
or similar?
In fact, I wonder if most/all of the profile weight maintenance should be extracted into functions in fgprofile.cpp instead of spread around the various bits of the code base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think it belongs with the transformation; this is an important part of the transformation and not some additional / optional side work.
In the (hopefully not too distant) future we'll always have profile weights on every block so some of the conditional guarding will go way.
Thanks, Bruce. I think I'm also going to fold the "run rarely" logic into |
Interop IEnumeratorTest on windows arm64 is failing with result -1073740791 aka 0xC0000409 -- stack overflow. Seems quite likely unrelated to my change.... |
Stop trying to update the common return block profile data during return
merging, as it is not yet known which return blocks will become tail calls.
Start updating constant return block profile data during return merging
as this is when we transform the flow.
Update the common return block profile data during return merging in
morph (adding more counts) and when creating tail calls (removing counts).
Update profile consistency checker to handle switches properly and to use
tolerant compares.
Add extra dumping when solving for edge weights or adjusting flow edge
weights to help track down where errors are coming from.
Add new
FMT_WT
formatting string for profile weights, and start using itin
fgprofile.cpp
.