JIT: initial profile repair #100456
Conversation
Always run a repair pass after incorporating profile data. If that fails to converge, blend in synthetic likelihoods in increasing strengths to enable convergence (giving up some accuracy). Enable block weight consistency checks to run right after profile incorporation. Disable just after that.
@amanasifkhalid FYI This may be a little too drastic....
This is the same method that inspired #100449.
This shows we can have initially consistent profiles (with one exception) but perhaps more TP or CQ/CS impact than is necessary. Because this method perturbs the profile there are several aspects to the TP impact: the cost of the computation, the effects of that profile on the rest of the JIT, and the possibility of additional missed contexts.

I will work up a version that does all the computation but then doesn't actually modify the profile, so I can get a better read on the first part. If the computation is relatively cheap it may make sense to up the solver iteration limit and/or slow down the blend factor growth "schedule" so the repairs are less impactful overall (in other words: perhaps by doing more computation early we can end up with less TP/CS/CQ churn). Note we only expect to have to iterate if there are one or more "rare" factors:
So for the vast majority of methods we should do one pass through the code and be done. That's something I also should measure....
Out of curiosity, are you planning on using the SPMI results to determine if the tradeoff between profile accuracy and profile consistency is worth it here? You mentioned you plan to try to reduce churn, so I might be premature in looking at the diffs, but I see some of the largest size changes are due to loop cloning, so those methods that regressed might be perf wins (and the size improvements might not be ideal?). As for the large number of small regressions that seem to be from block ordering, it doesn't seem easy to tell whether splitting up some fall-through is an improvement (because it allowed us to move a hot block up) or a regression (because the newly-added jump isn't cold enough to pay for itself). Though maybe spending a nontrivial amount of time on these diffs isn't worth it if we're going to change the block layout algorithm soon.
…taining improper loops
Yes, to some extent... I think it might make sense to insist on initial consistency for now, and later, when we see how quickly consistency degrades (or is maintained), we might redo things. Maybe.
Synthesis and reconstruction assume exceptions are rare, so if the actual profile shows significant flow into a throw that can be caught, we won't model this properly. The net result is that some profile weight will vanish, throwing off the entry/exit balance. Tolerate this for now by watching for catchable throws, and disabling the entry/exit balance checks if any are seen. If it ever turns out that serious code has high exception frequency we can reconsider and try to model flow through catches.
/azp run runtime-coreclr libraries-pgo
Azure Pipelines successfully started running 1 pipeline(s).
TP diffs are still kind of high, suspect this is from knock-on effects where the new profile leads to inlining or cloning changes. Number of missed contexts is also up considerably, so the "tp win" on asp.net is likely misleading too. Going to do some local runs where I just run the solver but don't update the counts, to see what that costs.
Can't repro the arm64 failure so far. Getting an isolated TP solver measurement seems tricky... still working on it.
Upping the iteration limit, hoping things converge on one of the earlier passes so we do less blending overall, and so end up with something closer to the initial profile (less TP impact and less churn). Seems like the solver is (relatively) cheap, though I'm not sure I trust my local TP data. Warp heuristic return likelihoods upwards on each retry.
Hmm, looks like a step in the wrong direction, which is odd... need to dig in.
Unless there is a catchable throw.
Looks like many of the new failures were entry/exit balance checks: that is, we try to ensure that the total BBJ_RETURN and select BBJ_THROW weight matches the entry weight. Even if each node locally converges to 0.01 tolerance, a chain of these small mismatches can add up to a greater mismatch. So we now check for the entry/exit residual during solving and keep iterating if it is too high.
Idea that more initial iterations will improve TP and lead to less churn is not holding up so far...
/azp run runtime-coreclr libraries-pgo
Azure Pipelines successfully started running 1 pipeline(s).
Looks better, correctness-wise... just one issue left, perhaps for cases where the initial profile is synthetic. TP/CQ still not where I'd like them to be, but I'm not sure they can be improved much.
/azp run runtime-coreclr jitstress, runtime-coreclr pgostress, runtime-coreclr pgo
Azure Pipelines successfully started running 3 pipeline(s).
/azp run runtime-coreclr libraries-pgo, runtime-coreclr jitstress, runtime-coreclr pgostress, runtime-coreclr pgo
Azure Pipelines successfully started running 4 pipeline(s).
/azp run runtime-coreclr libraries-pgo, runtime-coreclr jitstress, runtime-coreclr pgostress, runtime-coreclr pgo
Azure Pipelines successfully started running 4 pipeline(s).
@amanasifkhalid PTAL. Will have some TP impact and cause some CQ churn. Hoping to get most of the TP back if we can rein in cloning a bit.
LGTM, assuming you get the results you want in CI.
Thanks for all the iterative work on this!
/azp run runtime-coreclr libraries-pgo, runtime-coreclr jitstress, runtime-coreclr pgostress, runtime-coreclr pgo
Azure Pipelines successfully started running 4 pipeline(s).
The extra legs may not be 100% clean, though there was just one (related) failure in my last go round, so it's possible.
libraries-pgo has a few crashes. I don't think they are directly related to this change, though perhaps it is making some existing issue more prominent. I will investigate them separately. There has been a low background rate of these things for a while now.
Always run a repair pass after incorporating profile data. If that fails to converge, blend in synthetic likelihoods in increasing strengths to enable convergence (giving up some accuracy).

Enable block weight consistency checks to run right after profile incorporation. Disable just after that.

We can generally make profile data self-consistent, save for two exceptions:
1. if the flow graph contains an infinite (or effectively infinite) loop
2. if the IL is invalid

Because of (2) we can't immediately assert if the profile is not consistent; we must defer until after we've successfully imported the method (since the importer contains many of the validity checks).
Diffs