-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
change how MIR inlining handles cycles #43542
Comments
cc @matthewhammer, who may take this on |
What I came up with previously was not caching the result of queries which observed a cycle via |
Above, @nikomatsakis outlines two problems (the way that I view it now, at least)
I've talked to @nikomatsakis a couple of times about this since he created this issue. In my view, the first problem and the second problem can be decoupled, where we solve the first problem without necessarily eschewing the query framework's cycle detection mechanism. (Either way would probably work, I think, and I don't have strong feelings about the choice). In particular, I recently proposed a solution to the first problem that would require knowing some SCC information, and would use that to determine inline decisions at callsites, in place of (solely) using the cycle detection mechanism. Restating the problemI'd rephrase the first problem above as follows: The current inlining logic is "asymmetric", in that as it makes inlining decisions, it favors one root for each SCC, rather than favoring each SCC root equally. My proposal for inlining assumes that we want to specialize each SCC of "inline" nodes for each root of this SCC. This may or may not be what we actually want, since it will lead to lots of code specialization (maybe good, maybe not). Tiny exampleTo illustrate the idea, suppose we have an SCC with two inline functions, The current approach has the issue that either I'd also offer the term "asymmetric", in the sense that every root of the SCC should get its own "view" of the SCC, rather than choosing a single root (as determined by the larger compilation context, with its various queries and their dependencies). Specialize MIR for each SCC rootMy proposal addresses this asymmetry issue by assuming that we want to specialize each SCC of "inline" nodes for each root of this SCC. In our running example, we'd generate two versions of the single SCC involving And in general, the per-SCC-root approach is as follows: For each SCC, for each root, we generate one inlined MIR; so during inlining any MIR, have a root function that gives the call edges of the SCC a "direction" (e.g., via a topological sort from this root), allowing us to distinguish the “backward” call edges which are not inlined from the (inlined) “forward” call edges. In general, for an SCC of Include SCC root among query keyMy current thinking is that the inlining approach above (the "per-SCC-root approach") can be implemented in two different ways. In both ways, we key the queries for inlined output by at least two DefIDs:
That is to say, we compute inlined output in reference to a "root" inline function of the inlined code. This additional root parameter addresses the asymmetry issue, as explained above in the example. In sum, for each root we get a different view of the SCC. Given this definition for the (rooted) queries, I see two implementation strategies:
|
My main questions for @nikomatsakis and others:
|
This is an interesting strategy. It certainly avoids the non-determinism that is the main problem. I don't know if it's a good optimization strategy, though. If inlining potentially duplicates the functions in an SCC once per "root", then that should probably be reflected in the cost metric for inlining. However, I don't see how this can be done in the "go depth-first, use cycle recovery, but include the root in the query key" implementation (since the SCC is never explicitly computed). Furthermore, even if the SCC is available, it seems somewhat tricky to predict what inlining decisions other, separate runs of the inlining pass (starting from a different root) will perform. (This information would presumably be needed for the cost metric.) Of course, each query could run the same analyses that those other invocations would already perform, but that would duplicate a ton of work. Perhaps this is just my lack of imagination speaking, but I fear that going through SCC multiple times, from different starting points, will just make it harder to make good inlining decisions. (Besides, if the SCC is available, there's no particular need to innovate on this. There are other strategies for dealing with SCCs that are more battle-tested.) |
rustc: Start moving toward "try_get is a bug" for incremental This PR is an effort to burn down some of the work items on #42633. The basic change here was to leave the `try_get` function exposed but have it return a `DiagnosticBuilder` instead of a `CycleError`. This means that it should be a compiler bug to *not* handle the error as dropping a diagnostic should result in a complier panic. After that change it was then necessary to update the compiler's callsites of `try_get` to handle the error coming out. These were handled as: * The `sized_constraint` and `needs_drop_raw` checks take the diagnostic and defer it as a compiler bug. This was a new piece of functionality added to the error handling infrastructure, and the idea is that for both these checks a "real" compiler error should be emitted elsewhere, so it's only a bug if we don't actually emit the complier error elsewhere. * MIR inlining was updated to just ignore the diagnostic. This is being tracked by #43542 which sounded like it either already had some work underway or was planning to change regardless. * The final case, `item_path`, is still sort of up for debate. At the time of this writing this PR simply removes the invocations of `try_get` there, assuming that the query will always succeed. This turns out to be true for the test suite anyway! It sounds like, though, that this logic was intended to assist in "weird" situations like `RUST_LOG` where debug implementations can trigger at any time. This PR would therefore, however, break those implementations. I'm unfortunately sort of out of ideas on how to handle `item_path`, but other thoughts would be welcome! Closes #42633
The whole goal is to move away from that strategy -- it is nondeterministic, and cycle recovery is problematic anyhow (we would like the invariant that "cycle detected => compile fails eventually"). My current plan is to do the following:
I think that @matthewhammer's scheme fits into this by adding, essentially, another layering of inlining that happens afterwards. That is, what I described above yields a result that is effectively the sort of "bottom-up" inlining that we are trying to do now. If MIR0 is the pre-inlining representation (from which the SCC is computed), then we have now built MIR1. We can then imagine inlining within an SCC -- that would be inlining MIR1 into MIR1, to yield MIR2.
This, however, may well be true! |
@nikomatsakis re: cycle recovery, I was referring to this part (one of two possible implementation strategies) of @matthewhammer's post:
This would be deterministic, but would still use cycle recovery (in a bening way). There might be other reasons to remove cycle recovery entirely, then this implementation strategy is undesirable anyway. I was just saying, even if this implementation strategy remains viable, it may not give very good results.
Aside to be clear: it is absolutely possible (and profitable) to inline within an SCC. LLVM does this sometimes. It may not be as easy (or at all possible) to achieve with the "query optimized MIR for other functions" model, and it may not be necessary or very useful for MIR optimizations, but in theory it's possible. |
Note that PR #68828 resolved the issue of inlining cycles in a more "pessimistic" way, as noted to me by @wesleywiser |
I do not think this is a tracking issue. It does track a potential work item, but I'm not convinced its broken down into subparts that warrant the C-tracking-issue label. @rustbot label: -C-tracking-issue |
After discussing with @wesleywiser , we're not sure there's value in leaving this open. If someone sees this and realizes its not done and wants to pick it up and do it, great. But its mostly just noise in our issue repository the way things stand now. |
When doing the "query-ification" of MIR, we handled MIR inlining by doing a "try-get", meaning that we attempt to inline all callees but ignore if a cycle arises. This is OK but leads to nondeterministic results where the order of queries in a SCC can affect what gets optimized into what. On #42633, I wrote:
And @michaelwoerister later wrote:
This issue is intending to track the proposed refactoring of the MIR inlining pass to not use try-get.
The text was updated successfully, but these errors were encountered: