-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document wf constraints on control flow in cleanup blocks #106612
Conversation
This PR changes MIR cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @celinval, @vakaras |
Advice on how to phrase an "I don't know graph theory"-readable description of this constraint much appreciated. It's unfortunately not exactly intuitive, but I also don't know a simpler version that is still within the bounds of Mir we generate and Mir codegen ICEs on |
For completeness, the MIR opt that can run afoul of this is #106613 (though it's definitely an optimization opportunity that this optimization opens up that is the direct problem) |
Can you give an example? It would be nice to add a test case that fails validation once custom MIR supports cleanup blocks. Implementations check that each vertex has at most one successor in the resulting graph. What rules out cycles? |
Consider this cfg: Everything besides a is a cleanup block. If our reverse postorder is abed we make three funclets and everything is fine. If our reverse postorder is instead adeb, then when processing the b->e edge, we'll attempt to make a new funclet at e, which will cause an ICE when it adds a second successor to d.
Oops. Will fix |
Thanks for explanation, I see now what you mean regarding iteration order. As far as I understand, with caveat that I am not really familiar with Windows specific parts of exception handling code generation, in the example resulting nesting is invalid, since it doesn't satisfy "the funclet pads’ unwind destinations cannot form a cycle" condition. The existing checks are simply incomplete. |
Yeah, I don't think these were really intended to be "checks" in the first place, just "ICEing is the most reasonable thing to do in this code path." |
@rustbot author as well, for the cycle fix |
@tmiasko can you take over the review? I'm a bit lost here... |
/// 4. The induced subgraph on cleanup blocks must look roughly like an upside down tree. This is | ||
/// necessary to ensure that landing pad information can be correctly codegened. More precisely: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the "subgraph on cleanup blocks"? Is it the graph where you remove all non-cleanup blocks and only keep edges between cleanup blocks?
What exactly is forbidden here? Is it essentially any kind of merging control flow? As in, every cleanup block must only have one cleanup block predecessor -- is that the condition? The analysis you implemented seems a lot more complicated than that, and I am lost in all the domination talk...
Or does the "upside down" mean that every cleanup block most only have one cleanup block successor? Branching is upside down merging. Though why branching control flow would be an issue is beyond me.
But anyway the usual definition of a (directed) forest is that each node has at most one predecessor and there are no cycles, so that might be a less graph-theoretic way of saying it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the "subgraph on cleanup blocks"? Is it the graph where you remove all non-cleanup blocks and only keep edges between cleanup blocks?
Yes. I'll spell this out. I also realize this should have said "subgraph on reachable cleanup blocks" so will fix that as well.
Or does the "upside down" mean that every cleanup block most only have one cleanup block successor?
This is the idea, but only after contracting dominators. One of the things that makes this constraint hard to understand is that there are no local structures that are disallowed in the graph. Indeed, if the cleanup blocks have just one entrance point, then there are no restrictions at all (since that entrance point will necessarily dominate all other reachable cleanup blocks). I don't think we can actually strengthen the restriction much either, since drop elab actually generates cycles (for dropping arrays) and branches (for drop flags) .
Instead this constraint is about how control flow can merge "between entrance points to cleanup code." One simple example of a graph that is disallowed is the one above, another example is a W shaped graph (all edges pointing down).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was hoping I could avoid thinking about dominators. ;) I never delved deeply enough into compilers to get solid intuition for them... In this case, do you compute dominators before
So, an example of something we cannot have is a control flow split where one of the successors is also reachable from another independent cleanup path? But a "V", where two independent cleanup paths merge, is okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative definition that doesn't use dominators.
It must be possible to partition reachable cleanup blocks into groups such that:
- Each group has a unique entry block. Edges from outside into the group must target the group's entry block.
- Edges from within a particular group to outside must target the same block.
- Edges in-between groups cannot form a cycle.
- There are no restriction on edges within the group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never delved deeply enough into compilers to get solid intuition for them... In this case, do you compute dominators before
End of the sentence may have gotten lost there? Otherwise you'll have to expand a bit
So, an example of something we cannot have is a control flow split where one of the successors is also reachable from another independent cleanup path? But a "V", where two independent cleanup paths merge, is okay?
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edges from outside into the group must target the group's entry block.
Obviously a nit but note that we have to restrict to edges from reachable blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
End of the sentence may have gotten lost there? Otherwise you'll have to expand a bit
Sorry... got interrupted and then forgot to finish the sentence.^^
I was about to ask whether dominance is computed before or after restricting the graph to cleanup blocks. The comment should clarify that.
For me personally this restriction is so arbitrary and unmotivated that the examples would also be helpful, but it is very possible that this makes more sense to others. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to ask whether dominance is computed before or after restricting the graph to cleanup blocks. The comment should clarify that.
Dominance is traditionally only defined for a control flow graph (ie a directed graph with a "start" vertex), so it doesn't really make sense to compute it after restricting to cleanup blocks, which typically do not form a control flow graph. That being said, there is a fairly natural extension of the definition (in which we consider a set of start vertices, instead of just one). In that case it doesn't matter which we pick. I'll clarify anyway.
For me personally this restriction is so arbitrary and unmotivated that the examples would also be helpful, but it is very possible that this makes more sense to others. :)
Yeah, I don't really understand it either, but I also don't feel like going and reading the Itanium ABI spec to find out...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constraints are described in https://llvm.org/docs/ExceptionHandling.html#funclet-transitions. Their relation to MIR can only be understood in the context of code generation for MSVC target (other targets don't use funclets).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I thought Itanium did as well. Interesting
7d57491
to
54a5c42
Compare
This comment has been minimized.
This comment has been minimized.
54a5c42
to
7057f83
Compare
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit 7057f835cfc7bf395bef792e116575864578123a with merge 027f53d68204819904d833b75f4a81a74686fd0a... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
r? @tmiasko |
7057f83
to
4bc963e
Compare
Maybe regular crater runs under |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit 4bc963e with merge 8e42a53653740a9bdf11fbf5f1c3d28e4d2e6d01... |
The |
From the current appearance of codegen, it seems to be unconditionally used. I don't really understand this code well enough to be sure though |
Yes, but for
|
Ah, ok. Yeah, that seems reasonable, should probably be done separately from this PR though |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (8e42a53653740a9bdf11fbf5f1c3d28e4d2e6d01): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
|
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (f34cc65): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. |
Was recently made aware of this code, which has this potential ICE:
rust/compiler/rustc_codegen_ssa/src/mir/analyze.rs
Lines 308 to 314 in a377893
Roughly speaking, the code there is attempting to partition the cleanup blocks into funclets that satisfy a "unique successor" property, and the ICE is set off if that's not possible. This PR documents the well-formedness constraints that MIR must satisfy to avoid setting off that ICE.
The constraints documented are slightly stronger than the cases in which the ICE would have been set off in that code. This is necessary though, since whether or not that ICE gets set off can depend on iteration order in some graphs.
This sort of constraint is kind of ugly, but I don't know a better alternative at the moment. It's worth knowing that two important optimizations are still correct:
There is definitely a MIR opt somewhere that can run afoul of this, but I don't know where it is. @saethlin was able to set it off though, so maybe he'll be able to shed some light on it.
r? @RalfJung I suppose, and cc @tmiasko who might have insight/opinions on this