rewrite the predecessors code to create a reduced graph #39424
Conversation
The old code created a flat listing of "HIR -> WorkProduct" edges. While perfectly general, this could lead to a lot of repetition if the same HIR nodes affect many work-products. This is set to be a problem when we start to skip typeck, since we will be adding a lot more "work-product"-like nodes.

The newer code uses an alternative strategy: it "reduces" the graph instead. Basically we walk the dep-graph and convert it to a DAG, where we only keep intermediate nodes if they are used by multiple work-products.

This DAG does not contain the same set of nodes as the original graph, but it is guaranteed that (a) every output node is included in the graph and (b) the set of input nodes that can reach each output node is unchanged.

(Input nodes are basically HIR nodes and foreign metadata; output nodes are nodes that have associated state which we will persist to disk in some way. These are assumed to be disjoint sets.)

r? @michaelwoerister

Fixes #39494
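For concreteness, invariant (b) says: for each output node, the set of input nodes that can reach it by walking predecessor edges must be the same before and after reduction. The sketch below shows one way such a set could be computed on a toy graph; the `Node` type, `Graph` shape, and `reachable_inputs` are illustrative only, not the rustc data structures.

```rust
use std::collections::{HashMap, HashSet};

type Node = &'static str;

/// Toy dep-graph with edges stored as `target -> predecessors`.
struct Graph {
    preds: HashMap<Node, Vec<Node>>,
}

impl Graph {
    /// The set of input nodes that can reach `output` by walking
    /// predecessor edges; this is the set that reduction must preserve.
    fn reachable_inputs(&self, output: Node, is_input: impl Fn(Node) -> bool) -> HashSet<Node> {
        let mut seen = HashSet::new();
        let mut inputs = HashSet::new();
        let mut stack = vec![output];
        while let Some(n) = stack.pop() {
            if !seen.insert(n) {
                continue;
            }
            if is_input(n) {
                inputs.insert(n);
            }
            if let Some(ps) = self.preds.get(n) {
                stack.extend(ps.iter().copied());
            }
        }
        inputs
    }
}

fn main() {
    // HIR inputs feed a shared intermediate node, which feeds two work-products.
    let graph = Graph {
        preds: HashMap::from([
            ("Typeck", vec!["Hir1", "Hir2"]),
            ("WorkProduct1", vec!["Typeck"]),
            ("WorkProduct2", vec!["Typeck", "Hir3"]),
        ]),
    };
    let is_input = |n: Node| n.starts_with("Hir");
    assert_eq!(
        graph.reachable_inputs("WorkProduct2", is_input),
        HashSet::from(["Hir1", "Hir2", "Hir3"])
    );
}
```

In this toy graph the shared "Typeck" node can be kept once in the reduced DAG rather than duplicating a "HIR -> WorkProduct" edge per work-product, while the reachable-input sets stay unchanged.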
@michaelwoerister I did some measurements of syntex-syntax. The rust master takes 3.6 seconds to encode the dep-graph. This branch takes 2.6 seconds. So this seems like a clear win overall.
I did one pass over this and it looks very good! I want to take another, closer look at the implementation and tests for the graph reduction algorithm.
let mut len = 0;
while len != dirty_nodes.len() {
    len = dirty_nodes.len();
    for edge in edges {
This implementation looks a bit inefficient.
Yeah, I was lazy. :) I can rewrite it to use a work-list or some such thing. One thing is that we don't have the edges indexed by their target, which we would want here. Perhaps I'll change how things are serialized to be a `Map<Target, Vec<Source>>`. I think we even build one of those in the "reduction" algorithm, so perhaps I should just return that (i.e., don't return a `Graph`, but some kind of `ReducedGraph`). I have to look at how the code works there.
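To illustrate the direction being suggested here (a sketch only, not the code that eventually landed): if the serialized edges are available as a `Map<Target, Vec<Source>>`, dirtiness can be computed in a single memoized walk over predecessors instead of re-scanning the whole edge list until a fixed point is reached. The names below (`Node`, `is_dirty`, `initially_dirty`) are hypothetical.

```rust
use std::collections::HashMap;

type Node = u32;

/// Returns whether `node` is dirty, memoizing results so each node and edge
/// is visited at most once. `preds` maps each target node to its sources.
fn is_dirty(
    node: Node,
    preds: &HashMap<Node, Vec<Node>>,
    initially_dirty: &dyn Fn(Node) -> bool,
    memo: &mut HashMap<Node, bool>,
) -> bool {
    if let Some(&cached) = memo.get(&node) {
        return cached;
    }
    // Tentatively mark clean so the walk terminates even if a cycle slips in.
    memo.insert(node, false);
    let dirty = initially_dirty(node)
        || preds.get(&node).map_or(false, |sources| {
            sources
                .iter()
                .any(|&src| is_dirty(src, preds, initially_dirty, memo))
        });
    memo.insert(node, dirty);
    dirty
}
```

With the predecessor map in hand, the dirty set falls out of one traversal, roughly O(nodes + edges), rather than O(rounds × edges) for the fixed-point loop above.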
}

impl DagId {
    pub fn from_in_index(n: NodeIndex) -> Self {
Where does the `in` in `from_in_index` come from? Wouldn't `from_node_index` be clearer?
Hmm, I meant `in` as in "the index in the INPUT graph". Perhaps `input_index`?
]);
}

//#[test]
Unported test.
Ah yes I forgot about that one.
@@ -178,7 +179,9 @@ pub fn encode_dep_graph(preds: &Predecessors,
// Create a flat list of (Input, WorkProduct) edges for
That's not quite true anymore...
The old algorithm was O(graph)
@michaelwoerister I addressed (I think) your feedback, with the exception of the commented-out test, jfyi.
@michaelwoerister ok, ported the test.
let mut reduce = GraphReduce::new(&graph, |n| inputs.contains(n), |n| outputs.contains(n));
Classify::new(&mut reduce).walk();

assert!(reduce.in_cycle(nodes("B"), nodes("C")));
Can you also add `assert!(reduce.in_cycle(nodes("A"), nodes("C")));` and `assert!(reduce.in_cycle(nodes("A"), nodes("B")));`?
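Putting the suggested assertions together with the existing one, the test body would read roughly as follows (a sketch based on the snippet above; `graph`, `inputs`, `outputs`, and `nodes` come from the surrounding test file):

```rust
let mut reduce = GraphReduce::new(&graph, |n| inputs.contains(n), |n| outputs.contains(n));
Classify::new(&mut reduce).walk();

// Every pair of nodes in the cycle should be reported as cyclic together.
assert!(reduce.in_cycle(nodes("A"), nodes("B")));
assert!(reduce.in_cycle(nodes("A"), nodes("C")));
assert!(reduce.in_cycle(nodes("B"), nodes("C")));
```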
OK, I did another pass. Looks good to me. The graph reduction algorithm is very nice!
@bors r=mw
📌 Commit b3096e2 has been approved by mw
☀️ Test successful - status-appveyor, status-travis