Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use existing framework for backward dataflow analyses #71006

Merged
merged 11 commits into from
May 3, 2020

Conversation

ecstatic-morse
Copy link
Contributor

@ecstatic-morse ecstatic-morse commented Apr 10, 2020

This PR adds support for backward analyses to the dataflow framework and adds a new live variable analysis (based on the existing one in librustc_mir/util/liveness.rs). By adding these to the framework instead of having a separate API, all newly implemented backward dataflow analyses get cursors/visitors, rustc_peek tests, and graphviz visualizations for free. In the near-term, this makes it much easier to implement global dead-store elimination, and I believe that this will enable even more MIR optimizations in the future.

This PR makes many changes to the dataflow API, since some concepts and terminology only make sense in forward dataflow. Below is a list of the important changes.

  • entry_set -> fixpoint (the fixpoint for backward dataflow problems is after the block's terminator)
  • seek_{before,after} -> seek_{before,after}_primary_effect (the unprefixed dataflow effect is now referred to as the "primary" effect instead of the "after" effect. The "before" effect remains the same, although I considered changing it to the "antecedent" effect. In both backward and forward dataflow, the "before" effect is applied prior to the "primary" effect. I feel very strongly that this is the correct choice, as it means consumers don't have to switch between seek_before and seek_after based on the direction of their analysis.
  • seek_after_assume_call_returns is now gone. Users can use ResultsCursor::apply_custom_effect to emulate it.
  • visit_{statement,terminator}_exit -> visit_{statement,terminator}_after_primary_effect
  • visit_{statement,terminator} -> visit_{statement,terminator}_before_primary_effect

Implementing this also required refactoring the dataflow cursor implementation so it could work in both directions. This is a large percentage of the diff, since the cursor code is rather complex. The fact that the cursor is exhaustively tested in both directions should reassure whomever is unlucky enough to review this 🤣.

In order to avoid computing the reverse CFG for forward dataflow analyses, I've added some hacks to the existing mir::BodyAndCache interface. I've requested changes to this interface that would let me implement this more efficiently.

r? @eddyb (feel free to reassign)
cc @rust-lang/wg-mir-opt

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 10, 2020
@ecstatic-morse ecstatic-morse marked this pull request as draft April 10, 2020 20:29
Copy link
Member

@eddyb eddyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM at a glance

@eddyb
Copy link
Member

eddyb commented Apr 10, 2020

r? @pnkfelix

@rust-highfive rust-highfive assigned pnkfelix and unassigned eddyb Apr 10, 2020
Comment on lines +271 to +273
PlaceContext::MutatingUse(MutatingUseContext::Call)
| PlaceContext::MutatingUse(MutatingUseContext::Yield) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PlaceContext::MutatingUse(MutatingUseContext::Call)
| PlaceContext::MutatingUse(MutatingUseContext::Yield) => {
PlaceContext::MutatingUse(MutatingUseContext::Call | MutatingUseContext::Yield) => {

src/librustc_mir/dataflow/framework/backward.rs Outdated Show resolved Hide resolved
src/librustc_mir/dataflow/framework/backward.rs Outdated Show resolved Hide resolved
src/librustc_mir/dataflow/impls/liveness.rs Outdated Show resolved Hide resolved
@ecstatic-morse
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Apr 11, 2020

⌛ Trying commit 21b30e641d32ff5312258093bb349577d80a6587 with merge 5efb242d5baf509eed2d5015d3d2d01efd652224...

@bors
Copy link
Contributor

bors commented Apr 11, 2020

☀️ Try build successful - checks-azure
Build commit: 5efb242d5baf509eed2d5015d3d2d01efd652224 (5efb242d5baf509eed2d5015d3d2d01efd652224)

@rust-timer
Copy link
Collaborator

Queued 5efb242d5baf509eed2d5015d3d2d01efd652224 with parent 7688266, future comparison URL.

@rust-highfive
Copy link
Collaborator

The job mingw-check of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2020-04-20T01:05:07.7890114Z ========================== Starting Command Output ===========================
2020-04-20T01:05:07.7906628Z [command]/bin/bash --noprofile --norc /home/vsts/work/_temp/478ddafa-a471-41c7-8e74-3da09694fee1.sh
2020-04-20T01:05:07.8112606Z 
2020-04-20T01:05:07.8164104Z ##[section]Finishing: Disable git automatic line ending conversion
2020-04-20T01:05:07.8179120Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/71006/merge to s
2020-04-20T01:05:07.8181993Z Task         : Get sources
2020-04-20T01:05:07.8182222Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-04-20T01:05:07.8182459Z Version      : 1.0.0
2020-04-20T01:05:07.8182614Z Author       : Microsoft
---
2020-04-20T01:05:08.6102294Z ##[command]git remote add origin https://github.com/rust-lang/rust
2020-04-20T01:05:08.6106942Z ##[command]git config gc.auto 0
2020-04-20T01:05:08.6110020Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2020-04-20T01:05:08.6112836Z ##[command]git config --get-all http.proxy
2020-04-20T01:05:08.6118492Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/71006/merge:refs/remotes/pull/71006/merge
---
2020-04-20T01:08:38.8966367Z  ---> 78ad2f4d4aca
2020-04-20T01:08:38.8966560Z Step 6/7 : ENV RUN_CHECK_WITH_PARALLEL_QUERIES 1
2020-04-20T01:08:38.8971296Z  ---> Using cache
2020-04-20T01:08:38.8971714Z  ---> 4d2dc61c4d00
2020-04-20T01:08:38.8973169Z Step 7/7 : ENV SCRIPT python3 ../x.py test src/tools/expand-yaml-anchors &&            python3 ../x.py check --target=i686-pc-windows-gnu --host=i686-pc-windows-gnu &&            python3 ../x.py build --stage 0 src/tools/build-manifest &&            python3 ../x.py test --stage 0 src/tools/compiletest &&            python3 ../x.py test src/tools/tidy &&            /scripts/validate-toolstate.sh
2020-04-20T01:08:38.8974697Z  ---> 776b6266a8b7
2020-04-20T01:08:38.9021633Z Successfully built 776b6266a8b7
2020-04-20T01:08:38.9079987Z Successfully tagged rust-ci:latest
2020-04-20T01:08:38.9312768Z Built container sha256:776b6266a8b7d63e2d3c2b5a784dbf521184a904fb10bf818c6b5c7e1ab74d4a
2020-04-20T01:08:38.9312768Z Built container sha256:776b6266a8b7d63e2d3c2b5a784dbf521184a904fb10bf818c6b5c7e1ab74d4a
2020-04-20T01:08:38.9326970Z Looks like docker image is the same as before, not uploading
2020-04-20T01:08:40.7147279Z [CI_JOB_NAME=mingw-check]
2020-04-20T01:08:40.7454855Z [CI_JOB_NAME=mingw-check]
2020-04-20T01:08:40.7483931Z == clock drift check ==
2020-04-20T01:08:40.7491976Z   local time: Mon Apr 20 01:08:40 UTC 2020
2020-04-20T01:08:41.0187870Z   network time: Mon, 20 Apr 2020 01:08:41 GMT
2020-04-20T01:08:41.0216394Z Starting sccache server...
2020-04-20T01:08:41.1239906Z configure: processing command line
2020-04-20T01:08:41.1240633Z configure: 
2020-04-20T01:08:41.1241738Z configure: rust.parallel-compiler := True
---
2020-04-20T01:12:18.2570113Z     Checking rustc_feature v0.0.0 (/checkout/src/librustc_feature)
2020-04-20T01:12:18.2886140Z     Checking fmt_macros v0.0.0 (/checkout/src/libfmt_macros)
2020-04-20T01:12:18.4614253Z     Checking rustc_ast_pretty v0.0.0 (/checkout/src/librustc_ast_pretty)
2020-04-20T01:12:18.6819274Z     Checking rustc_hir v0.0.0 (/checkout/src/librustc_hir)
2020-04-20T01:12:19.0325666Z     Checking rustc_query_system v0.0.0 (/checkout/src/librustc_query_system)
2020-04-20T01:12:21.2022507Z     Checking rustc_attr v0.0.0 (/checkout/src/librustc_attr)
2020-04-20T01:12:21.6607323Z     Checking rustc_parse v0.0.0 (/checkout/src/librustc_parse)
2020-04-20T01:12:23.5329947Z     Checking rustc_hir_pretty v0.0.0 (/checkout/src/librustc_hir_pretty)
2020-04-20T01:12:23.9341910Z     Checking rustc_ast_lowering v0.0.0 (/checkout/src/librustc_ast_lowering)
---
2020-04-20T01:14:03.4053720Z configure: llvm.assertions      := True
2020-04-20T01:14:03.4054168Z configure: rust.channel         := nightly
2020-04-20T01:14:03.4054811Z configure: rust.debug-assertions := True
2020-04-20T01:14:03.4056778Z configure: build.cargo-native-static := True
2020-04-20T01:14:03.4057801Z configure: build.configure-args := ['--enable-sccache', '--disable-manage-submodu ...
2020-04-20T01:14:03.4058890Z configure: writing `config.toml` in current directory
2020-04-20T01:14:03.4059429Z configure: 
2020-04-20T01:14:03.4060044Z configure: run `python /checkout/x.py --help`
2020-04-20T01:14:03.4060506Z configure: 
---
2020-04-20T01:15:31.2014260Z Hugepagesize:       2048 kB
2020-04-20T01:15:31.2014450Z DirectMap4k:      149440 kB
2020-04-20T01:15:31.2014628Z DirectMap2M:     4044800 kB
2020-04-20T01:15:31.2014807Z DirectMap1G:     5242880 kB
2020-04-20T01:15:31.2085511Z + python3 ../x.py test src/tools/expand-yaml-anchors
2020-04-20T01:15:32.4272388Z Ensuring the YAML anchors in the GitHub Actions config were expanded
2020-04-20T01:15:32.4272388Z Ensuring the YAML anchors in the GitHub Actions config were expanded
2020-04-20T01:15:32.4281448Z Building stage0 tool expand-yaml-anchors (x86_64-unknown-linux-gnu)
2020-04-20T01:15:32.6346788Z    Compiling unicode-xid v0.2.0
2020-04-20T01:15:32.7570424Z    Compiling syn v1.0.11
2020-04-20T01:15:33.5352574Z    Compiling linked-hash-map v0.5.2
2020-04-20T01:15:33.5531758Z    Compiling lazy_static v1.4.0
2020-04-20T01:15:33.5531758Z    Compiling lazy_static v1.4.0
2020-04-20T01:15:33.7409895Z    Compiling yaml-rust v0.4.3
2020-04-20T01:15:37.7272553Z    Compiling quote v1.0.2
2020-04-20T01:15:51.1795661Z    Compiling thiserror-impl v1.0.5
2020-04-20T01:15:55.5974983Z    Compiling thiserror v1.0.5
2020-04-20T01:15:55.6484152Z    Compiling yaml-merge-keys v0.4.0
2020-04-20T01:15:56.7445697Z    Compiling expand-yaml-anchors v0.1.0 (/checkout/src/tools/expand-yaml-anchors)
2020-04-20T01:15:58.4223554Z Build completed successfully in 0:00:27
2020-04-20T01:15:58.4303737Z + python3 ../x.py check --target=i686-pc-windows-gnu --host=i686-pc-windows-gnu
2020-04-20T01:15:58.6681564Z     Finished dev [unoptimized] target(s) in 0.16s
2020-04-20T01:15:59.6332732Z Checking rustdoc artifacts (x86_64-unknown-linux-gnu -> i686-pc-windows-gnu)
---
2020-04-20T01:17:50.1447983Z     Checking rustc_span v0.0.0 (/checkout/src/librustc_span)
2020-04-20T01:17:54.3773869Z     Checking rustc_errors v0.0.0 (/checkout/src/librustc_errors)
2020-04-20T01:17:55.5163940Z     Checking rustc_feature v0.0.0 (/checkout/src/librustc_feature)
2020-04-20T01:17:55.5481545Z     Checking fmt_macros v0.0.0 (/checkout/src/libfmt_macros)
2020-04-20T01:17:55.7246945Z     Checking rustc_query_system v0.0.0 (/checkout/src/librustc_query_system)
2020-04-20T01:17:56.4000564Z     Checking rustc_hir v0.0.0 (/checkout/src/librustc_hir)
2020-04-20T01:17:56.4611692Z     Checking rustc_session v0.0.0 (/checkout/src/librustc_session)
2020-04-20T01:17:57.8839104Z     Checking rustc_attr v0.0.0 (/checkout/src/librustc_attr)
2020-04-20T01:17:58.3153138Z     Checking rustc_parse v0.0.0 (/checkout/src/librustc_parse)
---
2020-04-20T01:21:46.2825764Z skip untracked path cpu-usage.csv during rustfmt invocations
2020-04-20T01:21:46.2830615Z skip untracked path src/doc/book/ during rustfmt invocations
2020-04-20T01:21:46.2835025Z skip untracked path src/doc/rust-by-example/ during rustfmt invocations
2020-04-20T01:21:46.2835980Z skip untracked path src/llvm-project/ during rustfmt invocations
2020-04-20T01:21:51.1168553Z Diff in /checkout/src/librustc_mir/dataflow/framework/direction.rs at line 3:
2020-04-20T01:21:51.1173251Z  use rustc_middle::ty::{self, TyCtxt};
2020-04-20T01:21:51.1177263Z  use std::ops::RangeInclusive;
2020-04-20T01:21:51.1181443Z  
2020-04-20T01:21:51.1185473Z +use super::visitor::{ResultsVisitable, ResultsVisitor};
2020-04-20T01:21:51.1195228Z  use super::{Analysis, Effect, EffectIndex, GenKillAnalysis, GenKillSet};
2020-04-20T01:21:51.1200234Z -use super::visitor::{ResultsVisitor, ResultsVisitable};
2020-04-20T01:21:51.1228355Z  pub trait Direction {
2020-04-20T01:21:51.1228873Z      fn is_forward() -> bool;
2020-04-20T01:21:51.1228873Z      fn is_forward() -> bool;
2020-04-20T01:21:51.1229138Z Diff in /checkout/src/librustc_mir/dataflow/framework/direction.rs at line 11:
2020-04-20T01:21:51.1229374Z  
2020-04-20T01:21:51.1229741Z -    fn is_backward() -> bool { !Self::is_forward() }
2020-04-20T01:21:51.1230109Z +    fn is_backward() -> bool {
2020-04-20T01:21:51.1230310Z +        !Self::is_forward()
2020-04-20T01:21:51.1230566Z  
2020-04-20T01:21:51.1230566Z  
2020-04-20T01:21:51.1230748Z      /// Applies all effects between the given `EffectIndex`s.
2020-04-20T01:21:51.1230954Z      ///
2020-04-20T01:21:51.1231193Z Diff in /checkout/src/librustc_mir/dataflow/framework/direction.rs at line 64:
2020-04-20T01:21:51.1231434Z  pub struct Backward;
2020-04-20T01:21:51.1231713Z  impl Direction for Backward {
2020-04-20T01:21:51.1231713Z  impl Direction for Backward {
2020-04-20T01:21:51.1232057Z -    fn is_forward() -> bool { false }
2020-04-20T01:21:51.1232400Z +    fn is_forward() -> bool {
2020-04-20T01:21:51.1232682Z +    }
2020-04-20T01:21:51.1232800Z  
2020-04-20T01:21:51.1232800Z  
2020-04-20T01:21:51.1232944Z      fn apply_effects_in_block<A>(
2020-04-20T01:21:51.1233125Z          analysis: &A,
2020-04-20T01:21:51.1233379Z Diff in /checkout/src/librustc_mir/dataflow/framework/direction.rs at line 190:
2020-04-20T01:21:51.1233645Z          results: &R,
2020-04-20T01:21:51.1234047Z          vis: &mut impl ResultsVisitor<'mir, 'tcx, FlowState = F>,
2020-04-20T01:21:51.1234257Z      ) where
2020-04-20T01:21:51.1234610Z -        R: ResultsVisitable<'tcx, FlowState = F>
2020-04-20T01:21:51.1235001Z +        R: ResultsVisitable<'tcx, FlowState = F>,
2020-04-20T01:21:51.1235186Z      {
2020-04-20T01:21:51.1235381Z          results.reset_to_block_fixpoint(state, block);
2020-04-20T01:21:51.1235550Z  
2020-04-20T01:21:51.1235771Z Diff in /checkout/src/librustc_mir/dataflow/framework/direction.rs at line 271:
2020-04-20T01:21:51.1236026Z  pub struct Forward;
2020-04-20T01:21:51.1236286Z  impl Direction for Forward {
2020-04-20T01:21:51.1236286Z  impl Direction for Forward {
2020-04-20T01:21:51.1236632Z -    fn is_forward() -> bool { true }
2020-04-20T01:21:51.1236957Z +    fn is_forward() -> bool {
2020-04-20T01:21:51.1237234Z +    }
2020-04-20T01:21:51.1237354Z  
2020-04-20T01:21:51.1237354Z  
2020-04-20T01:21:51.1237497Z      fn apply_effects_in_block<A>(
2020-04-20T01:21:51.1237677Z          analysis: &A,
2020-04-20T01:21:51.1237946Z Diff in /checkout/src/librustc_mir/dataflow/framework/direction.rs at line 393:
2020-04-20T01:21:51.1238196Z          results: &R,
2020-04-20T01:21:51.1238594Z          vis: &mut impl ResultsVisitor<'mir, 'tcx, FlowState = F>,
2020-04-20T01:21:51.1238824Z      ) where
2020-04-20T01:21:51.1239162Z -        R: ResultsVisitable<'tcx, FlowState = F>
2020-04-20T01:21:51.1239552Z +        R: ResultsVisitable<'tcx, FlowState = F>,
2020-04-20T01:21:51.1239744Z      {
2020-04-20T01:21:51.1239927Z          results.reset_to_block_fixpoint(state, block);
2020-04-20T01:21:51.1240095Z  
2020-04-20T01:21:51.1240851Z Running `"/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/rustfmt" "--config-path" "/checkout" "--edition" "2018" "--unstable-features" "--skip-children" "--check" "/checkout/src/librustc_mir/dataflow/framework/direction.rs"` failed.
2020-04-20T01:21:51.1241642Z If you're running `tidy`, try again with `--bless` flag. Or, you just want to format code, run `./x.py fmt` instead.
2020-04-20T01:21:51.1284672Z Build completed unsuccessfully in 0:00:39
2020-04-20T01:21:51.1336830Z == clock drift check ==
2020-04-20T01:21:51.1352011Z   local time: Mon Apr 20 01:21:51 UTC 2020
2020-04-20T01:21:51.1352011Z   local time: Mon Apr 20 01:21:51 UTC 2020
2020-04-20T01:21:51.4102966Z   network time: Mon, 20 Apr 2020 01:21:51 GMT
2020-04-20T01:21:52.8260180Z 
2020-04-20T01:21:52.8260180Z 
2020-04-20T01:21:52.8344880Z ##[error]Bash exited with code '1'.
2020-04-20T01:21:52.8373446Z ##[section]Finishing: Run build
2020-04-20T01:21:52.8418714Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/71006/merge to s
2020-04-20T01:21:52.8423473Z Task         : Get sources
2020-04-20T01:21:52.8423773Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-04-20T01:21:52.8424030Z Version      : 1.0.0
2020-04-20T01:21:52.8424212Z Author       : Microsoft
2020-04-20T01:21:52.8424212Z Author       : Microsoft
2020-04-20T01:21:52.8424518Z Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
2020-04-20T01:21:52.8424846Z ==============================================================================
2020-04-20T01:21:53.1530807Z Cleaning any cached credential from repository: rust-lang/rust (GitHub)
2020-04-20T01:21:53.1570642Z ##[section]Finishing: Checkout rust-lang/rust@refs/pull/71006/merge to s
2020-04-20T01:21:53.1651039Z Cleaning up task key
2020-04-20T01:21:53.1652159Z Start cleaning up orphan processes.
2020-04-20T01:21:53.1820105Z Terminate orphan process: pid (3499) (python)
2020-04-20T01:21:53.2422497Z ##[section]Finishing: Finalize Job

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @rust-lang/infra. (Feature Requests)

@ecstatic-morse ecstatic-morse marked this pull request as ready for review April 20, 2020 04:45
@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented Apr 20, 2020

I'm now using an associated type instead of an associated const to indicate the direction of a dataflow analysis. This allowed me to remove some if forward {} else {} boilerplate. It also made the visitor module much cleaner.

@pnkfelix Let me know what I can do to ease review of this. An intra-block liveness analysis will make some MIR optimizations easier to implement, and I'd like to start experimenting. It will also be useful for #51003.

@ecstatic-morse ecstatic-morse changed the title [WIP] Use existing framework for backward dataflow analyses Use existing framework for backward dataflow analyses Apr 20, 2020
@bors
Copy link
Contributor

bors commented Apr 22, 2020

☔ The latest upstream changes (presumably #71323) made this pull request unmergeable. Please resolve the merge conflicts.

assert!(
cache.predecessors.is_some(),
"Cannot construct ReadOnlyBodyAndCache without computed predecessors"
);
Self { body, cache }
}

pub fn new_unchecked(body: &'a Body<'tcx>, cache: &'a Cache) -> Self {
Copy link
Member

@pnkfelix pnkfelix Apr 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method doesn't have any documentation, so I cannot tell if it is meant to be an efficiency hack (which seems very unlikely, since an assert of is_some() should be incredibly cheap to check), or if it is something where at the call site (or at future expected call sites) you cannot guarantee that cache.predecessors.is_some().

(I briefly tried to infer the intent by looking at the diff, so that I might suggest what comment to write here, or whether to advise removing the method entirely... From what I can tell, at the single call site where this is called, you are invoking cache.ensure_predecessors before this is invoked for the backwards flow case, so that will ensure that cache.predecessors.is_some(). But I'm guessing that predecessors is none for the forwards flow case...)

This is all a long winded way of saying that you should spell out in a doc-comment under what circumstances you expect ReadOnlyBodyAndCache::new_unchecked to be invoked rather than ReadOnlyBodyAndCache::new. (If the whole system doesn't require cache.predecessors.is_some(), then why have the assert in fn new? And if the whole system does require it, then when can this be safely used?)

Copy link
Contributor Author

@ecstatic-morse ecstatic-morse Apr 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the part that will be removed after #71044 lands. Sorry. That should have been in the commit message along with HACK.

This is here because we want join_state_into_successors to have the same signature for both forward and backward analyses. A ReadOnlyBodyAndCache is required to access a basic block predecessors, but the current API requires that you always compute and cache the predecessor graph before creating a ReadOnlyBodyAndCache. This would be wasted work for the vast majority of dataflow analyses, which run forward.

FWIW, figuring this out how this was supposed to work caused me a great deal of consternation and was the motivation for #71044, which goes back to using Body everywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should fold #71044 into this PR? I'll finish looking at this PR first before I look at that one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh but perhaps you are concerned about performance issues with PR #71044, while you are confident that this PR should not inject performance problems, and thus it would be better to land them separately; is that a correct inference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I was also concerned about regressions from this PR. I wanted to benchmark #71044 separately from this one, since the concerns/potential performance issues are very different in that PR (additional locking and slow serialization) than this one (renaming APIs and additional layers of abstraction).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rebased on the latest master, which includes #71044. The HACK commit has been removed. Now both forward and backward dataflow analyses can use Body everywhere, and the reverse CFG will only be computed when necessary (see the most recent commit).

@pnkfelix
Copy link
Member

pnkfelix commented Apr 22, 2020

A general comment on one particular change:

  • entry_set -> fixpoint (the fixpoint for backward dataflow problems is after the block's terminator)

I believe I understand your motivation for trying to find an alternative to "entry set", but I think this rename makes things more confusing.

I'm going to spell out my thinking explicitly here; little (or perhaps none) of this will be news to you, @ecstatic-morse , but I want to write it down to justify why I think the above rename is ill chosen. You can skip to the Bikeshed section at the end if you like.

Digression

FIxed Points

A fixed point of a function f: D -> D is some input d that maps to itself (i.e. f(d) = d). In the context of a dataflow analysis, the domain D are the bitvectors that correspond to the state of the analysis at various points in the control flow of the program, and the analysis searches for a set of bitvectors such that the transfer functions attached to each block (and the operations for combining them at control flow merge points) end up causing no change to those bitvectors (i.e., each such bitvector is mapped to itself).

In short, to me (in the context of a dataflow analysis) a fixpoint is a point in the state-space for the bitvectors themselves.

Entry sets

In the data flow code itself, the entry set for a block is the bitvector associated with a particular point in the control flow of the block.

The term "fixpoint" for this thing seems inappropriate. The entry set may or may not be a bitvector that has reached a fixed point; in general that will depend on whether the whole analysis has reached a fixed point. And the entry set is definitely used in the code before the analysis has reached a fixed point.

In any case, it just seems like a mismatch to use a term that describes a point in the analysis state space to describe a part of the data structure that is most strongly associated with control flow.

The term "entry set" is meant to convey that this particular bitvector is the one that represents the analysis state on entry to the block.

As I said above, I can understand wanting to rename this, since one is likely to think that the entry point corresponds to when control flows into the block; but for a backwards analysis, these bitvectors stored for each block will be associated with the point where control flows out of the block (i.e., the point immediately after the block’s Terminator). If “entry” is too strongly correlated with (forward) control-flow entry, then “entry set” is a problem.

Bikeshed

So the options here as I see them are either:

  • Leave the name entry_set as is, but try to clarify in the comments and/or surrounding docs that the word “entry” here means with respect to the direction of the analysis, not the direction of the actual control flow,

  • Keep some reference to the relationship with (control) flow, but try to generalize it beyond “entry” (e.g. edge_state, or state_at_edge), or

  • Abandon mentioning control flow in the name. (E. g. just state or analysis_state).

@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented Apr 22, 2020

@pnkfelix's concern is a valid one. As they hypothesize above, I found it hard to remember whether the "entry" of a block was in control-flow order (always before the first statement) or dataflow order (varies based on the direction of the analysis) while implementing this. I will point out that, once iterate_to_fixpoint is called, fixpoint does hold the dataflow state for each block at fixpoint. However, I see how using an abstract term to refer to a concrete field that is frequently mutating would cause confusion. I'm happy to change it.

I think that being very clear about the definition of "entry" within the dataflow framework is a valid approach. To me, this implies that we would refer to positions in control-flow order using "start" and "end" rather than "entry" and "exit". One of the reasons I chose "entry" and "exit" over "start" and "end" for control-flow order was somewhat silly: The term "block start" would frequently appear next to the BasicBlock index mir::START_BLOCK.

So re: the bikeshed, I would choose to use "block start" and "block end" while preserving the name "entry set" for the dataflow state. The other alternatives are also acceptable.

@bors
Copy link
Contributor

bors commented Apr 25, 2020

☔ The latest upstream changes (presumably #71539) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 1, 2020
@bors
Copy link
Contributor

bors commented May 3, 2020

⌛ Testing commit 512b1edc88ce854301fd928bf19bedd02d51eae8 with merge 997203d00bb9b07794a632d4e8e0976428e04a3c...

@bors
Copy link
Contributor

bors commented May 3, 2020

💔 Test failed - checks-azure

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 3, 2020
@ecstatic-morse
Copy link
Contributor Author

@bors r+

@bors
Copy link
Contributor

bors commented May 3, 2020

📌 Commit 21c72b6 has been approved by ecstatic-morse

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 3, 2020
@bors
Copy link
Contributor

bors commented May 3, 2020

⌛ Testing commit 21c72b6 with merge 65b4482...

@bors
Copy link
Contributor

bors commented May 3, 2020

☀️ Test successful - checks-azure
Approved by: ecstatic-morse
Pushing 65b4482 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 3, 2020
@bors bors merged commit 65b4482 into rust-lang:master May 3, 2020
@ecstatic-morse ecstatic-morse deleted the dataflow-bidi branch October 6, 2020 01:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants