Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long compilation time due to LLVM creating excessive landing pads #41696

Closed
not-fl3 opened this issue May 2, 2017 · 32 comments
Closed

Long compilation time due to LLVM creating excessive landing pads #41696

not-fl3 opened this issue May 2, 2017 · 32 comments
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-compiletime Issue: Problems and improvements with respect to compile times. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@not-fl3
Copy link

not-fl3 commented May 2, 2017

Long compilation time (almost infinity on real project) with every nightly versions after 2017-03-18.

I tried this:

git clone https://github.com/not-fl3/incremental_bug
CARGO_INCREMENTAL=1 cargo build --release

I expected to see this happen:
On 2017-03-18 it compiles for less than a second (in 0.59 secs on my machine)

Instead, this happened:

With every version newer than 2017-03-18 it compiles for 389+ seconds on my machine.

Meta

rustc --version --verbose:

rustc 1.19.0-nightly (06fb4d256 2017-04-30)
binary: rustc
commit-hash: 06fb4d25642a3f223db1441972dd5962085cfba1
commit-date: 2017-04-30
host: x86_64-unknown-linux-gnu
release: 1.19.0-nightly
LLVM version: 4.0

Also same behaviour on this versions:
nightly-2017-03-28-x86_64-unknown-linux-gnu
nightly-2017-04-22-x86_64-unknown-linux-gnu
nightly-2017-04-30-x86_64-unknown-linux-gnu

Backtrace:

log of -Z time-asses: https://gist.github.com/not-fl3/8c4ee411fe6c23d4b4abb9a3b6a263d0

@kennytm
Copy link
Member

kennytm commented May 2, 2017

Not reproducible for me on macOS.
$ time RUST_INCREMENTAL=1 cargo build --release
   Compiling incremental_bug v0.0.1 (file:///Users/kennytm/Downloads/incremental_bug)
warning: field is never used: `name`
 --> src/main.rs:9:5
  |
9 |     name : String,
  |     ^^^^^^^^^^^^^
  |
  = note: #[warn(dead_code)] on by default

warning: field is never used: `next`
  --> src/main.rs:11:5
   |
11 |     next : U
   |     ^^^^^^^^
   |
   = note: #[warn(dead_code)] on by default

    Finished release [optimized] target(s) in 1.12 secs

real	0m1.445s
user	0m0.417s
sys	0m0.297s

$ rustc -vV
rustc 1.19.0-nightly (777ee2079 2017-05-01)
binary: rustc
commit-hash: 777ee20796e80a31d4b7c985dd68eda2941460d6
commit-date: 2017-05-01
host: x86_64-apple-darwin
release: 1.19.0-nightly
LLVM version: 4.0

Are you sure nightly-2017-03-18 is fast while nightly-2017-03-19 is slow?

@not-fl3
Copy link
Author

not-fl3 commented May 2, 2017

@kennytm versions I tried: 2017-03-28, 2017-04-22 and 2017-04-30.

$ cargo build --release
   Compiling incremental_bug v0.0.1 (file:///home/fl3/fun/outrage_test/incremental_bug)
warning: field is never used: `name`
 --> src/main.rs:9:5
  |
9 |     name : String,
  |     ^^^^^^^^^^^^^
  |
  = note: #[warn(dead_code)] on by default

warning: field is never used: `next`
  --> src/main.rs:11:5
   |
11 |     next : U
   |     ^^^^^^^^
   |
   = note: #[warn(dead_code)] on by default

    Finished release [optimized] target(s) in 443.30 secs
rustc -vV
rustc 1.19.0-nightly (06fb4d256 2017-04-30)
binary: rustc
commit-hash: 06fb4d25642a3f223db1441972dd5962085cfba1
commit-date: 2017-04-30
host: x86_64-unknown-linux-gnu
release: 1.19.0-nightly
LLVM version: 4.0

Also tried on different linux boxes - same problem.

@kennytm
Copy link
Member

kennytm commented May 2, 2017

@not-fl3 Could you specifically test with 2017-03-18 and 2017-03-19? You may install it from rustup using rustup default nightly-2017-03-18. Just wanna confirm if it is really 2017-03-19 causing the problem for you.

No repro on my Debian VM either.
$ cargo clean

$ time RUST_INCREMENTAL=1 cargo build --release
   Compiling incremental_bug v0.0.1 (file:///root/incremental_bug)
warning: field is never used: `name`
 --> src/main.rs:9:5
  |
9 |     name : String,
  |     ^^^^^^^^^^^^^
  |
  = note: #[warn(dead_code)] on by default

warning: field is never used: `next`
  --> src/main.rs:11:5
   |
11 |     next : U,
   |     ^^^^^^^^
   |
   = note: #[warn(dead_code)] on by default

    Finished release [optimized] target(s) in 0.51 secs

real	0m0.613s
user	0m0.532s
sys	0m0.052s

$ rustc -vV
rustc 1.19.0-nightly (06fb4d256 2017-04-30)
binary: rustc
commit-hash: 06fb4d25642a3f223db1441972dd5962085cfba1
commit-date: 2017-04-30
host: x86_64-unknown-linux-gnu
release: 1.19.0-nightly
LLVM version: 4.0

@not-fl3
Copy link
Author

not-fl3 commented May 2, 2017

@kennytm just tried continuously:

rustup override set nightly-2017-03-19
rustup override set nightly-2017-03-20
rustup override set nightly-2017-03-21

And first broken version is 2017-03-21.

rustc 1.17.0-nightly (134c4a0f0 2017-03-20)
binary: rustc
commit-hash: 134c4a0f08a3d1f55ea8968fbe728fa935c71698
commit-date: 2017-03-20
host: x86_64-unknown-linux-gnu
release: 1.17.0-nightly
LLVM version: 3.9

@not-fl3
Copy link
Author

not-fl3 commented May 2, 2017

@kennytm hmm it seems that on that 03-21 it hangs forever. It works for 20+ minutes and keep working slowly increasing consumed ram.

@kennytm
Copy link
Member

kennytm commented May 2, 2017

The commit for 2017-03-21 is 1.17.0-nightly (134c4a0 2017-03-20)
The commit for 2017-03-20 is 1.17.0-nightly (6eb9960 2017-03-19)

PRs between the two are

$ git log --oneline --merges 6eb9960d3...134c4a0f0
134c4a0f08 Auto merge of #39628 - arielb1:shimmir, r=eddyb
244f893ed7 Auto merge of #40659 - frewsxcv:rollup, r=frewsxcv
6738cd4d47 Auto merge of #40281 - jimmycuadra:try-from-from-str, r=aturon
d49f86901c Rollup merge of #40587 - GuillaumeGomez:rustdoc-const-display, r=frewsxcv
de724ba9a6 Rollup merge of #40581 - TimNN:di-global-40, r=alexcrichton
4e9033124b Rollup merge of #40566 - clarcharr:never_error, r=sfackler
dedf9d3593 Rollup merge of #40554 - nrc:rls-data, r=alexcrichton
7471d9793c Rollup merge of #40532 - jseyfried:improve_tokenstream_quoter, r=nrc
1d1543d566 Rollup merge of #40521 - TimNN:panic-free-shift, r=alexcrichton
6480a00222 Rollup merge of #40398 - eddyb:struct-hint, r=nikomatsakis
573e906594 Rollup merge of #40281 - jimmycuadra:try-from-from-str, r=aturon
e5221f9397 Rollup merge of #40241 - Sawyer47:fix-39997, r=alexcrichton

@nagisa
Copy link
Member

nagisa commented May 2, 2017

Could only reproduce with

cargo +nightly-2017-03-21 rustc --release -- -Zincremental=.

The cause, is of course, the incremental compilation.

@nagisa nagisa added the A-incr-comp Area: Incremental compilation label May 2, 2017
@not-fl3
Copy link
Author

not-fl3 commented May 2, 2017

@kennytm
except RUSTFLAGS typed in this gist I had CARGO_INCREMENTAL=1 in my env.

https://gist.github.com/not-fl3/832b85ec9ff08537e479bb42ed4e2402

@not-fl3
Copy link
Author

not-fl3 commented May 2, 2017

@kennytm checked your log from debian and found that in original post I used RUST_INCREMENTAL instead of CARGO_INCREMENTAL=1.
My bad, fixed original post now.

@kennytm
Copy link
Member

kennytm commented May 2, 2017

Reduced test case:

// rustc -Copt-level=2 -Zincremental=. main.rs

struct DS<U> {
    name: String,
    next: U,
}

fn add<U>(ds: DS<U>, name: String) -> DS<DS<U>> {
    DS {
        name: "?".to_owned(),
        next: ds,
    }
}

fn main() {
    let deserializers = DS { name: "?".to_owned(), next: () };
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned()); // 0.7s
    let deserializers = add(deserializers, "?".to_owned()); // 1.3s
    let deserializers = add(deserializers, "?".to_owned()); // 2.4s
    let deserializers = add(deserializers, "?".to_owned()); // 6.7s
    // let deserializers = add(deserializers, "?".to_owned()); // 26.0s
    // let deserializers = add(deserializers, "?".to_owned()); // 114.0s
}

Looks like some exponential blow-up.


In the 26s case, most time (51%) is spent on the RegisterCoalescer pass.

@kgv
Copy link

kgv commented May 2, 2017

Looks like the same problem:

fn main() {
        let t: std::result::Result<(), ()> = Ok(());
        let f = t
            .into_future()
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(())) // 0.69 secs
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(())) // 1.94 secs
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(()))
            .and_then(|_| future::ok(())) // 20.68 secs
            .and_then(|_| future::ok(())) // 40.58 secs
            .and_then(|_| future::ok(())) // 80.70 secs
            .and_then(|_| future::ok(())) // 160.59 secs
            .and_then(|_| future::ok(())) // 321.88 secs
            .and_then(|_| future::ok(())); // 642.42 secs 
        f.wait();
}

run: cargo run --bin=test_bin
(future - from https://crates.io/crates/futures)

rustc -vV
rustc 1.18.0-nightly (94e884b63 2017-04-27)
binary: rustc
commit-hash: 94e884b6307a59f1e6e64aa7ebc1996b651a7629
commit-date: 2017-04-27
host: x86_64-pc-windows-msvc
release: 1.18.0-nightly
LLVM version: 4.0

@kennytm
Copy link
Member

kennytm commented May 2, 2017

@kgv Not really the same issue. OP's code is slow to compile due to an LLVM pass, your code is slow to compile due to MIR optimization (specifically the "translation item collection" pass).

@arielb1 arielb1 added regression-from-stable-to-beta Performance or correctness regression from stable to beta. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 2, 2017
@arielb1
Copy link
Contributor

arielb1 commented May 2, 2017

Adding -Z no-landing-pads makes things fast again. It looks like the problem is that LLVM somehow manages to cause an exponential # of landing pads while inlining.

@brson brson added the I-slow Issue: Problems and improvements with respect to performance of generated code. label May 4, 2017
@brson
Copy link
Contributor

brson commented May 4, 2017

cc @michaelwoerister

@brson brson added the P-medium Medium priority label May 4, 2017
@nikomatsakis
Copy link
Contributor

Presumably this problem could also be triggered by using a larger number of codegen-units (instead of incremental), right?

@nikomatsakis nikomatsakis changed the title Long compilation time with weird types on fresh nightlies Long compilation time due to LLVM creating excessive landing pads May 4, 2017
@michaelwoerister
Copy link
Member

I don't quite understand how incremental compilation or the number of codegen units could interact with landing pad generation.

@arielb1
Copy link
Contributor

arielb1 commented May 4, 2017

Yeah

@arielb1
Copy link
Contributor

arielb1 commented May 4, 2017

No codegen-units needed:

// rustc -Copt-level=2 main.rs

extern "C" {
    static MAYBE: bool;
}

struct MayUnwind;

impl Drop for MayUnwind {
    fn drop(&mut self) {
        if unsafe { MAYBE } {
            panic!()
        }
    }
}

struct DS<U> {
    may_unwind: MayUnwind,
    name: String,
    next: U,
}

fn add<U>(ds: DS<U>, name: String) -> DS<DS<U>> {
    DS {
        may_unwind: MayUnwind,
        name: "?".to_owned(),
        next: ds,
    }
}

fn main() {
    let deserializers = DS { may_unwind: MayUnwind, name: "?".to_owned(), next: () };
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned());
    let deserializers = add(deserializers, "?".to_owned()); // 0.7s
    let deserializers = add(deserializers, "?".to_owned()); // 1.3s
    let deserializers = add(deserializers, "?".to_owned()); // 2.4s
    let deserializers = add(deserializers, "?".to_owned()); // 6.7s
    // let deserializers = add(deserializers, "?".to_owned()); // 26.0s
    // let deserializers = add(deserializers, "?".to_owned()); // 114.0s
}

arielb1 added a commit to arielb1/rust that referenced this issue May 11, 2017
Apparently LLVM has exponential code growth while inlining landing pads
if that attribute is present.

Fixes rust-lang#41696.
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue May 12, 2017
remove the #[inline] attribute from drop_in_place

Apparently LLVM has exponential code growth while inlining landing pads
if that attribute is present.

Fixes rust-lang#41696.

beta-nominating  because regression.
r? @eddyb
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue May 12, 2017
remove the #[inline] attribute from drop_in_place

Apparently LLVM has exponential code growth while inlining landing pads
if that attribute is present.

Fixes rust-lang#41696.

beta-nominating  because regression.
r? @eddyb
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue May 12, 2017
remove the #[inline] attribute from drop_in_place

Apparently LLVM has exponential code growth while inlining landing pads
if that attribute is present.

Fixes rust-lang#41696.

beta-nominating  because regression.
r? @eddyb
@brson brson added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. and removed regression-from-stable-to-beta Performance or correctness regression from stable to beta. labels Jun 15, 2017
jonhoo added a commit to jonhoo/fantoccini that referenced this issue Jun 20, 2017
Compilation of tests is *really* slow, most likely due to
rust-lang/rust#41696
or
rust-lang/rust#38528
@jonhoo
Copy link
Contributor

jonhoo commented Jun 20, 2017

@brson I don't think this is resolved. Take jonhoo/fantoccini@79e1683 for example. On current nightly (2017-06-18), cargo build is nearly instantaneous, whereas cargo test --lib --no-run takes ~720s on my laptop, presumably due to the sequence of and_then's at https://github.com/jonhoo/fantoccini/blob/79e16830ee7dc502a7ba3c3fde0172825ddbc9ba/src/lib.rs#L1317.

arielb1 added a commit to arielb1/rust that referenced this issue Jun 20, 2017
The unwind path is always cold, so that should not have bad performance
implications.  This avoids catastrophic exponential inlining, and also
decreases the size of librustc.so by 1.5% (OTOH, the size of `libstd.so`
increased by 0.5% for some reason).

Fixes rust-lang#41696.
bors added a commit that referenced this issue Jun 21, 2017
mark calls in the unwind path as !noinline

The unwind path is always cold, so that should not have bad performance
implications.  This avoids catastrophic exponential inlining, and also
decreases the size of librustc.so by 1.5% (OTOH, the size of `libstd.so`
increased by 0.5% for some reason).

Fixes #41696.

r? @nagisa
@jonhoo
Copy link
Contributor

jonhoo commented Jun 23, 2017

@arielb1 #42771 still doesn't seem to solve this for me. cargo test --lib --no-run on jonhoo/fantoccini@0802a57 for example still takes about ten minutes to compile on the latest nightly (which includes #42771). This is presumably because of the many and_then calls in the three tests at the bottom...

EDIT: in fact, compile time seems to have gone up if anything. Used to be ~750s, now 925s...

@bstrie
Copy link
Contributor

bstrie commented Jun 23, 2017

Going to tentatively reopen for now based on @jonhoo's report.

@bstrie bstrie reopened this Jun 23, 2017
@arielb1
Copy link
Contributor

arielb1 commented Jun 23, 2017

@jonhoo

Can you give me the output of compiling your program with -Z time-passes?

@jonhoo
Copy link
Contributor

jonhoo commented Jun 24, 2017

time: 0.007; rss: 55MB  parsing
time: 0.000; rss: 55MB  recursion limit
time: 0.000; rss: 55MB  crate injection
time: 0.000; rss: 55MB  plugin loading
time: 0.000; rss: 55MB  plugin registration
time: 0.099; rss: 112MB expansion
time: 0.003; rss: 112MB maybe building test harness
time: 0.000; rss: 112MB maybe creating a macro crate
time: 0.000; rss: 112MB checking for inline asm in case the target doesn't support it
time: 0.001; rss: 112MB early lint checks
time: 0.000; rss: 112MB AST validation
time: 0.011; rss: 120MB name resolution
time: 0.002; rss: 120MB complete gated feature checking
time: 0.004; rss: 122MB lowering ast -> hir
time: 0.001; rss: 122MB indexing hir
time: 0.000; rss: 122MB attribute checking
time: 0.000; rss: 122MB language item collection
time: 0.000; rss: 122MB lifetime resolution
time: 0.000; rss: 122MB looking for entry point
time: 0.000; rss: 122MB looking for plugin registrar
time: 0.000; rss: 122MB loop checking
time: 0.000; rss: 122MB static item recursion checking
time: 0.005; rss: 122MB compute_incremental_hashes_map
time: 0.000; rss: 122MB load_dep_graph
time: 0.000; rss: 122MB stability index
time: 0.001; rss: 122MB stability checking
time: 0.009; rss: 132MB type collecting
time: 0.000; rss: 132MB impl wf inference
time: 0.031; rss: 154MB coherence checking
time: 0.000; rss: 154MB variance testing
time: 0.006; rss: 154MB wf checking
time: 0.006; rss: 158MB item-types checking
time: 0.205; rss: 164MB item-bodies checking
time: 0.021; rss: 168MB const checking
time: 0.001; rss: 168MB privacy checking
time: 0.000; rss: 168MB intrinsic checking
time: 0.000; rss: 168MB effect checking
time: 0.003; rss: 168MB match checking
time: 0.001; rss: 168MB liveness checking
time: 0.070; rss: 174MB borrow checking
time: 0.000; rss: 174MB reachability checking
time: 0.001; rss: 174MB death checking
time: 0.000; rss: 174MB unused lib feature checking
time: 0.014; rss: 174MB lint checking
time: 0.000; rss: 174MB resolving dependency formats
  time: 0.001; rss: 174MB       write metadata
  time: 495.029; rss: 206MB     translation item collection
  time: 0.012; rss: 214MB       codegen unit partitioning
  time: 0.007; rss: 337MB       internalize symbols
time: 807.783; rss: 337MB       translation
time: 0.000; rss: 337MB assert dep graph
time: 0.000; rss: 337MB serialize dep graph
  time: 0.002; rss: 324MB       codegen passes [0]
  time: 0.205; rss: 326MB       llvm function passes [1]
  time: 0.129; rss: 330MB       llvm module passes [1]
  time: 4.515; rss: 336MB       codegen passes [1]
time: 4.852; rss: 332MB LLVM passes
time: 0.000; rss: 332MB serialize work products
  time: 0.682; rss: 245MB       running linker
time: 0.685; rss: 245MB linking

@jonhoo
Copy link
Contributor

jonhoo commented Jun 24, 2017

  time: 0.001; rss: 174MB       write metadata
  time: 495.029; rss: 206MB     translation item collection
  time: 0.012; rss: 214MB       codegen unit partitioning
  time: 0.007; rss: 337MB       internalize symbols
time: 807.783; rss: 337MB       translation

is pretty curious. the extra time is spent between when codegen unit partitioning is printed and when internalize symbols is printed, but not sure if that helps?

@jonhoo
Copy link
Contributor

jonhoo commented Jun 27, 2017

@arielb1: just checking that you saw my follow-up?

@arielb1
Copy link
Contributor

arielb1 commented Jun 27, 2017

I was a bit busy lately, but:

time: 807.783; rss: 337MB       translation
time: 4.852; rss: 332MB LLVM passes

This bug causes excessive time in LLVM passes, not in translation, so I'm pretty sure it's a different bug. Please open a separate issue.

@arielb1 arielb1 closed this as completed Jun 27, 2017
phrohdoh pushed a commit to phrohdoh/fantoccini that referenced this issue May 11, 2018
Compilation of tests is *really* slow, most likely due to
rust-lang/rust#41696
or
rust-lang/rust#38528
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 1, 2022
Mark drop calls in landing pads `cold` instead of `noinline`

Now that deferred inlining has been disabled in LLVM (rust-lang#92110), this shouldn't cause catastrophic size blowup.

I confirmed that the test cases from rust-lang#41696 (comment) still compile quickly (<1s) after this change. ~Although note that I wasn't able to reproduce the original issue using a recent rustc/llvm with deferred inlining enabled, so those tests may no longer be representative. I was also unable to create a modified test case that reproduced the original issue.~ (edit: I reproduced it on CI by accident--the first commit timed out on the LLVM 12 builder, because I forgot to make it conditional on LLVM version)

r? `@nagisa`
cc `@arielb1` (this effectively reverts rust-lang#42771 "mark calls in the unwind path as !noinline")
cc `@RalfJung` (fixes rust-lang#46515)

edit: also fixes rust-lang#87055
bjorn3 pushed a commit to bjorn3/rustc_codegen_gcc that referenced this issue Jan 30, 2022
Mark drop calls in landing pads `cold` instead of `noinline`

Now that deferred inlining has been disabled in LLVM (#92110), this shouldn't cause catastrophic size blowup.

I confirmed that the test cases from rust-lang/rust#41696 (comment) still compile quickly (<1s) after this change. ~Although note that I wasn't able to reproduce the original issue using a recent rustc/llvm with deferred inlining enabled, so those tests may no longer be representative. I was also unable to create a modified test case that reproduced the original issue.~ (edit: I reproduced it on CI by accident--the first commit timed out on the LLVM 12 builder, because I forgot to make it conditional on LLVM version)

r? `@nagisa`
cc `@arielb1` (this effectively reverts #42771 "mark calls in the unwind path as !noinline")
cc `@RalfJung` (fixes #46515)

edit: also fixes #87055
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-compiletime Issue: Problems and improvements with respect to compile times. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests