Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typechecking pass takes an extreme amount of time on trivial programs #18208

Closed
bstrie opened this issue Oct 21, 2014 · 9 comments
Closed

Typechecking pass takes an extreme amount of time on trivial programs #18208

bstrie opened this issue Oct 21, 2014 · 9 comments
Assignees
Labels
I-compiletime Issue: Problems and improvements with respect to compile times.

Comments

@bstrie
Copy link
Contributor

bstrie commented Oct 21, 2014

fn main() {
    println!("Hello, world");
}

Take this simple program and modify it by adding more consecutive println!("Hello world"); lines. Compile it with -Z time-passes and observe how long typechecking takes. I've compiled a table below.

Calls to println! Typechecking time (seconds)
1 0.001
2 0.001
4 0.003
8 0.003
16 0.007
32 0.014
64 0.048
128 0.119
256 0.386
512 1.316
1024 4.893
2048 19.227
4096 82.713
8192 Sits at typechecking for minutes, then fails with "Illegal instruction"

Here's the full output of time-passes for the 4096 case, to see the enormous disparity between typechecking and the other passes (though liveness is another offender):

time: 0.022 s   parsing
time: 0.002 s   gated feature checking
time: 0.000 s   crate injection
time: 0.003 s   configuration 1
time: 0.003 s   plugin loading
time: 0.000 s   plugin registration
time: 0.474 s   expansion
time: 0.036 s   configuration 2
time: 0.036 s   maybe building test harness
time: 0.032 s   prelude injection
time: 0.005 s   checking that all macro invocations are gone
time: 0.045 s   assigning node ids and indexing ast
time: 0.005 s   external crate/lib resolution
time: 0.008 s   language item collection
time: 0.120 s   resolution
time: 0.009 s   lifetime resolution
time: 0.006 s   looking for entry point
time: 0.004 s   looking for plugin registrar
time: 0.030 s   region resolution
time: 0.004 s   loop checking
time: 0.004 s   stability index
time: 0.004 s   static item recursion checking
time: 0.024 s   type collecting
time: 0.009 s   variance inference
time: 0.080 s   coherence checking
time: 82.713 s  type checking
time: 0.057 s   check static items
time: 0.065 s   const marking
time: 0.005 s   const checking
time: 0.025 s   privacy checking
time: 0.020 s   intrinsic checking
time: 0.015 s   effect checking
time: 0.226 s   match checking
time: 55.143 s  liveness checking
time: 2.312 s   borrow checking
time: 0.104 s   rvalue checking
time: 0.002 s   reachability checking
time: 0.115 s   death checking
time: 0.190 s   lint checking
time: 0.001 s   resolving dependency formats
time: 1.998 s   translation
  time: 0.108 s llvm function passes
  time: 0.025 s llvm module passes
  time: 0.902 s codegen passes
  time: 0.002 s codegen passes
time: 1.179 s   LLVM passes
  time: 0.359 s running linker
time: 0.363 s   linking

This seems to contradict our usual excuse for compiler slowness of "it's not our fault, it's LLVM".

@ghost ghost added I-slow Issue: Problems and improvements with respect to performance of generated code. I-compiletime Issue: Problems and improvements with respect to compile times. and removed I-slow Issue: Problems and improvements with respect to performance of generated code. labels Oct 21, 2014
@bstrie
Copy link
Contributor Author

bstrie commented Oct 21, 2014

09:57 < doener> bstrie: http://is.gd/kckwQ7 -- type checking is O(n^2) with n
                being the number of "let _a = foo(5);" statements in main
10:03 < SiegeLord> Aww, putting type annotations doesn't make it any faster
10:03 < doener> RUST_LOG=rustc::middle::type suggests that it looking for
                Sized, a lot
10:03 < doener> s/it looking/it is looking/
10:04 < jakub--> 0.11 is much faster
10:04 < doener> and that had no dst
10:04 < doener> the plot thickens ;-)
10:05 < jakub--> Could be the new inference algorithm, or many things
10:07 < jakub--> We added some caching to type contents recently after the last
                 regression
10:07 < doener> removing the "let _a = " part makes it fast, too (and the Sized
                obligations are gone from the log)
10:07 < jakub--> so I’m surprised the Sized lookups are still showing up

@dotdash
Copy link
Contributor

dotdash commented Oct 21, 2014

16:37:12 < doener> 1000999 DEBUG:rustc::middle::traits::select: CACHE HIT: cache_skol_trait_ref=<SkolemizedTy(0) as core::kinds::Sized>, candidate=None
16:37:42 < doener> 1000000 cache lookups, with 1000 "let _a: Helper = foo(5i32)" lines

@dotdash
Copy link
Contributor

dotdash commented Oct 21, 2014

This might help, too. The number in front of the lines is how often the line appeared in the output.

   1985 DEBUG:rustc::middle::traits::select: select(Obligation(trait_ref=<<generic #991> as core::kinds::Sized>,depth=0))
   1986 DEBUG:rustc::middle::traits::select: candidate_from_obligation(cache_skol_trait_ref=<SkolemizedTy(0) as core::kinds::Sized>, obligation=ObligationStack(Obligation(trait_ref=<<generic #992> as core::kinds::Sized>,depth=0)))
   1987 DEBUG:rustc::middle::traits::select: select(Obligation(trait_ref=<<generic #992> as core::kinds::Sized>,depth=0))
   1988 DEBUG:rustc::middle::traits::select: candidate_from_obligation(cache_skol_trait_ref=<SkolemizedTy(0) as core::kinds::Sized>, obligation=ObligationStack(Obligation(trait_ref=<<generic #993> as core::kinds::Sized>,depth=0)))
   1989 DEBUG:rustc::middle::traits::select: select(Obligation(trait_ref=<<generic #993> as core::kinds::Sized>,depth=0))
   1990 DEBUG:rustc::middle::traits::select: candidate_from_obligation(cache_skol_trait_ref=<SkolemizedTy(0) as core::kinds::Sized>, obligation=ObligationStack(Obligation(trait_ref=<<generic #994> as core::kinds::Sized>,depth=0)))
   1991 DEBUG:rustc::middle::traits::select: select(Obligation(trait_ref=<<generic #994> as core::kinds::Sized>,depth=0))
   1992 DEBUG:rustc::middle::traits::select: candidate_from_obligation(cache_skol_trait_ref=<SkolemizedTy(0) as core::kinds::Sized>, obligation=ObligationStack(Obligation(trait_ref=<<generic #995> as core::kinds::Sized>,depth=0)))
   1993 DEBUG:rustc::middle::traits::select: select(Obligation(trait_ref=<<generic #995> as core::kinds::Sized>,depth=0))
   1994 DEBUG:rustc::middle::traits::select: candidate_from_obligation(cache_skol_trait_ref=<SkolemizedTy(0) as core::kinds::Sized>, obligation=ObligationStack(Obligation(trait_ref=<<generic #996> as core::kinds::Sized>,depth=0)))

@dotdash
Copy link
Contributor

dotdash commented Oct 21, 2014

This seems to be a regression caused by #17197 interacting badly with the Sized builtin bound.

Specifically, I think that this change causes O(n^2) behavior here: https://github.com/rust-lang/rust/pull/17197/files#diff-31c6a728eb925b9f6b0124e93948d270L2372

Given this:

fn some() -> int {
    0
}

fn main() {
    let _a = some();
    let _a = some();
    let _a = some();
}

Ignoring the final () value, We have three Sized obligations, one for each _a pattern. Before the above change happened, the code only tried to resolve the callee_expr, but now we check all remaining obligations for the whole function when trying to check the argument types for the call. Since the _a pattern after the current call still have generic types, their obligations can't be resolved yet. So for the first call we check 3 obligations and resolve 1. For the next call, we check 2 and resolve another one, and so on, resulting in O(n^2) runtime.

Unfortunately I lack the knowledge to properly fix this.

cc @nikomatsakis

@alexcrichton
Copy link
Member

Over the past few days we've seen a 1hr regression in cycle time for the *-nopt-c builders which was tracked down to #18121 (just another data point which looks related to this)

@brson
Copy link
Contributor

brson commented Oct 28, 2014

Nominating because this looks bad and to make sure it's not a limitation of the type checking algo.

@nikomatsakis nikomatsakis self-assigned this Oct 28, 2014
@nikomatsakis
Copy link
Contributor

I'll take a look.

@nikomatsakis
Copy link
Contributor

(Incidentally, I suspect the issues with #18121 are orthogonal to the N^2 issue. I was concerned about the latter but it didn't seem to be causing trouble in practice -- it should be easy enough, though, to hold off on "reattempting" to resolve pending obligations. And it may also be worth special-casing the Sized bound checks, since they in particular are so frequent.)

nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Oct 28, 2014
… only when we are about to report an error (rust-lang#18208). I found it is still important to consider the full set in order to make tests like `let x: Vec<_> = obligations.iter().collect()` work.
bors added a commit that referenced this issue Oct 28, 2014
Avoid O(n^2) performance by reconsidering the full set of obligations only when we are about to report an error (#18208). I found it is still important to consider the full set in order to make tests like `let x: Vec<_> = obligations.iter().collect()` work.

I think we lack the infrastructure to write a regression test for this, but when I did manual testing I found a massive reduction in type-checking time for extreme examples like those found in #18208 vs stage0.

f? @dotdash
nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Nov 6, 2014
@pnkfelix
Copy link
Member

pnkfelix commented Nov 6, 2014

this specific issue has been addressed. @nikomatsakis may open up another issue to track related work on improving compile-time for e.g. method resolution, but that need not block closing this issue. Therefore, closing.

@pnkfelix pnkfelix closed this as completed Nov 6, 2014
nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Nov 7, 2014
bors added a commit that referenced this issue Nov 8, 2014
…2, r=nrc

This is a pretty major refactoring of the method dispatch infrastructure. It is intended to avoid gross inefficiencies and enable caching and other optimizations (e.g. #17995), though it itself doesn't seem to execute particularly faster yet. It also solves some cases where we were failing to resolve methods that we theoretically should have succeeded with.

Fixes #18674.

cc #18208
nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Nov 12, 2014
nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Nov 17, 2014
bors added a commit that referenced this issue Nov 17, 2014
…2, r=nrc

This is a pretty major refactoring of the method dispatch infrastructure. It is intended to avoid gross inefficiencies and enable caching and other optimizations (e.g. #17995), though it itself doesn't seem to execute particularly faster yet. It also solves some cases where we were failing to resolve methods that we theoretically should have succeeded with.

Fixes #18674.

cc #18208
nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Dec 23, 2015
stalled rather than keeping this annoying mark; I checked that the
original compile-time regression that the mark was intended to
fix (rust-lang#18208) was still
reasonable, but I've not done exhaustive measurements to see how
important this "optimization" really is anymore
jroesch pushed a commit to jroesch/rust that referenced this issue Jan 4, 2016
stalled rather than keeping this annoying mark; I checked that the
original compile-time regression that the mark was intended to
fix (rust-lang#18208) was still
reasonable, but I've not done exhaustive measurements to see how
important this "optimization" really is anymore
nikomatsakis added a commit to nikomatsakis/rust that referenced this issue Jan 14, 2016
stalled rather than keeping this annoying mark; I checked that the
original compile-time regression that the mark was intended to
fix (rust-lang#18208) was still
reasonable, but I've not done exhaustive measurements to see how
important this "optimization" really is anymore
lnicola pushed a commit to lnicola/rust that referenced this issue Oct 8, 2024
…ukqt, r=lnicola

internal: allow overriding proc macro server in analysis-stats

Needed this argument in order to profile the proc macro expansion server (c.f., [this Zulip conversation](https://rust-lang.zulipchat.com/#narrow/stream/185405-t-compiler.2Frust-analyzer/topic/.60macro_rules!.60-based.20macros.20for.20derives.2Fattributes/near/473466794)). I also took the opportunity to change the phrasing for `--proc-macro-srv`.

Ran with `samply record ~/.cargo/bin/rust-analyzer analysis-stats --proc-macro-srv /Users/dbarsky/.cargo/bin/rust-analyzer-proc-macro-srv --parallel .` on rust-analyzer itself.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-compiletime Issue: Problems and improvements with respect to compile times.
Projects
None yet
Development

No branches or pull requests

6 participants