Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

30% regression in compile time of tuple-stress benchmark #43828

Closed
alexcrichton opened this issue Aug 12, 2017 · 10 comments
Closed

30% regression in compile time of tuple-stress benchmark #43828

alexcrichton opened this issue Aug 12, 2017 · 10 comments
Assignees
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-compiletime Issue: Problems and improvements with respect to compile times. P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Milestone

Comments

@alexcrichton
Copy link
Member

alexcrichton commented Aug 12, 2017

According to perf.rust-lang.org graphs the tuple-stress benchmark has regressed 30% in compile time lately, pointing to #43554 as the culprit.

I've downloaded the nightly-2017-08-05 toolchain and the nightly-2017-08-06 toolchain where 08-06 has this PR and 08-05 doesn't. The notable differences in time-passes are:

pass after before
borrow checking 2.556 0.920
const checking 1.728 0.451
translation 0.798 0.682
expansion 0.168 0.097

cc @eddyb

@alexcrichton alexcrichton added regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 12, 2017
@jonas-schievink
Copy link
Contributor

The "after" column is faster than "before", did you swap them by accident?

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Aug 12, 2017

It's probably because the test case is almost entirely composed of float literals, and as of #43554 those literals are parsed twice (once with core::num::dec2flt, once with APFloat) as a sanity check.

@alexcrichton
Copy link
Member Author

@jonas-schievink oops yes

@eddyb
Copy link
Member

eddyb commented Aug 13, 2017

Only constant checking and MIR building should do the parsing though. Expansion, for example, couldn't possibly be affected - I suspect there's more than one factor hiding in here.

@eddyb
Copy link
Member

eddyb commented Aug 13, 2017

Another thing is that "const checking" is quadratic in the number of expression nodes (which are considered to be constant) that are part of the same subtree, as const-evaluation isn't cached (yet).
I could try doing that (and/or cache the results of parsing float literals).

@Mark-Simulacrum Mark-Simulacrum added the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label Aug 13, 2017
@alexcrichton alexcrichton added the I-compiletime Issue: Problems and improvements with respect to compile times. label Aug 13, 2017
@nikomatsakis
Copy link
Contributor

triage: P-high

Assigning to @eddyb to decide if there is something to be concerned about here or not. =)

@rust-highfive rust-highfive added the P-high High priority label Aug 17, 2017
@eddyb
Copy link
Member

eddyb commented Aug 18, 2017

On the graph "expansion" has a small random jump from 0.25 to 0.28 but then it goes back down to 0.25, so that's probably a false positive elsewhere too.

The culprits (adding up to most of the total change):

  • "const checking": 0.85 -> 3.48
  • "borrow checking" (actually MIR building): 1.75 -> 4.51

Sadly, my idea for caching const-evaluation wouldn't do much for MIR building, have to look further.

@alexcrichton alexcrichton modified the milestone: 1.21 Aug 23, 2017
bors added a commit that referenced this issue Aug 24, 2017
Speed up APFloat division by using short division for small divisors.

Fixes #43828 (hopefully), by not doing long division bit-by-bit for small divisors.

When parsing the ~200,000 decimal float literals in the `tuple-stress` benchmark, this change brings roughly a 5x speed increase (from `0.6s` to `0.12s`), and the hottest instructions are native `div`s.
@eddyb
Copy link
Member

eddyb commented Aug 25, 2017

The graph looks pretty good - still about half a second more, but the regression is much less worse.
@alexcrichton Are you satisfied with this outcome? Or do you want to reopen the issue?

@alexcrichton
Copy link
Member Author

Oh no I was personally satisfied with any outcome, I just figured it'd be good to track regressions as they came up!

Thanks for the fix @eddyb!

@shepmaster
Copy link
Member

IMO, if reopening the issue makes @eddyb perform another "5x speed increase" then we should trick them into doing such 😇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-compiletime Issue: Problems and improvements with respect to compile times. P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants