30% regression in compile time of tuple-stress benchmark #43828

alexcrichton · 2017-08-12T20:52:56Z

According to perf.rust-lang.org graphs the tuple-stress benchmark has regressed 30% in compile time lately, pointing to #43554 as the culprit.

I've downloaded the nightly-2017-08-05 toolchain and the nightly-2017-08-06 toolchain where 08-06 has this PR and 08-05 doesn't. The notable differences in time-passes are:

pass	after	before
borrow checking	2.556	0.920
const checking	1.728	0.451
translation	0.798	0.682
expansion	0.168	0.097

cc @eddyb

jonas-schievink · 2017-08-12T20:55:35Z

The "after" column is faster than "before", did you swap them by accident?

hanna-kruppe · 2017-08-12T21:03:18Z

It's probably because the test case is almost entirely composed of float literals, and as of #43554 those literals are parsed twice (once with core::num::dec2flt, once with APFloat) as a sanity check.

alexcrichton · 2017-08-12T22:00:05Z

@jonas-schievink oops yes

eddyb · 2017-08-13T02:32:33Z

Only constant checking and MIR building should do the parsing though. Expansion, for example, couldn't possibly be affected - I suspect there's more than one factor hiding in here.

eddyb · 2017-08-13T02:44:58Z

Another thing is that "const checking" is quadratic in the number of expression nodes (which are considered to be constant) that are part of the same subtree, as const-evaluation isn't cached (yet).
I could try doing that (and/or cache the results of parsing float literals).
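A minimal sketch of why uncached evaluation goes quadratic, and how a cache keyed by node index would help. This is not rustc's const checker: the `Node` arena, the `chain` builder, and the `work` counter are all hypothetical names made up for illustration. It models a chain of nested constant expressions where checking each node re-evaluates its entire subtree.

```rust
use std::collections::HashMap;

/// Hypothetical expression nodes stored in an arena.
/// `Neg` holds the arena index of its operand.
enum Node {
    Lit(f64),
    Neg(usize),
}

/// Build a chain of `n` nodes (n >= 1): one literal wrapped in `n - 1` negations.
fn chain(n: usize) -> Vec<Node> {
    let mut nodes = vec![Node::Lit(1.0)];
    for i in 1..n {
        nodes.push(Node::Neg(i - 1));
    }
    nodes
}

/// Evaluate node `id` from scratch, counting every node visit in `work`.
fn eval_uncached(nodes: &[Node], id: usize, work: &mut usize) -> f64 {
    *work += 1;
    match &nodes[id] {
        Node::Lit(v) => *v,
        Node::Neg(inner) => -eval_uncached(nodes, *inner, work),
    }
}

/// Same evaluation, but each node's result is computed once and then reused.
fn eval_cached(
    nodes: &[Node],
    id: usize,
    cache: &mut HashMap<usize, f64>,
    work: &mut usize,
) -> f64 {
    if let Some(&v) = cache.get(&id) {
        return v;
    }
    *work += 1;
    let v = match &nodes[id] {
        Node::Lit(v) => *v,
        Node::Neg(inner) => -eval_cached(nodes, *inner, cache, work),
    };
    cache.insert(id, v);
    v
}

/// "Check" every node by evaluating it; returns the total number of node visits.
fn check_uncached(nodes: &[Node]) -> usize {
    let mut work = 0;
    for id in 0..nodes.len() {
        eval_uncached(nodes, id, &mut work);
    }
    work
}

/// Same check with a shared cache; returns the total number of real evaluations.
fn check_cached(nodes: &[Node]) -> usize {
    let mut work = 0;
    let mut cache = HashMap::new();
    for id in 0..nodes.len() {
        eval_cached(nodes, id, &mut cache, &mut work);
    }
    work
}
```

For a chain of 100 nodes, `check_uncached` performs 100 * 101 / 2 = 5050 node visits, while `check_cached` performs 100. A cache of this kind only helps a pass that repeats evaluation over the same subtree.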

nikomatsakis · 2017-08-17T21:17:42Z

triage: P-high

Assigning to @eddyb to decide if there is something to be concerned about here or not. =)

eddyb · 2017-08-18T02:55:20Z

On the graph "expansion" has a small random jump from 0.25 to 0.28 but then it goes back down to 0.25, so that's probably a false positive elsewhere too.

The culprits (adding up to most of the total change):

"const checking": 0.85 -> 3.48
"borrow checking" (actually MIR building): 1.75 -> 4.51

Sadly, my idea for caching const-evaluation wouldn't do much for MIR building, have to look further.

Speed up APFloat division by using short division for small divisors. #44051

Fixes #43828 (hopefully), by not doing long division bit-by-bit for small divisors. When parsing the ~200,000 decimal float literals in the `tuple-stress` benchmark, this change brings roughly a 5x speed increase (from `0.6s` to `0.12s`), and the hottest instructions are native `div`s.
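For readers unfamiliar with the technique, here is a rough sketch of short division on a multi-limb integer. It is not the code from the PR: the function name `short_div_in_place`, the little-endian `u32` limb layout, and the `u64` intermediate are assumptions chosen for illustration. The idea is that when the divisor fits in a single limb, each limb can be divided with one native division while the remainder is carried downward, instead of shifting and subtracting one bit at a time.

```rust
/// Divide a little-endian multi-limb integer in place by a single-limb divisor
/// and return the remainder.
///
/// Hypothetical helper for illustration only; not rustc_apfloat's actual code.
fn short_div_in_place(limbs: &mut [u32], divisor: u32) -> u32 {
    assert!(divisor != 0, "division by zero");
    let d = u64::from(divisor);
    let mut rem: u64 = 0;
    // Walk from the most significant limb to the least significant one,
    // carrying the running remainder into the next limb.
    for limb in limbs.iter_mut().rev() {
        // `rem < d <= u32::MAX`, so this fits in a u64.
        let cur = (rem << 32) | u64::from(*limb);
        // Because `rem < d`, the quotient is below 2^32 and fits in one limb.
        *limb = (cur / d) as u32;
        rem = cur % d;
    }
    rem as u32
}
```

As a usage example, dividing the two-limb value `[0, 1]` (that is, 2^32) by 10 leaves `[429496729, 0]` in the limbs and returns a remainder of 6. The loop does one hardware division per limb, which fits the observation above that native `div`s end up as the hottest instructions.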

eddyb · 2017-08-25T07:24:14Z

The graph looks pretty good - still about half a second more, but the regression is much less severe.
@alexcrichton Are you satisfied with this outcome? Or do you want to reopen the issue?

alexcrichton · 2017-08-25T13:58:03Z

Oh no I was personally satisfied with any outcome, I just figured it'd be good to track regressions as they came up!

Thanks for the fix @eddyb!

shepmaster · 2017-08-25T14:20:03Z

IMO, if reopening the issue makes @eddyb perform another "5x speed increase" then we should trick them into doing such 😇

alexcrichton added the regression-from-stable-to-nightly and T-compiler labels Aug 12, 2017

alexcrichton mentioned this issue Aug 12, 2017

APFloat: Rewrite It In Rust and use it for deterministic floating-point CTFE. #43554

Merged

Mark-Simulacrum added the C-tracking-issue label Aug 13, 2017

alexcrichton added the I-compiletime label Aug 13, 2017

rust-highfive added the P-high label Aug 17, 2017

nikomatsakis assigned eddyb Aug 17, 2017

eddyb mentioned this issue Aug 22, 2017

Speed up APFloat division by using short division for small divisors. #44051

Merged

alexcrichton modified the milestone: 1.21 Aug 23, 2017

bors closed this as completed in #44051 Aug 25, 2017
