-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase Span
from 4 bytes to 8 bytes.
#59693
Conversation
I was pleasantly surprised how large the improvements are. Some instruction counts results for Check builds:
Here is part of a Cachegrind diff showing the changes for a Check-CleanIncr build of
Mostly TLS stuff and interning stuff, which makes sense. |
@bors try |
Increase `Span` from 4 bytes to 8 bytes. This increases the size of some important types, such as `ast::Expr` and `mir::Statement`. However, it drastically reduces how much the interner is used, and the fields are more natural sizes that don't require bit operations to extract. As a result, instruction counts drop across a range of workloads, by as much as 10% for `script-servo` incremental builds. Peak memory usage goes up a little for some cases, but down by more for some other cases -- as much as 18% for non-incremental builds of `packed-simd`. The commit also: - removes the `repr(packed)`, because it has negligible effect, but can cause undefined behaviour; - replaces explicit impls of common traits (`Copy`, `PartialEq`, etc.) with derived ones. r? @petrochenkov
It's interesting that TLS is so expensive. Makes me wonder if other TLS data (e.g. symbols) could be done some other way. |
Actually, I don't understand |
Previous discussion for reference - #58458. |
That's entirely @Zoxc's creation, I don't understand our threading setup. |
@nnethercote |
Is the interned case uncommon enough to throw |
☀️ Try build successful - checks-travis |
Yes! It's very rare for len to hit 16 bits in any code. But ctxt depends on the crate size, so in very large crates a decent number of ctxts could require 16 bits. In fact, ideally I'd probably use something like 11 bits for len and 20 for ctxt, but those sizes aren't as natural. I'll add a comment about this. |
It's fine by me, though I thought they were generally frowned upon in Rust code... |
@rust-timer build cfa7765 |
Success: Queued cfa7765 with parent f717b58, comparison URL. |
That's what I expected, ctxt behaves in a similar way to base and can just grow and grow, and that's different from len. |
Finished benchmarking try commit cfa7765 |
The CI results are even better than what I saw locally, yay. |
This increases the size of some important types, such as `ast::Expr` and `mir::Statement`. However, it drastically reduces how much the interner is used, and the fields are more natural sizes that don't require bit operations to extract. As a result, instruction counts drop across a range of workloads, by as much as 12% for incremental "check" builds of `script-servo`. Peak memory usage goes up a little for some cases, but down by more for some other cases -- as much as 18% for non-incremental builds of `packed-simd`. The commit also: - removes the `repr(packed)`, because it has negligible effect, but can cause undefined behaviour; - replaces explicit impls of common traits (`Copy`, `PartialEq`, etc.) with derived ones.
7f2ca81
to
fd7f605
Compare
@petrochenkov: I have updated the comments as per the suggestions above. And thanks for suggesting 64-bits in the first place! :) |
The previous measurements (#44646 (comment)) for 64 bits looked very different (that's why 32 was selected in the first place). I wonder what changed since Sep 2017. (I'll review the code this evening.) |
Code LGTM, but I'd want to run #59749 through perf before merging this. |
I see, then I'm not sure why it's per-session and not global. |
There can be multiple rustc sessions in a process. RLS and rustdoc does that. |
I mean, why can't multiple sessions share a single interner? |
@bors rollup |
…enkov Increase `Span` from 4 bytes to 8 bytes. This increases the size of some important types, such as `ast::Expr` and `mir::Statement`. However, it drastically reduces how much the interner is used, and the fields are more natural sizes that don't require bit operations to extract. As a result, instruction counts drop across a range of workloads, by as much as 10% for `script-servo` incremental builds. Peak memory usage goes up a little for some cases, but down by more for some other cases -- as much as 18% for non-incremental builds of `packed-simd`. The commit also: - removes the `repr(packed)`, because it has negligible effect, but can cause undefined behaviour; - replaces explicit impls of common traits (`Copy`, `PartialEq`, etc.) with derived ones. r? @petrochenkov
It's too late now, but for next time: PRs with significant perf effects are better landed on their own, rather than in a rollup, to keep the perf data clear. For example, if some other PR in the rollup introduces a perf regression, that would be masked by the improvement in this PR. |
@bors rollup- |
@bors p=1 |
Increase `Span` from 4 bytes to 8 bytes. This increases the size of some important types, such as `ast::Expr` and `mir::Statement`. However, it drastically reduces how much the interner is used, and the fields are more natural sizes that don't require bit operations to extract. As a result, instruction counts drop across a range of workloads, by as much as 10% for `script-servo` incremental builds. Peak memory usage goes up a little for some cases, but down by more for some other cases -- as much as 18% for non-incremental builds of `packed-simd`. The commit also: - removes the `repr(packed)`, because it has negligible effect, but can cause undefined behaviour; - replaces explicit impls of common traits (`Copy`, `PartialEq`, etc.) with derived ones. r? @petrochenkov
💔 Test failed - checks-travis |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
"warning: spurious network error" @bors retry |
Increase `Span` from 4 bytes to 8 bytes. This increases the size of some important types, such as `ast::Expr` and `mir::Statement`. However, it drastically reduces how much the interner is used, and the fields are more natural sizes that don't require bit operations to extract. As a result, instruction counts drop across a range of workloads, by as much as 10% for `script-servo` incremental builds. Peak memory usage goes up a little for some cases, but down by more for some other cases -- as much as 18% for non-incremental builds of `packed-simd`. The commit also: - removes the `repr(packed)`, because it has negligible effect, but can cause undefined behaviour; - replaces explicit impls of common traits (`Copy`, `PartialEq`, etc.) with derived ones. r? @petrochenkov
☀️ Test successful - checks-travis, status-appveyor |
Btw, there may be a way to fit The reason is that we could allocate new "files" (or maybe even just one chunk, since it should often suffice) for the version with a And so two values (file + |
swc_common: - apply patch from rust-lang/rust#59693 swc: - use &Options instead of Options - configures commons::CM - exposes `handler`
This increases the size of some important types, such as
ast::Expr
andmir::Statement
. However, it drastically reduces how much the interneris used, and the fields are more natural sizes that don't require bit
operations to extract.
As a result, instruction counts drop across a range of workloads, by as
much as 10% for
script-servo
incremental builds.Peak memory usage goes up a little for some cases, but down by more for
some other cases -- as much as 18% for non-incremental builds of
packed-simd
.The commit also:
repr(packed)
, because it has negligible effect, but cancause undefined behaviour;
Copy
,PartialEq
, etc.)with derived ones.
r? @petrochenkov