-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't allocate during default HashSet creation. #36734
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The following `HashMap` creation functions don't allocate heap storage for elements. ``` HashMap::new() HashMap::default() HashMap::with_hasher() ``` This is good, because it's surprisingly common to create a HashMap and never use it. So that case should be cheap. However, `HashSet` does not have the same behaviour. The corresponding creation functions *do* allocate heap storage for the default number of non-zero elements (which is 32 slots for 29 elements). ``` HashMap::new() HashMap::default() HashMap::with_hasher() ``` This commit gives `HashSet` the same behaviour as `HashMap`, by simply calling the corresponding `HashMap` functions (something `HashSet` already does for `with_capacity` and `with_capacity_and_hasher`). It also reformats one existing `HashSet` construction to use a consistent single-line format. This speeds up rustc itself by 1.01--1.04x on most of the non-tiny rustc-benchmarks.
r? @brson (rust_highfive has picked a reviewer for you, use r? to override) |
Some detailed measurements... stage1 rustc (w/glibc malloc) doing debug builds.
stage1 rustc (w/jemalloc) doing debug builds.
Notes:
|
@bors r+ |
📌 Commit 4eb069c has been approved by |
BTW, this problem was identified with Valgrind's DHAT tool. Here is the resulting change in allocations for future-rs-test-all:
and for hyper:
|
bluss
added
the
relnotes
Marks issues that should be documented in the release notes of the next release.
label
Sep 26, 2016
bors
added a commit
that referenced
this pull request
Sep 26, 2016
Don't allocate during default HashSet creation. The following `HashMap` creation functions don't allocate heap storage for elements. ``` HashMap::new() HashMap::default() HashMap::with_hasher() ``` This is good, because it's surprisingly common to create a HashMap and never use it. So that case should be cheap. However, `HashSet` does not have the same behaviour. The corresponding creation functions *do* allocate heap storage for the default number of non-zero elements (which is 32 slots for 29 elements). ``` HashMap::new() HashMap::default() HashMap::with_hasher() ``` This commit gives `HashSet` the same behaviour as `HashMap`, by simply calling the corresponding `HashMap` functions (something `HashSet` already does for `with_capacity` and `with_capacity_and_hasher`). It also reformats one existing `HashSet` construction to use a consistent single-line format. This speeds up rustc itself by 1.01--1.04x on most of the non-tiny rustc-benchmarks.
Nice! |
bors
added a commit
that referenced
this pull request
Apr 28, 2018
Implement LazyBTreeMap and use it in a few places. This is a thin wrapper around BTreeMap that avoids allocating upon creation. I would prefer to change BTreeMap directly to make it lazy (like I did with HashSet in #36734) and I initially attempted that by making BTreeMap::root an Option<>. But then I also had to change Iter and Range to handle trees with no root, and those types have stability markers on them and I wasn't sure if that was acceptable. Also, BTreeMap has a lot of complex code and changing it all was challenging, and I didn't have high confidence about my general approach. So I prototyped this wrapper instead and used it in the hottest locations to get some measurements about the effect. The measurements are pretty good! - Doing a debug build of serde, it reduces the total number of heap allocations from 17,728,709 to 13,359,384, a 25% reduction. The number of bytes allocated drops from 7,474,672,966 to 5,482,308,388, a 27% reduction. - It gives speedups of up to 3.6% on some rustc-perf benchmark jobs. crates.io, futures, and serde benefit most. ``` futures-check avg: -1.9% min: -3.6% max: -0.5% serde-check avg: -2.1% min: -3.5% max: -0.7% crates.io-check avg: -1.7% min: -3.5% max: -0.3% serde avg: -2.0% min: -3.0% max: -0.9% serde-opt avg: -1.8% min: -2.9% max: -0.3% futures avg: -1.5% min: -2.8% max: -0.4% tokio-webpush-simple-check avg: -1.1% min: -2.2% max: -0.1% futures-opt avg: -1.2% min: -2.1% max: -0.4% piston-image-check avg: -0.8% min: -1.1% max: -0.3% crates.io avg: -0.6% min: -1.0% max: -0.3% ``` @gankro, how do you think I should proceed here? Is leaving this as a wrapper reasonable? Or should I try to make BTreeMap itself lazy? If so, can I change the representation of Iter and Range? Thanks!
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The following
HashMap
creation functions don't allocate heap storage for elements.This is good, because it's surprisingly common to create a HashMap and never
use it. So that case should be cheap.
However,
HashSet
does not have the same behaviour. The corresponding creationfunctions do allocate heap storage for the default number of non-zero
elements (which is 32 slots for 29 elements).
This commit gives
HashSet
the same behaviour asHashMap
, by simply callingthe corresponding
HashMap
functions (somethingHashSet
already does forwith_capacity
andwith_capacity_and_hasher
). It also reformats one existingHashSet
construction to use a consistent single-line format.This speeds up rustc itself by 1.01--1.04x on most of the non-tiny
rustc-benchmarks.