-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking issue for HashMap::raw_entry #56167
Comments
What is the motivation for having separate Could we consider merging these methods into a single one? Or is there some use case where the difference in behavior is useful? |
I am also extremely confused by this distinction, as my original designs didn't include them (I think?) and the documentation that was written is very unclear. |
cc @fintelia |
The reason I added let key = map.iter().nth(rand() % map.len()).0.clone();
map.remove(&key); I wanted to just be able to pick a random "bucket" and then get an entry/raw entry to the first element in it if any: loop {
if let Occupied(o) = map.raw_entry_mut().search_bucket(rand(), || true) {
o.remove();
break;
}
} (the probabilities aren't uniform in the second version, but close enough for my purposes) |
I continue to not want to support the "random deletion" usecase in std's HashMap. You really, really, really, should be using a linked hashmap or otherwise ordered map for that. |
I have removed this method in the hashbrown PR (#56241). Your code snippet for random deletion won't work with hashbrown anyways since it always checks the hash as part of the search process. |
It doesn't work in hashbrown anyways (see rust-lang#56167)
I can avoid unnecessary clones inherent to the original #![feature(hash_raw_entry)]
use std::collections::HashMap;
let mut map = HashMap::new();
map.raw_entry_mut()
.from_key("poneyland")
.or_insert("poneyland", 3); Currently I use the following function to hash once and automatically provide an owned key if necessary (somewhat similar to what was discussed in rust-lang/rfcs#1769): use std::borrow::Borrow;
use std::collections::hash_map::RawEntryMut;
use std::hash::{BuildHasher, Hash, Hasher};
fn get_mut_or_insert_with<'a, K, V, Q, F>(
map: &'a mut HashMap<K, V>,
key: &Q,
default: F,
) -> &'a mut V
where
K: Eq + Hash + Borrow<Q>,
Q: Eq + Hash + ToOwned<Owned = K>,
F: FnOnce() -> V,
{
let mut hasher = map.hasher().build_hasher();
key.hash(&mut hasher);
let hash = hasher.finish();
match map.raw_entry_mut().from_key_hashed_nocheck(hash, key) {
RawEntryMut::Occupied(entry) => entry.into_mut(),
RawEntryMut::Vacant(entry) => {
entry
.insert_hashed_nocheck(hash, key.to_owned(), default())
.1
}
}
} Given If there isn't, why not saving the hash in |
I'm not yet very familiar with this API, but what @gdouezangrard suggested seems like a great idea to me. What even happens currently if the two hashes don't match, is the key-value pair then inserted into the wrong bucket? It's not clear to me from (quickly) reading the source code. |
I submitted rust-lang/hashbrown#54 to support using a If so, I'd be happy to submit a PR! |
This is a really great API, it's also what keeps crates ( What could be next steps here towards stabilization? |
Just gonna add another ping here -- what's blocking this right now? |
I see a few things that need to be resolved:
I would recommend prototyping in the hashbrown crate first, which can then be ported back in the the std HashMap. |
I find I also would like to point out that #![feature(hash_raw_entry)]
use std::collections::HashMap;
fn main() {
let mut map = HashMap::new();
map.raw_entry_mut().from_key(&42).or_insert(1, 2);
println!("{}", map[&1]);
} This is a bit like calling #![feature(hash_raw_entry)]
use std::collections::hash_map::{HashMap, RawEntryMut};
fn main() {
let mut map = HashMap::new();
if let RawEntryMut::Vacant(_) = map.raw_entry_mut().from_key(&42) {
map.insert(1, 2);
}
println!("{}", map[&1]);
} I think raw entry API is useful, but I don't think its API should be conflated with entry API. |
As discussed here: rust-lang/hashbrown#232
If the feature of a user specified hash is needed, it may be useful to instead provide a method on the raw entry to hash a key. That way the hashmap can implement this however it sees fit and the application code is less error prone because there is an unambiguous way to obtain the hash value if it is not known in advance. |
For anyone reading this RFC while exploring a successor/better alternative to The (un)released let mut new_pool: bool = false;
let (_, old_status) = cached_status.raw_entry_mut()
.from_key(pool)
.or_insert_with(|| { new_pool = true; (pool.to_owned(), *new_status) });
if new_pool || old_status != new_status {
eprintln!("{}: {:?}", pool, status);
*old_status = *status;
} which could have just become let (_, old_status, new_pool) = cached_status.raw_entry_mut()
.from_key(pool)
.or_insert_with(|| (pool.to_owned(), *new_status));
if new_pool || old_status != new_status {
eprintln!("{}: {:?}", pool, status);
*old_status = *status;
} The "regular" |
Is there a reason for the |
Is there a reason to overload this into std::collections::HashMap? If instead we had a I think that could get rid of the (at least to me) confusingly named In fact, I don't think you'd even need a Thoughts? I'm curious who's driving this / who I'd need to be coordinate with if there's interest in this alternative path. |
Many people and APIs already use
+1 on this. To be more intuitive I would also rename the
IMO this would be useful only if it was accessible from an instance of
Definitely need a way to use an external function since many usecases rely on that.
Currently |
What I'm suggesting would be that HashMap and HashSet would just expose a raw/raw_mut to return the underlying RawTable. As is, anyone interested in using the raw interface still has to play this dance pretending like there's a hasher which makes things confusing to maintain (in case someone in your codebase doesnt realize your only using the raw interface and decides to store keys directly). Doing a lift and shift doesn't honestly seem like that much work and I don't see why long term maintenance work would increase with this approach. |
I would like to deprecate and remove
Note that @rfcbot fcp close |
Team member @Amanieu has proposed to close this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Doesn't this decision mean binary bloat because std::hashmap will exist because some dependency somewhere uses it and a duplicate implementation would be pulled in through hashbrown for those that need HashTable? Is there a reason not to standardize HashTable? EDIT: To be clear, I'm not against this, but I'd like to understand if there's an interest in standardizing |
I haven't been following this that closely, so for the benefit of those not paying a lot of attention, let's say that a user wants to do the raw_entry thing with an expensive key that they only want to construct if the map doesn't have it (ie, https://stackoverflow.com/questions/51542024/how-do-i-use-the-entry-api-with-an-expensive-key-that-is-only-constructed-if-the). Is the answer of using |
|
🔔 This is now entering its final comment period, as per the review above. 🔔 |
Checking my box here, and hoping that HashTable and some portion of RawTable (enough for |
The final comment period, with a disposition to close, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. |
Added in #54043.
As of 6ecad33 / 2019-01-09, this feature covers:
… as well as
Debug
impls for each 5 new types, and their inherent methods.The text was updated successfully, but these errors were encountered: