Make hashmap iterators implement ExactSize #19327

Closed
wants to merge 3 commits into from
29 changes: 28 additions & 1 deletion src/libstd/collections/hash/map.rs
@@ -20,7 +20,7 @@ use cmp::{max, Eq, Equiv, PartialEq};
use default::Default;
use fmt::{mod, Show};
use hash::{Hash, Hasher, RandomSipHasher};
use iter::{mod, Iterator, FromIterator, Extend};
use iter::{mod, Iterator, DoubleEndedIterator, ExactSize, FromIterator, Extend};
use kinds::Sized;
use mem::{mod, replace};
use num::UnsignedInt;
@@ -1325,6 +1325,15 @@ impl<'a, K, V> Iterator<(&'a K, &'a V)> for Entries<'a, K, V> {
}
}

impl<'a, K, V> DoubleEndedIterator<(&'a K, &'a V)> for Entries<'a, K, V> {
#[inline]
fn next_back(&mut self) -> Option<(&'a K, &'a V)> {
self.inner.next_back()

Contributor

A hash map is fundamentally unordered. Semantically, what does it even mean to iterate backwards? Couldn't this just call next and call it a day? Or am I missing a use case?

Contributor Author

I think it'd be surprising if forward iteration and backward iteration returned the same series of elements. Anyhow, I only implemented DoubleEndedIterator because it's needed for ExactSize for some reason.

Contributor

Yes, I'm fine with implementing DoubleEnded and ExactSize for usability with that tooling. However, I don't think people should be able to rely on the order of elements when calling next/next_back between iterations. I think it's fine to have next_back be a synonym for next here.

Contributor Author

I think that would go against the spirit of DoubleEndedIterator. We should at least be doing something "backwards-ish", or at least as backwards as makes sense for a hashtable. People might have a use case for it, and I feel like we shouldn't just lie in the interface.

Contributor

I can see the argument either way, honestly. Plus, we can explicitly document that the behaviour can't be relied on for different sequences of next/next_back, which is backwards-compatible to "undo". I'm not a huge fan of duplicating a bunch of unsafe code in the already-beefy rawtable. Making next_back a synonym for next seems like a simple way to avoid that.

Interested in getting some other opinions though.

Contributor Author

Me too. I wish someone "owned" this code. :)

Contributor

That's probably @pczarn if anyone. Interested in @aturon's API-thoughts though.
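
To make the trade-off in this thread concrete, here is a minimal sketch in current Rust (not the 2014 dialect of this PR) of what happens when next_back is simply a synonym for next. The wrapper `Aliased` and the helper `forward_and_reversed` are hypothetical names invented for this illustration; they do not appear in the PR.

```rust
/// Hypothetical wrapper whose `next_back` just forwards to `next`,
/// as proposed in the thread above.
struct Aliased<I>(I);

impl<I: Iterator> Iterator for Aliased<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        self.0.next()
    }
}

impl<I: Iterator> DoubleEndedIterator for Aliased<I> {
    fn next_back(&mut self) -> Option<I::Item> {
        // Not really "backwards": takes from the front again.
        self.0.next()
    }
}

/// Collects the same items once forwards and once through `rev()`.
fn forward_and_reversed(items: Vec<i32>) -> (Vec<i32>, Vec<i32>) {
    let forward: Vec<i32> = Aliased(items.clone().into_iter()).collect();
    let reversed: Vec<i32> = Aliased(items.into_iter()).rev().collect();
    (forward, reversed)
}
```

With this aliasing, `rev()` yields the items in the same order as forward iteration. Each element is still yielded exactly once, so length-based tooling keeps working, but adaptors that assume a real reversal would behave in a surprising way, which is the concern raised above.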

}
}

impl<'a, K, V> ExactSize<(&'a K, &'a V)> for Entries<'a, K, V> {}
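
An empty impl like the one above can be enough because the length is derived from size_hint. As an illustration in current Rust, where the trait is named `ExactSizeIterator` and requires only `Iterator` (unlike the 2014 `ExactSize`, which the thread says needed `DoubleEndedIterator`), the following sketch uses a hypothetical `Countdown` iterator that is not part of this PR:

```rust
/// Hypothetical iterator that yields `left - 1, ..., 1, 0`.
struct Countdown {
    left: usize,
}

impl Iterator for Countdown {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        if self.left == 0 {
            return None;
        }
        self.left -= 1;
        Some(self.left)
    }

    // The lower and upper bounds must agree for the default `len()` to work.
    fn size_hint(&self) -> (usize, Option<usize>) {
        (self.left, Some(self.left))
    }
}

// Empty impl: the default `len()` reads the exact bounds from `size_hint`.
impl ExactSizeIterator for Countdown {}
```

The empty impl is only sound if size_hint stays exact after every call that consumes an element, which is why the element counters in table.rs matter.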

impl<'a, K, V> Iterator<(&'a K, &'a mut V)> for MutEntries<'a, K, V> {
#[inline]
fn next(&mut self) -> Option<(&'a K, &'a mut V)> {
@@ -1336,6 +1345,15 @@ impl<'a, K, V> Iterator<(&'a K, &'a mut V)> for MutEntries<'a, K, V> {
}
}

impl<'a, K, V> DoubleEndedIterator<(&'a K, &'a mut V)> for MutEntries<'a, K, V> {
#[inline]
fn next_back(&mut self) -> Option<(&'a K, &'a mut V)> {
self.inner.next_back()
}
}

impl<'a, K, V> ExactSize<(&'a K, &'a mut V)> for MutEntries<'a, K, V> {}

impl<K, V> Iterator<(K, V)> for MoveEntries<K, V> {
#[inline]
fn next(&mut self) -> Option<(K, V)> {
@@ -1347,6 +1365,15 @@ impl<K, V> Iterator<(K, V)> for MoveEntries<K, V> {
}
}

impl<K, V> DoubleEndedIterator<(K, V)> for MoveEntries<K, V> {
#[inline]
fn next_back(&mut self) -> Option<(K, V)> {
self.inner.next_back()
}
}

impl<K, V> ExactSize<(K, V)> for MoveEntries<K, V> {}

impl<'a, K, V> OccupiedEntry<'a, K, V> {
/// Gets a reference to the value in the entry
pub fn get(&self) -> &V {
103 changes: 91 additions & 12 deletions src/libstd/collections/hash/table.rs
@@ -15,7 +15,7 @@ pub use self::BucketState::*;
use clone::Clone;
use cmp;
use hash::{Hash, Hasher};
use iter::{Iterator, count};
use iter::{Iterator, DoubleEndedIterator, ExactSize, count};
use kinds::{Sized, marker};
use mem::{min_align_of, size_of};
use mem;
@@ -620,6 +620,25 @@ impl<K, V> RawTable<K, V> {
}
}

fn one_past_last_bucket_raw(&self) -> RawBucket<K, V> {
let hashes_size = self.capacity * size_of::<u64>();
let keys_size = self.capacity * size_of::<K>();
let vals_size = self.capacity * size_of::<V>();

let buffer = self.hashes as *mut u8;
let (keys_offset, vals_offset) = calculate_offsets(hashes_size,
keys_size, min_align_of::<K>(),
min_align_of::<V>());

unsafe {
RawBucket {
// `hashes` is a `*mut u64`, so the offset counts elements, not bytes.
hash: self.hashes.offset(self.capacity as int),
key: buffer.offset(keys_offset as int + keys_size as int) as *mut K,
val: buffer.offset(vals_offset as int + vals_size as int) as *mut V
}
}
}
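
Typed pointer offsets count in elements of the pointee type, while offsets on a byte pointer count in bytes. That distinction is what the hash, key, and value pointers above depend on. A minimal sketch in current Rust, assuming a plain `u64` slice as a stand-in for the hash array (the helper names are hypothetical):

```rust
/// One-past-the-end pointer computed with an element-counted offset.
fn one_past_end(hashes: &[u64]) -> *const u64 {
    // `wrapping_add` on a `*const u64` advances by `len` elements.
    hashes.as_ptr().wrapping_add(hashes.len())
}

/// The same address computed through a byte pointer, where the offset
/// has to be scaled by the element size explicitly.
fn one_past_end_via_bytes(hashes: &[u64]) -> *const u64 {
    let bytes = hashes.as_ptr() as *const u8;
    bytes.wrapping_add(hashes.len() * std::mem::size_of::<u64>()) as *const u64
}
```

Both helpers return the same address as `hashes.as_ptr_range().end`. Passing a byte count to the element-counted version would overshoot by a factor of `size_of::<u64>()`.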

/// Creates a new raw table from a given capacity. All buckets are
/// initially empty.
#[allow(experimental)]
@@ -644,10 +663,8 @@ impl<K, V> RawTable<K, V> {

fn raw_buckets(&self) -> RawBuckets<K, V> {
RawBuckets {
raw: self.first_bucket_raw(),
hashes_end: unsafe {
self.hashes.offset(self.capacity as int)
},
start: self.first_bucket_raw(),
end: self.one_past_last_bucket_raw(),
marker: marker::ContravariantLifetime,
}
}
@@ -667,12 +684,12 @@ impl<K, V> RawTable<K, V> {
}

pub fn into_iter(self) -> MoveEntries<K, V> {
let RawBuckets { raw, hashes_end, .. } = self.raw_buckets();
let RawBuckets { start, end, .. } = self.raw_buckets();
// Replace the marker regardless of lifetime bounds on parameters.
MoveEntries {
iter: RawBuckets {
raw: raw,
hashes_end: hashes_end,
start: start,
end: end,
marker: marker::ContravariantLifetime,
},
table: self,
@@ -695,18 +712,18 @@ impl<K, V> RawTable<K, V> {
/// A raw iterator. The basis for some other iterators in this module. Although
/// this interface is safe, it's not used outside this module.
struct RawBuckets<'a, K, V> {
raw: RawBucket<K, V>,
hashes_end: *mut u64,
start: RawBucket<K, V>,
end: RawBucket<K, V>, // points one after the end.
marker: marker::ContravariantLifetime<'a>,
}

impl<'a, K, V> Iterator<RawBucket<K, V>> for RawBuckets<'a, K, V> {
fn next(&mut self) -> Option<RawBucket<K, V>> {
while self.raw.hash != self.hashes_end {
while self.start.hash != self.end.hash {
unsafe {
// We are swapping out the pointer to a bucket and replacing
// it with the pointer to the next one.
let prev = ptr::replace(&mut self.raw, self.raw.offset(1));
let prev = ptr::replace(&mut self.start, self.start.offset(1));
if *prev.hash != EMPTY_BUCKET {
return Some(prev);
}
@@ -717,6 +734,21 @@ impl<'a, K, V> Iterator<RawBucket<K, V>> for RawBuckets<'a, K, V> {
}
}

impl<'a, K, V> DoubleEndedIterator<RawBucket<K, V>> for RawBuckets<'a, K, V> {
fn next_back(&mut self) -> Option<RawBucket<K, V>> {
while self.start.hash != self.end.hash {
unsafe {
// `end` points one past the last unvisited bucket, so step back
// before reading the hash.
self.end = self.end.offset(-1);
if *self.end.hash != EMPTY_BUCKET {
return Some(ptr::read(&self.end));
}
}
}

None
}
}
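
The two-cursor scheme used by RawBuckets can be sketched safely in current Rust. The example below assumes a slice of `Option<T>` as a stand-in for the raw bucket array, with `None` playing the role of EMPTY_BUCKET. `SlotIter` and its fields are hypothetical names for this illustration, not the PR's types.

```rust
/// Hypothetical double-ended iterator over a sparse slot array.
struct SlotIter<'a, T> {
    slots: &'a [Option<T>],
    start: usize,     // next slot to inspect from the front
    end: usize,       // one past the last slot still to inspect
    remaining: usize, // occupied slots not yet yielded
}

impl<'a, T> SlotIter<'a, T> {
    fn new(slots: &'a [Option<T>]) -> Self {
        let remaining = slots.iter().filter(|s| s.is_some()).count();
        SlotIter { slots, start: 0, end: slots.len(), remaining }
    }
}

impl<'a, T> Iterator for SlotIter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        let slots: &'a [Option<T>] = self.slots;
        while self.start != self.end {
            let slot = &slots[self.start];
            self.start += 1;
            if let Some(value) = slot {
                self.remaining -= 1;
                return Some(value);
            }
        }
        None
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        (self.remaining, Some(self.remaining))
    }
}

impl<'a, T> DoubleEndedIterator for SlotIter<'a, T> {
    fn next_back(&mut self) -> Option<&'a T> {
        let slots: &'a [Option<T>] = self.slots;
        while self.start != self.end {
            // `end` is one past the slot to read, so decrement first.
            self.end -= 1;
            if let Some(value) = &slots[self.end] {
                self.remaining -= 1;
                return Some(value);
            }
        }
        None
    }
}

impl<'a, T> ExactSizeIterator for SlotIter<'a, T> {}
```

Both directions stop when the cursors meet, so no slot is yielded twice, and `remaining` is decremented from either end so that `len()` stays exact.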

/// An iterator that moves out buckets in reverse order. It leaves the table
/// in an inconsistent state and should only be used for dropping
/// the table's remaining entries. It's used in the implementation of Drop.
@@ -785,6 +817,20 @@ impl<'a, K, V> Iterator<(&'a K, &'a V)> for Entries<'a, K, V> {
}
}

impl<'a, K, V> DoubleEndedIterator<(&'a K, &'a V)> for Entries<'a, K, V> {
fn next_back(&mut self) -> Option<(&'a K, &'a V)> {
self.iter.next_back().map(|bucket| {
// One fewer element remains, whichever end it was taken from.
self.elems_left -= 1;
unsafe {
(&*bucket.key,
&*bucket.val)
}
})
}
}

impl<'a, K, V> ExactSize<(&'a K, &'a V)> for Entries<'a, K, V> {}

impl<'a, K, V> Iterator<(&'a K, &'a mut V)> for MutEntries<'a, K, V> {
fn next(&mut self) -> Option<(&'a K, &'a mut V)> {
self.iter.next().map(|bucket| {
@@ -801,6 +847,20 @@ impl<'a, K, V> Iterator<(&'a K, &'a mut V)> for MutEntries<'a, K, V> {
}
}

impl<'a, K, V> DoubleEndedIterator<(&'a K, &'a mut V)> for MutEntries<'a, K, V> {
fn next_back(&mut self) -> Option<(&'a K, &'a mut V)> {
self.iter.next_back().map(|bucket| {
// One fewer element remains, whichever end it was taken from.
self.elems_left -= 1;
unsafe {
(&*bucket.key,
&mut *bucket.val)
}
})
}
}

impl<'a, K, V> ExactSize<(&'a K, &'a mut V)> for MutEntries<'a, K, V> {}

impl<K, V> Iterator<(SafeHash, K, V)> for MoveEntries<K, V> {
fn next(&mut self) -> Option<(SafeHash, K, V)> {
self.iter.next().map(|bucket| {
@@ -823,6 +883,25 @@ impl<K, V> Iterator<(SafeHash, K, V)> for MoveEntries<K, V> {
}
}

impl<K, V> DoubleEndedIterator<(SafeHash, K, V)> for MoveEntries<K, V> {
fn next_back(&mut self) -> Option<(SafeHash, K, V)> {
self.iter.next_back().map(|bucket| {
// The entry is moved out of the table, so its size shrinks.
self.table.size -= 1;
unsafe {
(
SafeHash {
hash: *bucket.hash,
},
ptr::read(bucket.key as *const K),
ptr::read(bucket.val as *const V)
)
}
})
}
}

impl<K, V> ExactSize<(SafeHash, K, V)> for MoveEntries<K, V> {}

impl<K: Clone, V: Clone> Clone for RawTable<K, V> {
fn clone(&self) -> RawTable<K, V> {
unsafe {