Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: add snapshottable/persistent treemap variant #9816

Closed
wants to merge 6 commits into from

Conversation

bill-myers
Copy link
Contributor

This adds a new type called "SnapMap", which is a snapshottable treemap using copy-on-write and reference counting, in 3 versions, using Rc, Arc and ~ as pointers to the nodes.

This is essentially a persistent balanced tree data structure, but memory is not allocated on any modification, but insntead nodes are cloned when they have reference count > 1, so it behaves exactly like a non-persistent treemap if it is never cloned.

The first commits are the same as pull #9786, and they provide the mechanism to copy-on-write Rc and Arc by cloning the contents when the reference count is > 1.

Note that this is untested beyond compilation, and might well have serious bugs; it is based on treemap.rs, but has been modified to try to minimize useless copy-on-write cases: for instance, when one does a lookup to get a mutable reference, the code first does a normal lookup and records the tree path, and only splits the nodes if the key is indeed present, and the same thing is done on removes.

@huonw
Copy link
Member

huonw commented Oct 12, 2013

The functions shared by Rc, Arc and Own would be better to be a trait(s) which is then implemented on each, and then snapmap could possibly be implemented generically over this trait, rather than having 3 implementations for 3 different pointer types.

Note that this is untested beyond compilation, and might well have serious bugs

There is no chance of this landing until there are tests.

(Also, this needs a rebase on top of master.)

#[inline]
fn lt(&self, other: &SnapMap<K, V>) -> bool { lt(self, other) }
#[inline]
fn le(&self, other: &SnapMap<K, V>) -> bool { !lt(other, self) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ord has default methods that do exactly this so only lt is necessary.

@bill-myers
Copy link
Contributor Author

Regarding macros and traits, the problem is that it is seems higher-kind trait parameters (i.e. the ability to pass Rc itself and not Rc) would be needed, or some way to express recursive type aliases, i.e. type T = TreeNode<K, V, Rc<T>>

A trait for pointers would be good though even if macros are still needed.

The rest has been copied verbatim from treemap.rs

@huonw
Copy link
Member

huonw commented Oct 12, 2013

One can simulate higher-kinded traits with something like

trait CowPtr<T> {
    fn cow<'r>(&'r mut self) -> &'r mut T;
}

impl<T> CowPtr<T> for Arc<T> {
     // ...
}
impl<T> CowPtr<T> for ~T {
    // ...
}

(or whatever division of methods & traits is necessary.)

Then it would become something like

struct SnapMap<K, V, Ptr> {
    priv root: Option<Ptr>,
    priv length: uint
}

struct TreeNode<K, V, Ptr> {
    key: K,
    value: V,
    left: Option<Ptr>,
    right: Option<Ptr>,
    level: uint
}

impl<K: TotalOrd, V, Ptr: CowPtr<TreeNode<K, V, Ptr>>> 
     MutableMap<K, V> for SnapMap<K, V, Ptr> {
    // ...
}

(The above is untested, so I've got no idea if it would work.)

The rest has been copied verbatim from treemap.rs

There's always room for improvement, and its easier to make the code nicer at the start than to try to refactor it later.

priv length: uint
}

impl<K: Eq + TotalOrd + $R1 + $R2 + $R3, V: Eq + $R1 + $R2 + $R3> Eq for SnapMap<K, V> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like with your relaxation of the kind bounds on Clone for Arc, are the $Rx necessary here? (And on most of the other impls.)

@bill-myers
Copy link
Contributor Author

In your example, what do you use as the Ptr type parameter value in the Rc case?

The problem is that it would be something like type Ptr = Rc<TreeNode<K, V, Ptr>>, but I'm not sure if such a recursive type can be declared in Rust.

The compiler gives me the error "error: illegal recursive type; insert an enum in the cycle, if this is desired" for that line (despite the fact that there is in fact an enum there, namely Option).

Maybe there's another way or it's possible to easily fix the compiler to allow that declaration, not sure.

@bill-myers
Copy link
Contributor Author

Implemented all suggestions.

Also, changed code to replace treemap instead of adding SnapMap along it, since it seems it would be unmaintainable to have both (e.g. all the issues noticed in this are present in treemap too), and snapmap is supposed to reduce to treemap in the owned pointer case (after aggressive inlining and of course constant propagation).

@thestinger
Copy link
Contributor

Have you benchmarked the performance with test/bench/core-map.rs? Reusing the same code like this would be great, assuming it performs well.

It might be a bit problematic that it will make moving treemap to libstd require moving arc too. I think in the long-term we're also not going to want a red-black tree for the mutable case (#4992) and a wide fan-out won't mix well with persistent data structures based on reference counting.

@huonw
Copy link
Member

huonw commented Oct 13, 2013

@bill-myers looks like a rebase went wrong.

Shorter and same name used by Arc
The try_ functions don't require Clone, but they do nothing if rc > 1

The try_unwrap and value take and return by-value instead of &mut
They aren't needed, since arcs without the kinds can't be created
with safe code, and unsafe code should be allowed to do whatever it
wants.

This greatly simplifies libraries using Arc<T> who no longer need
to impose those traits unless they construct arcs.
@alexcrichton
Copy link
Member

I think that along the lines of #9653, this is really cool, but may not have it's place in libextra right now. The fate of libextra has been thought of to get split up in favor of a "package incubator" type situation. If we don't end up merging this, I would highly recommend making this a separate rustpkg repo and placing it on http://hiho.io/rust-ci/ because I'm sure that others would find this useful!

@alexcrichton
Copy link
Member

For similar reasons to #9653, I'm going to close this. This is certainly an awesome data structure, but this doesn't belong in libstd and libextra is on its way to getting dissolved. For now, I highly recomment putting this on RustCI, because I'm sure others would benefit from it!.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants