implement a B-tree #4992

thestinger · 2013-02-17T02:17:40Z

A B-tree stores sorted arrays of a chosen size instead of lone nodes, so it can be much more cache-friendly than a red-black tree when used with small key/value types. The size will just have to be hardcoded for now since it can't be part of parametrizing it.

duckmatt · 2013-05-09T12:13:32Z

Hi, I'd like to give this an attempt, is this going to be treemap.rs reimplemented as a B-tree?

thestinger · 2013-05-09T16:19:29Z

@duckmatt: yeah, it would just expose the same interface as treemap

it could start off as another module and eventually replaced the treemap module

duckmatt · 2013-06-04T10:41:10Z

Just to give an update, I haven't disappeared. I'm hoping to get the btree fully functional and uploaded to my github this weekend or next, then the week or so after adapt to the treemap interface, been slow progress as I'm learning the language at the same time :p

duckmatt · 2013-07-09T16:47:33Z

I tried to get a btree working with owned boxes for the nodes, but found it can't work as I need to maintain a list of mutable nodes whilst walking down the tree. Seems the way around this is to either restrict the tree to use an even order, which I guess wouldn't be a problem if the order is always going to be fixed, or use reference counting for the nodes, which i assume will make it a bit less efficient. What would you go for between these?

alexcrichton · 2013-07-14T16:45:56Z

@duckmatt the general trend for containers in the standard library has been to avoid @ as much as possible. It forces users of the containers to have GC turned on, which may not always be desirable. If you can avoid @, then definitely do so!

smvv · 2013-07-14T17:16:34Z

I'm working on a b-tree implementation as well. @duckmatt Perhaps we can combine our efforts and work together on the implementation?

My implementation is nowhere near complete. At the moment, the insert method does split the node when the node is full. However, the node is not properly inserted into its parent node. See test_insert_split_add_root() for a test case that currently failed because of this.

I think I chose a proper data structure layout to start with:

separate keys from leaf/node pointers to speed up key matching.
there are n nodes and n - 1 keys. With key-value pairs (where value is a node- or value pointer), it is not possible to have n-1 keys. Having n keys implies that the key has to be copied, or is set to None. I thought that an extra key in every node would be wasteful, if the data structure layout is also possible with n-1 keys.
trees are owned and there are no copies of keys or values necessary during the insert.
stack-allocated, fixed-sized vectors to store the keys and node-/value pointers.

At the moment, BTREE_ORDER is a constant. There was a discussion on IRC about if it was possible to make it a trait parameter, but that's not possible with the current Rust compiler.

Comments are more than welcome. I started this b-tree implementation as an exercise to learn Rust, but I think we could turn our efforts into a usable b-tree implementation suitable for the Rust standard library.

Source: https://github.com/smvv/rust-btree/blob/master/btree.rs
Docs: http://smvv.io/rust-btree/doc/

@duckmatt Btw, my nickname is smvv on the Rust IRC channel #rust.

duckmatt · 2013-07-14T18:02:27Z

@smvv Looks good, that's pretty much where I got to so I figure I'll just make contributions to your code from now on and we can try to get that working.

Inserting into a parent is where I found the problem to be, because if the parent is full too then you need to split the parent, then insert into the parent of the parent, and so on until you reach the root or a node that's not full. So to implement this it seems like you need to build a list of mutable nodes as you walk down the tree, you'll add to this every time you come across a full node and clear it each time you come across a not full node. Problem is with this approach you then end up with two mutable reference to the same owned box. The alternative is to either use some interesting flow of data with recursion if it's possible, or to do the splits while walking down the tree (which is only possible with an even order, but I see you're asserting for that already anyway, so this may be the way to go)

smvv · 2013-07-14T18:52:23Z

TreeMap uses a recursive approach: https://github.com/mozilla/rust/blob/master/src/libextra/treemap.rs#L589 I think a top-down algorithm would be useful indeed.

smvv · 2013-07-22T18:38:56Z

The b-tree successfully inserts 100,000 randomly generated uints, and checks if the key is in the tree at the end. Inserting 100,000 uints followed by 100,000 lookups takes 50 ms on an intel i7 950 at 3.07 GHz. Without the lookups, inserting 100,000 uints takes 40ms. At the moment, removing keys is not implemented, and the b-tree will fail if a duplicated key is inserted (see below).

It is important to note that the insert() method does not return true / false for a new / old key. The reason is that the top-down algorithm cannot determine if a key does already exist. If a key is stored in a leaf, the top-down algorithm will discover the uniqueness once it's at the leaf node. However, in order to insert something in a leaf, the top-down algorithm has to split full nodes before it is at the leaf.

The question is whether it is useful to return true for a new key that is not inserted before. If it is important, insertion has to perform a find() prior to the splitting. How about defining two methods: one that performs a find() operation for uniqueness prior to insertion, and one for inserting only unique values?

@thestinger @duckmatt What is your opinion about this?

I chose 20 as minimal degree for the b-tree. It gave a good result for inserting 100,000 and 1,000,000 random uints. I compared it with 40 and 50, but insertion took longer (plus ~10ms for 100,000) for those values. I'm not sure what the optimal value will be.

This top-down algorithm is used: http://staff.ustc.edu.cn/~csli/graduate/algorithms/book6/chap19.htm

Source: https://github.com/smvv/rust-btree/blob/master/btree.rs

duckmatt · 2013-07-22T21:07:20Z

@smvv What I was doing in my implementation was the search to find where in the current node the position of the child to walk down to was before splitting, I did this with a binary search that returned an enum of either Match(position) or NoMatch(position), position = closest key if it's not a match, if it's a match then I knew it already existed. The thing is, the way I see it you would have to do this search anyway, as you would need to know where in the split nodes to walk down to/insert the new key/value into

Here's the bsearch I made if it's of any use: https://github.com/duckmatt/rustbinarysearch/blob/master/bsearch.rs

smvv · 2013-07-25T12:54:36Z

I did this with a binary search that returned an enum of either Match(position) or NoMatch(position), position = closest key if it's not a match, if it's a match then I knew it already existed.
The thing is, the way I see it you would have to do this search anyway, as you would need to know where in the split nodes to walk down to/insert the new key/value into

I'll give it a try, because I think that would be possible indeed. While walking down, you can find the key in a internal node or a leaf. There should be no need for a second find operation. Although, it will cause a split operation on a node when the node is full, and the key exists already in the b-tree. This is not really an issue because the properties of the b-tree still hold.

Your comment gave me another idea: The b-tree uses binary search for finding a leaf, but it does a linear search on the keys in a node. As an experiment, I used binary search on the keys in a node to see if binary search is faster than linear search. The idea behind the experiment is that the run-time increases when the number of keys per node increases. Unfortunately, using binary search on the keys did not reduce the run-time. I tried larger number of keys as well (# of keys I tried is {20, 40, 50}).

You can find the disabled binary search here: https://github.com/smvv/rust-btree/blob/master/btree.rs#L142

And important to note is: Using btree.rs as a library does not work: see #8044.

thestinger · 2013-07-25T14:53:20Z

@smvv: You could try using a galloping search instead. You search by stepping forwards by N where N doubles every time (accessing the indices 0, 1, 3, 7, 15, ...) until you overshoot, and then do a binary or linear search over the range from the last step.

thestinger · 2013-07-25T15:11:30Z

Also, a linear growth pattern might end up being faster (accessing 0, 1, 3, 6, 10, 15) since the fanout size is a known, small constant.

davidhalperin · 2013-07-27T17:34:23Z

@smvv Your structure doesn't seem right to me at a glance. A B-Tree typically stores both values and children in internal nodes. A B+ Tree stores either values or children but requires copying keys and you have claimed that you are not doing so. I haven't analyzed this in great detail so it may or may not work and may or may not have good properties but it doesn't seem to be textbook.

I'd also like to give this a try and have a very different representation in mind. Would anyone object if I gave this a try in parallel? I don't want to steal anyone's thunder.

The representation I have in mind is here. Nice things about this are:

Things that we know are there aren't wrapped in options.
The invariant that there's always one more branch than key/value is enforced by the type.
Leaves are compact and fast to iterate through.
Size of the nodes is tunable in the constructor.

The main cons I see are:

Keys in branches are relatively far apart might be slower to scan.
2 allocations in each branch node.
Special case for dealing with the far right child of a branch.

duckmatt · 2013-07-27T18:46:06Z

@davidhalperin The problem I see with the b+tree here is that you're going to have two references to an owned box, the right pointer will be pointing to a node that's already owned by a vector. Or are those nodes in the vectors not owned boxes? I'm getting a feeling that's where I went wrong

davidhalperin · 2013-07-27T19:34:17Z

@duckmatt You could use a borrowed pointer with a lifetime but then it wouldn't be sendable or you could use some unsafe code. I think DList uses unsafe pointers for backpointers so it'll be sendable.

I'm not sure what we want is a B+ Tree, though. I suspect @thestinger was inspired to open this issue by Google's C++ btree implementation which is faster than the normal C++ std::map. Their implementation stores an array of key value pairs in every node. Other that Google's implementation, I'm not aware of B-Trees being used for in memory data structures so I don't know if there's any data on whether a B+ Tree would be advantageous for this. If you have any please correct me. Either way, I think it would be better to start out with something simple, get it working, then try to optimize the representation after we have some tests and benchmarks in place as a baseline.

smvv · 2013-07-27T19:56:53Z

If you draw the b-tree with the current implementation, you'll get something like this:

                                                                +---------+
                                                                |N(key=15)|
                                                                +-+-----+-+
                                                                  |     |
                                             +--------------------+     +-------------------------+
                                             |                                                    |
                                         +---+----+                                           +---+----+
                                         |N(key=7)|                                           |N(key= )|
                                         +-+----+-+                                           +-+----+-+
                                           |    |                                               |    |
                      +--------------------+    +-------------------+                       +---+    +---+
                      |                                             |                       |            |
                  +---+----+                                    +---+----+              +---+-----+  +---+----+
                  |N(key=3)|                                    |N(key= )|              |N(key=11)|  |N(key= )|
                  +-+----+-+                                    +-+----+-+              +-+-----+-+  +-+----+-+
                    |    |                                        |    |                  |     |      |    |
          +---------+    +-------+                      +---------+    +-------+          +     +      +    +
          |                      |                      |                      |         ...   ...    ...  ...
      +---+----+             +---+----+             +---+----+             +---+----+
      |N(key=1)|             |N(key= )|             |N(key=5)|             |N(key= )|
      +-+----+-+             +-+----+-+             +-+----+-+             +-+----+-+
        |    |                 |    |                 |    |                 |    |
     +--+    +---+          +--+    +---+          +--+    +---+          +--+    +---+
     |           |          |           |          |           |          |           |
 +---+----+  +---+----+ +---+----+  +---+----+ +---+----+  +---+----+ +---+----+  +---+----+
 |L(key=0)|  |L(key= )| |L(key=2)|  |L(key= )| |L(key=4)|  |L(key= )| |L(key=6)|  |L(key= )|
 +--------+  +--------+ +--------+  +--------+ +--------+  +--------+ +--------+  +--------+

http://www.asciiflow.com/#Draw7058515904656888666/126940731

Perhaps this is not a textbook b-tree, but it does not require copying the keys. If the key is not present, the key is stored in its parent. If the direct parent does not have the key, the key is stored in the parent of the parent.

This is a 2-3-4 b-tree (BTREE_MIN_DEGREE = 2).

If I understand generics and Rust's enums correctly, a Leaf points directly to its value in the current implementation. There's no pointer between the node's pointer and the actual value. Thus L(key=0) contains the value, not a pointer to the value. This means that there's no overhead for leaf nodes (normally b-tree will use a smaller struct for leafs to reduce the memory footprint).

niftynif · 2013-10-24T20:02:21Z

I'm taking a look at this issue right now, sort of starting from scratch, but keeping in mind everything that has been posted so far with hopes of having a skeleton implementation to present before too long. I am reading carefully over past contributions to this issue and trying to take everything into account. I'm still a bit of a Rust beginner so any advice would be appreciated! Perhaps this is an ambitious first project but I am excited to see what I come up with. I feel fairly confident with data structures, so this is a place where I felt comfortable starting.

catamorphism · 2013-10-24T21:28:10Z

@niftynif Great! Honestly, I suspect starting from scratch would be a faster way to get started than trying to revive the existing code (of course, reading over it doesn't hurt). Rust has changed somewhat in the past three months. I don't mean to slight the people who worked on this issue before; it's a pretty normal thing for somebody to start fresh when another contributor isn't able to land their patch for any reason.

Drop by IRC if you have questions!

jruderman · 2013-11-10T13:05:51Z

Should this be closed now that #10303 from @niftynif has landed?

catamorphism · 2013-11-10T18:20:38Z

@jruderman Not yet. While @niftynif will hopefully be working on it more, #10303 represents work in progress.

berleon · 2013-11-14T23:04:01Z

@niftynif You might have a look at https://github.com/berleon/libtrees . I working on implementing a Lehmann and Yao concurrent B-Tree. As my plan is a disk based implementation, I have to consider locking and reading/writing nodes from/to hard disk. This all adds a lot of overhead. If you have any questions or suggestions, feel free to talk to me.

ghost · 2014-01-02T08:31:12Z

I have been working on a btree in rust independently of @niftynif and I am wondering if it could be of some help with this issue. In design it is much closer to @smvv implementation then to the design that is being implemented by niftynif. That is that leaf nodes are the only nodes that hold data, and branch nodes are key and pointer data only. It's much more complete then either, as it implements all of MutableMap and Iterator interface.

https://github.com/csherratt/cow-rs/blob/master/src/cow/btree.rs

There is a few problem areas, mostly around the fixed sized arrays inside of the nodes which in the current state of rust is difficult to avoid (std::mem::sizeof would need to be compile time, not run time evaluable).

niftynif · 2014-01-02T18:09:07Z

Great, it's really good to know that others are working on this issue currently. I definitely am not as far along as you are, as you can see from the work I've completed so far. It looks like you're implementing a B+-tree rather than the typical B-tree paradigm, which is pretty cool.

ghost · 2014-01-02T19:31:45Z

Well, a B+tree requires next pointers which my implementation lacks. There are some advantages to a real B-tree over what I have done. The largest being that you should not have to clone any of the keys which is a real improvement if the keys are a string or any other allocated data structure.

I'm interested to see how performance would compare. In a b+tree the internal nodes are more compact since they lack data values. This should make it faster to search for trees that do not fit in the L1 cache. On the other hand, having data in the internal nodes means that a b-tree will have many opportunities for early outs. Comparing search performance of my btree to a binary tree like treemap I found that treemap was faster up until the tree size grew to about 1000 keys.

pnkfelix · 2014-02-13T18:37:37Z

Need not block 1.0. Assigning P-low.

sinistersnare · 2014-07-26T03:21:20Z

is this solved? we have std::collections::btree:BTree in tree.

thestinger · 2014-07-26T03:47:17Z

It's not actually in a useful state yet.

aturon · 2014-09-23T23:33:52Z

cc @gankro

Gankra · 2014-09-27T19:19:19Z

We now have a decent BTreeMap! It should absolutely be optimized and cleaned up more, but it's been designed with that future work in mind.

Rustup to rust-lang#67853 Specifically caused by rust-lang#67786 changelog: none

thestinger mentioned this issue Oct 12, 2013

RFC: add snapshottable/persistent treemap variant #9816

Closed

pnkfelix added P-low and removed P-high-untriaged labels Feb 13, 2014

thestinger closed this as completed Sep 27, 2014

bors added a commit to rust-lang-ci/rust that referenced this issue May 2, 2020

Auto merge of rust-lang#4992 - phansch:rustup_foobar, r=matthiaskrgr

05b4603

Rustup to rust-lang#67853 Specifically caused by rust-lang#67786 changelog: none

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

implement a B-tree #4992

implement a B-tree #4992

thestinger commented Feb 17, 2013

duckmatt commented May 9, 2013

thestinger commented May 9, 2013

duckmatt commented Jun 4, 2013

duckmatt commented Jul 9, 2013

alexcrichton commented Jul 14, 2013

smvv commented Jul 14, 2013

duckmatt commented Jul 14, 2013

smvv commented Jul 14, 2013

smvv commented Jul 22, 2013

duckmatt commented Jul 22, 2013

smvv commented Jul 25, 2013

thestinger commented Jul 25, 2013

thestinger commented Jul 25, 2013

davidhalperin commented Jul 27, 2013

duckmatt commented Jul 27, 2013

davidhalperin commented Jul 27, 2013

smvv commented Jul 27, 2013

niftynif commented Oct 24, 2013

catamorphism commented Oct 24, 2013

jruderman commented Nov 10, 2013

catamorphism commented Nov 10, 2013

berleon commented Nov 14, 2013

ghost commented Jan 2, 2014

niftynif commented Jan 2, 2014

ghost commented Jan 2, 2014

pnkfelix commented Feb 13, 2014

sinistersnare commented Jul 26, 2014

thestinger commented Jul 26, 2014

aturon commented Sep 23, 2014

Gankra commented Sep 27, 2014