Crate | # Downloads | Ranges | Element Type | Set Operations? | Internal | Maps, too? |
---|---|---|---|---|---|---|
range-set-blaze | 488* | Disjoint | Integer | Set Ops | BTreeMap | Only Sets |
roaring | 1,588,056* | Disjoint | u32 | SetOps | Compressed Bitmaps | Only Sets |
rangemap | 243,612 | Disjoint | Ord | No Set Ops | BTreeMap | Sets/Maps |
iset | 128,824 | Overlapping | PartialOrd | No Set Ops | Red Black | Sets/Maps |
theban_interval_tree | 72,000 | Overlapping(?) | ? | No Set Ops | interval tree | Sets/Maps |
Range-collections | 44,974 | Disjoint | Ord | Set Ops | SmallVec | Sets/Maps |
range-set | 19,526 | Disjoint | Ord? | No Set Ops | SmallVec | Only Sets |
sorted-iter | 19,183 | No | Ord | Set Ops | n/a | Sets/Maps |
ranges | 14,964 | Disjoint | 'Domain' | Set Ops | Vec | Only Sets |
unbounded-interval-tree | 3,780 | Overlapping | Ord | No Set Ops | Red Black | Only Sets |
ranged_set | 2,116 | Disjoint | Ord | No Set Ops | ? | Only Sets |
btree-range-map | 1,897 | Yes | Measure | No Set Ops | BTreeMap | Only Sets |
nonoverlapping_interval_tree | <1000 | Yes | ? | No Set Ops | BTreeMap | Only Sets |
segmap | <1000 | Yes | Ord | not tested | BTreeMap | Sets/Maps |
interval-map | <1000 | Yes | Ord | No Set Ops | sorted vec | Sets/Maps |
range_bounds_map | <1000 | Yes | Ord | No Set Ops | ? | Sets/Maps |
int_range_set | <1000 | Yes | u64 | No Set Ops | tinyvec | Only Sets |
The # of downloads as of March 2023
* The # of downloads as of June 30, 2023
I started by evaluating:
BTreeSet
andHashSet
from the standard libraryrange_map
, the most popular crate that works with ranges in a treeRange-collections
andrange-set
, the most popular crates that store ranges in a vector
I later added:
roaring
, a "compressed bitset" library with over a millions Rust downloads and versions in many other languages. It stores u32 values in 65K chunks of 65K values. Each chunk is represented as either a vector of integers, a vector of ranges, or a bitmap.
The range_map
, Range-collections
, range-set
, and roaring
crates store disjoint ranges. I eliminated crates that store overlapping ranges, a different data structure (iset
, theban_interval_tree
, and unbounded-interval-tree
).
Disjoint ranges can be stored in a tree or a vector. With a tree, we expect inserts to be much faster than with a vector, O(log n) vs O(n). Benchmark ingest_clumps_easy
below showed this to be true. Because I care about such inserts, after that benchmark, I removed vector-based crates from consideration except for roaring
.
Finally, I looked for crates that supported set operations (for example, union, intersection, set difference). Only roaring
of the remaining crates offered tested set operations. (The inspirational sorted-iter
also does, but it is designed to work on sorted values, not ranges, and so is not included.)
If I misunderstood any of the crates, please let me know. If you'd like to benchmark a crate, the benchmarking code is in the benches
directory of this repository.
These benchmarks allow us to understand the range-set-blaze::RangeSetBlaze
data structure and to compare it to similar data structures from other crates.
- Measure: intake speed
- Candidates:
HashSet
,BTreeSet
,Roaring
,RangeSetBlaze
- Vary: n from 1 to 10,000, number of random integers
- Details: Select n integers randomly and uniformly from the range 0..=999 (with replacement).
RangeSetBlaze
is consistently about 2.5 times slower than HashSet
. On small sets, Roaring
and BTreeSet
are in the middle, but BTreeSet
gets almost as slow as RangeSetBlaze
as the sets grow.
HashSet
, not RangeSetBlaze
, is a good choice for ingesting sets of non-clumpy integers. However, RangeSetBlaze
is not catastrophically bad; it is just 2.5 times worse.
See benchmark 'worst_op_blaze', near the end, for a similar comparison of set operations on uniform data.
- Measure: integer intake speed
- Candidates:
HashSet
,BTreeSet
,Roaring
,RangeSetBlaze
- Vary: average clump size from 1 (no clumps) to 100K (ten big clumps)
- Details: We generate 1M integers with clumps. We ingest the integers one at a time. Each clump has size chosen uniformly random from roughly 1 to double average clump size. (The integer clumps are positioned random uniform, with-replacement, in a span from 0 to roughly 10M. The exact span is sized so that the union of the 1M integers will cover about 10% of the span. In other words, a given integer in the span will have a 10% chance of being in one of the 1M integers generated.)
As before, with no clumps, RangeSetBlaze
is more than 2.5 times slower than HashSet
. Somewhere around clump size 3, RangeSetBlaze
becomes the best performer. As the average clump size goes past 100, RangeSetBlaze
is a steady 30 times faster than HashSet
, 15 times faster than BTreeSet
, and roughly 10 to 30 times faster than Roaring
.
The nightly-only RangeSetBlaze::from_slice
is even faster, as the clump size rises
it is more than 100 times faster than the alternatives.
If we are allowed to input the clumps as ranges (instead of as individual integers), then when the average clump size is 1000 RangeSetBlaze
is 640
times faster than HashSet
and BTreeSet
and more than 20 times faster than Roaring
.
Range-based methods such as RangeSetBlaze
and Roaring
are a great choice for clumpy integers. When the input is given as ranges, they are the only sensible choice.
- Measure: integer intake speed
- Candidates:
base
+ rangemap, - Vary: average clump size from 1 (no clumps) to 100K (ten big clumps)
- Details: As with
base
.
We give each crate the clumps as individual integers.
rangemap
is typically three times slower than HashSet
and 75 times slower than RangeSetBlaze
. The nightly RangeSetBlaze::from_slice
is even faster by an order of magnitude. However ...
RangeSetBlaze
batches its integer input by noticing when consecutive integers fit in a clump. This batching is not implemented in rangemap
but could easily be added to it or any other range-based crate.
Roaring
is 10 to 30 times slower than RangeSetBlaze
. I don't know if Roaring
exploits consecutive integers. If not, it could.
- Measure: range intake speed
- Candidates: RangeSetBlaze + rangemap + Roaring
- Vary: average clump size from 1 (no clumps) to 100K (ten big clumps)
- Details: As with
base
.
We give each crate the clumps as ranges (instead of as individual integers).
Over the clump sizes, RangeSetBlaze
averges about 3 times faster than rangemap
and 10 times faster than Roaring
, However ...
RangeSetBlaze
batches range inputs by sorting them and then merging adjacent ranges. This batching is likely not implemented in rangemap
or Roaring
but could easily be added to it or any other range-based crate.
- Measure: range intake speed
- Candidates: Tree based (RangeSetBlaze rangemap), Vector based (
range_collections
,range_set
), Compressed Bitsets (Roaring
) - Vary: average clump size from 1 (100K ranges) to 10 (10K ranges)
- Details: We generate 100K integers with clumps (down from 1M)
We give each crate the clumps as ranges (instead of as individual integers).
The fastest vector-based method is 14 times slower than the slowest tree-based method. It is 50 times slower than RangeSetBlaze
. This is expected because vector-based methods are not designed for a large numbers of inserts.
The hybrid method, Roaring
, does better than any method except RangeSetBlaze
.
- Measure: adding ranges to an existing set
- Candidates: RangeSetBlaze, rangemap, Roaring
- Vary: Number of clumps in the second set, from 1 to about 90K.
- Details: We first create two clump iterators, each with the desired number of clumps. Their integer span is 0..=99_999_999. Each clump iterator is designed to cover about 10% of this span. We, next, turn these two iterators into two sets. The first set is made from 1000 clumps. Finally, we measure the time it takes to add the second set to the first set.
RangeSetBlaze
uses a hybrid algorithm for "union". When adding a few ranges, it adds them one at a time. When adding many ranges, it
merges the two sets of ranges by iterating over them in sorted order and merging.
When adding one clump to the first set, RangeSetBlaze
is about 30% faster than rangemap
and 38 times faster than Roaring
.
As the number-of-clumps-to-add grows, RangeSetBlaze
automatically switches algorithms. This allows it to be 6 times faster than the rangemap
. Roaring
and RangeSetBlaze
use very similar union
algorithms when the number of clumps is large and get similar results.
Over the whole range of clumpiness, RangeSetBlaze
is faster because it uses a hybrid algorithm.
Benchmark #7a: 'every_op_blaze': Compare RangeSetBlaze
's set operations to each other on clumpy data
- Measure: set operation speed
- Candidates: union, intersection, difference, symmetric_difference, complement
- Vary: number of ranges in the set, from 1 to about 50K.
- Details: We create two clump iterators, each with the desired number of clumps and a coverage of 0.5. Their span is 0..=99_999_999. We, next, turn these two iterators into two sets. Finally, we measure the time it takes to operate on the two sets.
Complement (which works on just once set) is twice as fast as union, intersection, and difference. Symmetric difference is 2.9 times slower.
Under the covers, all the operations are implemented in terms of complement and union. The speed of each operation is a reflection of its complexity in terms of complement and union.
- Set up same as in #7a
Intersection is much faster than union. Complement is slowest because it is not defined by Roaring
but can be defined by the user as Universe - a_set
.
- Set up same as in #7a
When the number of ranges (or clumps) is very small, RangeSetBlaze
operates on the data 1000's of
times faster than Roaring
. As the number of clumps goes into the 100's and 1000's, it is still 15 to 200 times faster. When the number of ranges get even larger, it is slightly faster.
The plot shows the results for intersection, Roaring
's fastest operator on this data.
See benchmark 'worst_op_blaze', near the end, for a similar comparison of set operations on uniform data.
- Measure: intersection speed
- Candidates: 2-at-a-time intersection, multiway intersection (static and dynamic)
- Vary: number of sets, from 2 to 100.
- Details: We create n iterators. Each iterator generates 1,000 clumps. The iterators are designed such that the coverage of the final intersection is about 25%. The span of integers in the clumps is 0..=99_999_999. We turn the n iterators into n sets. Finally, we measure the time it takes to operate on the n sets.
On two sets, all methods are similar but beyond that two-at-a-time gets slower and slower. For 100 sets, it must create about 100 intermediate sets and is about 14 times slower than multiway.
Dynamic multiway is not used by RangeSetBlaze
but is sometimes needed by SortedDisjoint
iterators
(also available from the range-set-blaze
crate). It is 5% to 10% slower than static multiway.
- Measure: set intersection speed
- Candidates:
BTreeSet
,HashSet
,Roaring
,RangeSetBlaze
- Vary: n from 1 to 1M, number of random integers
- Details: Select n integers randomly and uniformly from the range 0..100,000 (with replacement). Create 20 pairs of set at each length.
Over almost the whole range Roaring
is best. Roughly 10 times better than RangeSetBlaze
when the number of intergers is less than 10,000. As the number of integers increases beyond 10,000 RangeSetBlaze
, BTreeSet
, and HashSet
continue to get slower. Roaring
, on the other hand, gets faster;
presumably as it switches to bitmaps.
Roaring
is a great choice when doing operations on u64 sets that may or may not be clumpy.
All four candidates offer similar interfaces. If you're not sure which is best for your application, you can easily swap between them and see.