Implement lazy rehashing for hashtable.
Previously, whenever we triggered a resize via cuckoo_fast_double or cuckoo_expand_simple, we would use all available cores to parallelize the rehashing. This is fine when the hashtable is the only thing running on the machine, but otherwise it interferes with other processes on the system.

So instead, we implement a lazy-rehashing scheme: the resize itself just allocates an empty buckets container of the new size, and subsequent accesses into the table perform any necessary rehashing before touching the data they want.

It is implemented by storing a flag in the locks array indicating whether the buckets corresponding to each lock have been migrated yet. Whenever somebody needs to access data controlled by a lock, they first rehash that lock's buckets if that hasn't already been done. Once all locks have been rehashed, we can free the old buckets array.

Because we cannot always do lazy rehashing (data is not nothrow-move-constructible, explicit resize/rehash, making the table smaller, etc.), we leave in the old functionality, but make the number of threads configurable, defaulting to 0 extra threads. That way, if the user wants a particular operation to use more threads, they can set this explicitly on the table before doing the resize or lock_table.

Some benchmarks show that, across a variety of workloads and table types, there is a small overall drop in total throughput (around 2-7%) in a lot of categories. Expansion-heavy workloads, however, seem to be faster, possibly because the table allows other operations to run while rehashing is in progress. The drop seems greatest in the most fast-path-heavy workloads (pure read, int key, int val), where the few extra if statements probably have a significant performance impact.

Since the throughput drop is not very significant, and worst-case latency is probably a lot better, I think this is a change worth merging.
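
For illustration, here is a minimal sketch of the per-lock migration flag described above. It is not the code in this commit: it uses plain chained buckets instead of cuckoo hashing, int keys and values, and a fixed number of lock stripes, and every name in it (LazyTable, Stripe, kStripes, double_size, migrate_stripe, pending, old_freed) is hypothetical. It relies on the bucket count always being a multiple of the stripe count, so that an old bucket and the new buckets its entries move to are guarded by the same lock.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical sketch, not the real table: chained buckets, int -> int.
constexpr std::size_t kStripes = 4;  // power of two, divides the bucket count

class LazyTable {
  struct Stripe {
    std::mutex m;
    bool migrated = true;  // flag stored next to the lock; guarded by m
  };
  using Bucket = std::vector<std::pair<int, int>>;

  std::array<Stripe, kStripes> stripes_;
  std::vector<Bucket> old_, cur_;
  std::atomic<std::size_t> pending_{0};  // stripes still awaiting migration

  // Simple multiplicative mix (an assumption made for this sketch).
  static std::size_t hash(int k) {
    return static_cast<std::size_t>(k) * 2654435761u;
  }

  // Caller must hold stripes_[s].m. Moves this stripe's entries from the
  // old buckets into the new ones, then frees old_ once every stripe is done.
  void migrate_stripe(std::size_t s) {
    Stripe& st = stripes_[s];
    if (st.migrated) return;  // the extra branch paid on every access
    for (std::size_t b = s; b < old_.size(); b += kStripes) {
      for (auto& kv : old_[b])
        cur_[hash(kv.first) % cur_.size()].push_back(std::move(kv));
      old_[b].clear();
    }
    st.migrated = true;
    if (--pending_ == 0) std::vector<Bucket>().swap(old_);  // release memory
  }

 public:
  LazyTable() : cur_(kStripes) {}

  // The resize only allocates the new container; no entries move here.
  void double_size() {
    std::array<std::unique_lock<std::mutex>, kStripes> held;
    for (std::size_t s = 0; s < kStripes; ++s)
      held[s] = std::unique_lock<std::mutex>(stripes_[s].m);
    // Finish any earlier lazy resize before starting a new one.
    for (std::size_t s = 0; s < kStripes; ++s) migrate_stripe(s);
    std::vector<Bucket> bigger(cur_.size() * 2);
    old_ = std::move(cur_);
    cur_ = std::move(bigger);
    for (auto& st : stripes_) st.migrated = false;
    pending_ = kStripes;
  }

  void insert(int k, int v) {
    const std::size_t h = hash(k);
    const std::size_t s = h % kStripes;
    std::lock_guard<std::mutex> g(stripes_[s].m);
    migrate_stripe(s);  // rehash this lock's buckets first if needed
    Bucket& b = cur_[h % cur_.size()];
    for (auto& kv : b)
      if (kv.first == k) { kv.second = v; return; }
    b.emplace_back(k, v);
  }

  bool find(int k, int& out) {
    const std::size_t h = hash(k);
    const std::size_t s = h % kStripes;
    std::lock_guard<std::mutex> g(stripes_[s].m);
    migrate_stripe(s);
    for (const auto& kv : cur_[h % cur_.size()])
      if (kv.first == k) { out = kv.second; return true; }
    return false;
  }

  std::size_t pending() const { return pending_.load(); }
  bool old_freed() const { return old_.empty(); }
};
```

In this sketch, double_size() returns without moving any entries, and the first insert or find that takes a given lock afterwards pays for migrating that lock's buckets. The early return in migrate_stripe is the kind of extra branch that the fast path now carries on every access.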