Investigate and document the most efficient write pattern for LevelDB #13
LevelDB deletions are not applied to disk right away. A delete writes a tombstone, and the deleted records are only physically removed during compaction.
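A minimal goleveldb sketch of this behavior (the path and keys are made up; `CompactRange` is used here only to force the compaction that would otherwise happen in the background):

```go
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

func main() {
	db, err := leveldb.OpenFile("/tmp/tombstone-demo", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	key := []byte("some-key")
	if err := db.Put(key, []byte("some-value"), nil); err != nil {
		log.Fatal(err)
	}

	// Delete does not erase the record; it appends a tombstone that
	// shadows the old value until compaction merges the SSTables.
	if err := db.Delete(key, nil); err != nil {
		log.Fatal(err)
	}

	// Force a full-range compaction; only now are the shadowed
	// record and its tombstone physically dropped.
	if err := db.CompactRange(util.Range{}); err != nil {
		log.Fatal(err)
	}
}
```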
Records are applied sequentially. Relevant background reading:
- LevelDB is based on an LSM-tree built out of SSTables
- Constructing and Maintaining SSTables
- Leveled Compaction
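The sequential application can be observed directly through goleveldb's `Batch.Replay`, which visits batch records in the order they were appended; `db.Write` applies them to the database the same way. A small sketch (the keys are arbitrary):

```go
package main

import (
	"fmt"

	"github.com/syndtr/goleveldb/leveldb"
)

// printer implements leveldb.BatchReplay and just prints each record,
// demonstrating that the batch stores operations in insertion order.
type printer struct{}

func (printer) Put(key, value []byte) { fmt.Printf("put    %s=%s\n", key, value) }
func (printer) Delete(key []byte)     { fmt.Printf("delete %s\n", key) }

func main() {
	b := new(leveldb.Batch)
	b.Put([]byte("a"), []byte("1"))
	b.Delete([]byte("b"))
	b.Put([]byte("c"), []byte("2"))

	// Replay visits the records sequentially, in exactly the order
	// they were appended.
	if err := b.Replay(printer{}); err != nil {
		fmt.Println(err)
	}
}
```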
The memtable should already be sorted. Therefore, we should check whether it is necessary to sort the fast node additions and removals in the IAVL tree before writing them to the db.

Testing: benchmarks were run with sorting and without sorting.
It looks like the sorting in IAVL does help. A potential explanation: the default maximum size of the memtable is 4 MB (4 * 1024 * 1024 = 4,194,304 bytes), while the benchmark wrote 100,000 key-value pairs of 56 bytes each (5,600,000 bytes). LevelDB therefore has to flush one SSTable to disk once the memtable fills up; after the SSTable is written, a new memtable is created with the remaining data. Each of the two is internally sorted, but they are not sorted relative to each other. We can conclude that sorting is indeed necessary for good performance with a set of key-value pairs large enough to outgrow the default memtable size.
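For reference, a hedged sketch of what such a benchmark could look like with goleveldb (the package name, key shape, and harness are illustrative assumptions, not the exact code used above; goleveldb's default write buffer is 4 MiB, so 100,000 56-byte pairs overflow one memtable):

```go
package iavlbench

import (
	"encoding/binary"
	"math/rand"
	"sort"
	"testing"

	"github.com/syndtr/goleveldb/leveldb"
)

// genPairs builds n pseudo-random key-value pairs of 56 bytes each
// (24-byte key + 32-byte value), matching the workload described above.
func genPairs(n int) [][2][]byte {
	rnd := rand.New(rand.NewSource(1))
	pairs := make([][2][]byte, n)
	for i := range pairs {
		k := make([]byte, 24)
		binary.BigEndian.PutUint64(k, rnd.Uint64())
		v := make([]byte, 32)
		rnd.Read(v)
		pairs[i] = [2][]byte{k, v}
	}
	return pairs
}

// writeBatch optionally sorts the pairs by key, then commits them all
// in a single goleveldb batch against a fresh database.
func writeBatch(b *testing.B, pairs [][2][]byte, sorted bool) {
	db, err := leveldb.OpenFile(b.TempDir(), nil)
	if err != nil {
		b.Fatal(err)
	}
	defer db.Close()

	if sorted {
		sort.Slice(pairs, func(i, j int) bool {
			return string(pairs[i][0]) < string(pairs[j][0])
		})
	}
	batch := new(leveldb.Batch)
	for _, p := range pairs {
		batch.Put(p[0], p[1])
	}
	if err := db.Write(batch, nil); err != nil {
		b.Fatal(err)
	}
}

func BenchmarkWriteSorted(b *testing.B) {
	pairs := genPairs(100_000)
	for i := 0; i < b.N; i++ {
		writeBatch(b, pairs, true)
	}
}

func BenchmarkWriteUnsorted(b *testing.B) {
	pairs := genPairs(100_000)
	for i := 0; i < b.N; i++ {
		writeBatch(b, pairs, false)
	}
}
```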
A benchmark of writing removals first and only then additions only made performance worse than before.
Combining additions and removals and writing them at once also made performance worse. A sketch of both variants follows below.
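For concreteness, the two variants benchmarked above might look roughly like this sketch; the maps are hypothetical stand-ins for the unsaved fast-node bookkeeping in mutable_tree.go, and the exact merge logic of the "combined" variant is an assumption:

```go
package writepatterns

import (
	"sort"

	"github.com/syndtr/goleveldb/leveldb"
)

// op is a merged addition-or-removal record used by the combined variant.
type op struct {
	key      string
	value    []byte
	isDelete bool
}

// removalsFirst appends every delete to the batch before any put (variant 1).
func removalsFirst(batch *leveldb.Batch, adds map[string][]byte, dels map[string]struct{}) {
	for k := range dels {
		batch.Delete([]byte(k))
	}
	for k, v := range adds {
		batch.Put([]byte(k), v)
	}
}

// combined merges additions and removals into one key-sorted pass (variant 2).
func combined(batch *leveldb.Batch, adds map[string][]byte, dels map[string]struct{}) {
	ops := make([]op, 0, len(adds)+len(dels))
	for k, v := range adds {
		ops = append(ops, op{key: k, value: v})
	}
	for k := range dels {
		ops = append(ops, op{key: k, isDelete: true})
	}
	sort.Slice(ops, func(i, j int) bool { return ops[i].key < ops[j].key })
	for _, o := range ops {
		if o.isDelete {
			batch.Delete([]byte(o.key))
		} else {
			batch.Put([]byte(o.key), o.value)
		}
	}
}
```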
Based on the investigation above, my guesses were incorrect. The current write pattern introduced in #5 is the most efficient of all the proposed solutions.
Background
In #5, we introduced unsaved additions and unsaved removals in mutable_tree.go. Since the fast index introduces additional overhead on writes (which is normal for indexes), we would like to ensure that writes are committed in the most optimal way.

Currently, writes to disk work in the following way (in SaveVersion):
1. Unsaved fast node additions are written to the goleveldb batch.
2. Unsaved fast node removals are written to the goleveldb batch.
3. The goleveldb batch gets written to disk.

A sketch of this flow is shown below.
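This is a simplified sketch of the current flow; the maps are hypothetical stand-ins for the unsaved fast-node bookkeeping, and the real logic lives in SaveVersion in mutable_tree.go:

```go
package writeflow

import "github.com/syndtr/goleveldb/leveldb"

// saveFastNodes sketches the current commit flow: additions first,
// then removals, then a single batch write to disk.
func saveFastNodes(db *leveldb.DB, additions map[string][]byte, removals map[string]struct{}) error {
	batch := new(leveldb.Batch)

	// 1. Unsaved fast node additions are appended to the goleveldb batch.
	for k, v := range additions {
		batch.Put([]byte(k), v)
	}

	// 2. Unsaved fast node removals are appended to the same batch.
	for k := range removals {
		batch.Delete([]byte(k))
	}

	// 3. The batch is committed to disk in one write.
	return db.Write(batch, nil)
}
```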
When we batch operations, goleveldb simply appends them to the batch. Then, on commit, the goleveldb batch applies them sequentially. Therefore, the above write pattern might not be the most efficient. We should investigate and document which write pattern would be the most efficient.

Potential alternatives:
- Sorting additions and removals before writing them to the batch
- Writing removals before additions
- Combining additions and removals and writing them in a single pass
Acceptance Criteria
- The most efficient write pattern for LevelDB is investigated and documented.