-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Added binary index header implementation with benchmarks.
This PR adds index-header implementation based on [this design](https://thanos.io/proposals/201912_thanos_binary_index_header.md/) It adds a separate indexheader.Binary* structs and method allowing to build and read index-header in binary format. ## Stats Size difference: 10k series for my autogenerated data it's 2.1x -rw-r--r-- 1 bwplotka bwplotka 6.1M Jan 10 13:20 index -rw-r--r-- 1 bwplotka bwplotka 23K Jan 10 13:20 index.cache.json -rw-r--r-- 1 bwplotka bwplotka 9.2K Jan 10 13:20 index-header For realistic block 8mln series, also similar gain. -rw-r--r-- 1 bwplotka bwplotka 1.9G Jan 10 13:29 index -rw-r--r-- 1 bwplotka bwplotka 287M Jan 10 13:29 index.cache.json -rw-r--r-- 1 bwplotka bwplotka 122M Jan 10 13:29 index-header NOTE: Size is smaller, but it's not what we are trying to optimize for. Nevertheless PostingOffsets and Symbols takes significant amount of bytes. The only downsides of size is the fact that to create such index-header we have to fetch those two parts ~60MB each from object storage. Idea for improvement if that will become a problem: Cache only 32th of the posting ranges and fetch gaps between on demand on query time (with some cache). Real time latencies for creation and loading (without network traffic): For 10k block it's similar for both (ms/micros), for 8mln we can spot the difference: index-header: * write 134.197732ms * read 415.971774ms index-cache.json: * write 6.712496338s * read 6.112222132s ## Go Benchmarks: Before comparing I changed names to correlate tests: BenchmarkJSONReader-12-> BenchmarkRead-12 old BenchmarkBinaryReader-12 -> BenchmarkRead-12 new BenchmarkJSONWrite-12 -> BenchmarkWrite-12 old BenchmarkBinaryWrite-12 -> BenchmarkWrite-12 new ### 10k series block: benchmark old ns/op new ns/op delta BenchmarkRead-12 591780 66613 -88.74% BenchmarkWrite-12 2458454 6532651 +165.72% benchmark old allocs new allocs delta BenchmarkRead-12 2306 629 -72.72% BenchmarkWrite-12 1995 64 -96.79% benchmark old bytes new bytes delta BenchmarkRead-12 150904 32976 -78.15% BenchmarkWrite-12 161501 73412 -54.54% CPU time for smaller index file is interesting. Value is low anyway. Might be something to follow up. ### 8mln series (index takes 2GB so not committed to git): benchmark old ns/op new ns/op delta BenchmarkRead-12 7026290474 552913402 -92.13% BenchmarkWrite-12 6480769814 276441977 -95.73% benchmark old allocs new allocs delta BenchmarkRead-12 20100014 5501312 -72.63% BenchmarkWrite-12 18263356 64 -100.00% benchmark old bytes new bytes delta BenchmarkRead-12 1873789526 406021516 -78.33% BenchmarkWrite-12 2385193317 74187 -100.00% Signed-off-by: Bartlomiej Plotka <[email protected]>
- Loading branch information
Showing
17 changed files
with
1,376 additions
and
106 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.