Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enabled index-header under experimental flag. #1986

Merged
merged 1 commit into from
Jan 10, 2020

Conversation

bwplotka
Copy link
Member

@bwplotka bwplotka commented Jan 10, 2020

Depends on: #1952

Signed-off-by: Bartlomiej Plotka [email protected]

Enabled it also on all our tests.

Depends on: #1952

Signed-off-by: Bartlomiej Plotka <[email protected]>
@bwplotka bwplotka force-pushed the build-index-header-v3 branch from 74782f4 to 7ad49bf Compare January 10, 2020 19:47
Copy link
Member

@squat squat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to update the doc? Otherwise looks 💯

@bwplotka
Copy link
Member Author

FYI @squat This is chained PR, chained against #1952 which has to be reviewed first to be merged (:

I think for the experimental flag it's fine to not have CHANGELOG. Doc was added to #1952, let me know there if something is missing...

@squat
Copy link
Member

squat commented Jan 10, 2020

Yes but I mean the generated document for the cli flags. That has to be merged simultaneously with this change, no?

@bwplotka
Copy link
Member Author

.. or after.

It's hidden flag, so CI is not crying for lack of generated docs.

@bwplotka
Copy link
Member Author

But still, some integration tests are failing, need to investigate after holidays (in week time) (:

Anyway, the #1952 should be ready for review.

@bwplotka
Copy link
Member Author

Actually let's merge one into another, will be easier to merge and see integration tests results.

@bwplotka bwplotka merged commit fb170a5 into build-index-header-v2 Jan 10, 2020
@bwplotka bwplotka deleted the build-index-header-v3 branch January 10, 2020 20:26
bwplotka added a commit that referenced this pull request Jan 11, 2020
Enabled it also on all our tests.

Depends on: #1952

Signed-off-by: Bartlomiej Plotka <[email protected]>
bwplotka added a commit that referenced this pull request Jan 11, 2020
Enabled it also on all our tests.

Depends on: #1952

Signed-off-by: Bartlomiej Plotka <[email protected]>
bwplotka added a commit that referenced this pull request Jan 20, 2020
Enabled it also on all our tests.

Depends on: #1952

Signed-off-by: Bartlomiej Plotka <[email protected]>
bwplotka added a commit that referenced this pull request Jan 21, 2020
* Added binary index header implementation with benchmarks.

This PR adds index-header implementation based on [this design](https://thanos.io/proposals/201912_thanos_binary_index_header.md/)

It adds a separate indexheader.Binary* structs and method allowing to build and read index-header in binary format.

Size difference:

10k series for my autogenerated data it's 2.1x

-rw-r--r-- 1 bwplotka bwplotka 6.1M Jan 10 13:20 index
-rw-r--r-- 1 bwplotka bwplotka  23K Jan 10 13:20 index.cache.json
-rw-r--r-- 1 bwplotka bwplotka 9.2K Jan 10 13:20 index-header

For realistic block 8mln series, also similar gain.

-rw-r--r-- 1 bwplotka bwplotka 1.9G Jan 10 13:29 index
-rw-r--r-- 1 bwplotka bwplotka 287M Jan 10 13:29 index.cache.json
-rw-r--r-- 1 bwplotka bwplotka 122M Jan 10 13:29 index-header

NOTE: Size is smaller, but it's not what we are trying to optimize for. Nevertheless
PostingOffsets and Symbols takes significant amount of bytes. The only downsides of size
is the fact that to create such index-header we have to fetch those two parts ~60MB each from object storage.
Idea for improvement if that will become a problem: Cache only 32th of the posting ranges and fetch gaps between on demand
on query time (with some cache).

Real time latencies for creation and loading (without network traffic):

For 10k block it's similar for both (ms/micros), for 8mln we can spot the difference:

index-header:

* write 134.197732ms
* read 415.971774ms

index-cache.json:

* write 6.712496338s
* read 6.112222132s

Before comparing I changed names to correlate tests:

BenchmarkJSONReader-12-> BenchmarkRead-12 old
BenchmarkBinaryReader-12 -> BenchmarkRead-12 new
BenchmarkJSONWrite-12 -> BenchmarkWrite-12 old
BenchmarkBinaryWrite-12  -> BenchmarkWrite-12 new

benchmark             old ns/op     new ns/op     delta
BenchmarkRead-12      591780        66613         -88.74%
BenchmarkWrite-12     2458454       6532651       +165.72%

benchmark             old allocs     new allocs     delta
BenchmarkRead-12      2306           629            -72.72%
BenchmarkWrite-12     1995           64             -96.79%

benchmark             old bytes     new bytes     delta
BenchmarkRead-12      150904        32976         -78.15%
BenchmarkWrite-12     161501        73412         -54.54%

CPU time for smaller index file is interesting. Value is low anyway. Might be
something to follow up.

benchmark             old ns/op      new ns/op     delta
BenchmarkRead-12      7026290474     552913402     -92.13%
BenchmarkWrite-12     6480769814     276441977     -95.73%

benchmark             old allocs     new allocs     delta
BenchmarkRead-12      20100014       5501312        -72.63%
BenchmarkWrite-12     18263356       64             -100.00%

benchmark             old bytes      new bytes     delta
BenchmarkRead-12      1873789526     406021516     -78.33%
BenchmarkWrite-12     2385193317     74187         -100.00%

Signed-off-by: Bartlomiej Plotka <[email protected]>

* Enabled index-header under experimental flag. (#1986)

Enabled it also on all our tests.

Depends on: #1952

Signed-off-by: Bartlomiej Plotka <[email protected]>

* Addressed comments.

Signed-off-by: Bartlomiej Plotka <[email protected]>

* Fixed small bug in resize.

Signed-off-by: Bartlomiej Plotka <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants