Shard ingester queries. #3852

cyriltovena · 2021-06-14T15:45:02Z

This is still experimental, but it already yields a 2x to 6x speedup for short-period queries.

I'm still playing with it but I want to share how I do it early.

Signed-off-by: Cyril Tovena <[email protected]>

cyriltovena · 2021-06-15T10:13:13Z

This is promising: I'm getting exceptional results in our biggest cluster.

Signed-off-by: Cyril Tovena <[email protected]>

This PR introduces sharding for series queries. It does not check the boundaries of the ingesters' data, since I'm assuming grafana#3852 will be merged first. Signed-off-by: Cyril Tovena <[email protected]>

owen-d

I've added some suggestions which I tested locally but didn't want to push to your branch. They are based off my fork of this PR.

This does two things:

* Fixes the hashing calculations, notably by ensuring that the internal hash space stays consistent. For example, with 2 buckets the first bucket (0) covers the first 50% of the hash ring, so moving that same space into 4 buckets should map (0_of_2) -> (0_of_4, 1_of_4) instead of (0_of_2) -> (0_of_4, 2_of_4).
* Adds logic for mapping any schema sharding factor into the relevant superset [shard] of any inverted index sharding factor. Notably, this would allow us to decouple the two. However, because the shard factors would no longer need to be mutually divisible, we'd need to amend Lookup, LabelNames, and LabelValues to also filter the returned values and drop any that do not belong to the desired schema shard. Because this code is currently protected by the validateShard function, we can safely defer this decision for now.
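
To make the two points above concrete, here is a minimal Go sketch, not the code from this PR. It assumes shards partition a 64-bit hash space into contiguous, equal-width ranges; `bucketFor` and `coveringShards` are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bucketFor (hypothetical helper) places a 64-bit hash into one of `factor`
// contiguous, equal-width ranges: bucket = floor(hash * factor / 2^64).
func bucketFor(hash uint64, factor uint32) uint32 {
	hi, _ := bits.Mul64(hash, uint64(factor)) // high 64 bits of the 128-bit product
	return uint32(hi)
}

// coveringShards (hypothetical helper) lists the index shards, out of
// indexFactor, whose ranges overlap schema shard `shard` of `of`.
// When indexFactor is a multiple of `of` the result is an exact split.
// Otherwise it is only a superset, and results would still need to be
// filtered by the schema shard.
func coveringShards(shard, of, indexFactor uint32) []uint32 {
	lo := uint64(shard) * uint64(indexFactor) / uint64(of)                    // floor, inclusive
	hi := (uint64(shard+1)*uint64(indexFactor) + uint64(of) - 1) / uint64(of) // ceil, exclusive
	out := make([]uint32, 0, hi-lo)
	for i := lo; i < hi; i++ {
		out = append(out, uint32(i))
	}
	return out
}

func main() {
	fmt.Println(coveringShards(0, 2, 4)) // [0 1]
	fmt.Println(coveringShards(1, 2, 4)) // [2 3]
	fmt.Println(coveringShards(1, 3, 4)) // [1 2], a superset: factors not divisible
}
```

With range-based buckets, a hash that falls in 0_of_2 always falls in 0_of_4 or 1_of_4, which is the consistency property described above. A modulo-based assignment would instead spread 0_of_2 across 0_of_4 and 2_of_4.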

One thing which I didn't include but expect we'll need is to align the hash functions used in the Cortex Index with the one used here. Notably, this is not exposed from cortex and is currently hardcoded into the Entries implementations. Since the chunk store is deprecated at this point, I expect we can copy that as it's not subject to change. The reason why we'll need to agree on a specific hash function here is because the querier queries both the store and the ingester for a particular shard. Our queries will quickly prove incorrect when series belong to different shards on each of these query paths.
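
The concern about mismatched hash functions can be illustrated with a small Go sketch. It is only an illustration: FNV-64a and CRC-32 are stand-ins chosen from the standard library, not the hashes used by the Cortex index or the ingester, and `shardFunc`, `fnvShard`, `crcShard`, and `misplaced` are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"hash/fnv"
)

// shardFunc maps a serialized label set to a shard in [0, factor).
type shardFunc func(labels string, factor uint32) uint32

// fnvShard and crcShard are two stand-in hash choices for the sketch.
func fnvShard(labels string, factor uint32) uint32 {
	h := fnv.New64a()
	h.Write([]byte(labels)) // hash.Hash.Write never returns an error
	return uint32(h.Sum64() % uint64(factor))
}

func crcShard(labels string, factor uint32) uint32 {
	return crc32.ChecksumIEEE([]byte(labels)) % factor
}

// misplaced counts the series that the two query paths would assign to
// different shards. Any such series would be missing from, or duplicated in,
// the merged result for a given shard.
func misplaced(series []string, factor uint32, ingester, store shardFunc) int {
	n := 0
	for _, s := range series {
		if ingester(s, factor) != store(s, factor) {
			n++
		}
	}
	return n
}

func main() {
	var series []string
	for i := 0; i < 1000; i++ {
		series = append(series, fmt.Sprintf(`{app="api", pod="api-%d"}`, i))
	}
	// Same function on both paths: no series can be misplaced.
	fmt.Println("same hash on both paths:", misplaced(series, 16, fnvShard, fnvShard))
	// Different functions: series may land in different shards on each path.
	fmt.Println("different hashes:", misplaced(series, 16, fnvShard, crcShard))
}
```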

pkg/ingester/index/index_test.go

pkg/ingester/index/index.go

cyriltovena · 2021-06-15T18:29:46Z

> Our queries will quickly prove incorrect when series belong to different shards on each of these query paths.

Not sure about that.

cyriltovena · 2021-07-01T06:25:51Z

So I have an idea :) I'm going to change the sharding in the store to align it with the ingester. See you soon!

Co-authored-by: Owen Diehl <[email protected]>

Ultimately we should have a storage that relies on fingerprint, but that's harder to change. Signed-off-by: Cyril Tovena <[email protected]>

Signed-off-by: Cyril Tovena <[email protected]>

cyriltovena · 2021-07-01T09:06:07Z

@owen-d I think this is ready. I've aligned the shard computation in the ingester and the storage, which means they now use the same type of hash. I would have preferred to keep it based on the fingerprint, but that's hard to change in the storage.

Something to keep in mind when we redesign the storage.

Signed-off-by: Cyril Tovena <[email protected]>

owen-d

Couple nits, then LGTM! We'll need to eventually add a way to either enforce that schema configs are multiples of indexShards or find a way to refactor the inverted index to work with any arbitrary sharding factor.
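
One way the first option could look is a config-time check along these lines. This is a hedged sketch rather than the PR's validateShard function: `checkFactors` is a hypothetical name, and because the required direction of divisibility is an assumption here, the check accepts any pair where one factor evenly divides the other.

```go
package main

import (
	"errors"
	"fmt"
)

// errIncompatibleFactors is a hypothetical sentinel error for this sketch.
var errIncompatibleFactors = errors.New("shard factors are not mutually divisible")

// checkFactors rejects factor pairs where neither value divides the other.
// Which side must be the multiple is left open on purpose.
func checkFactors(schemaFactor, indexShards uint32) error {
	if schemaFactor == 0 || indexShards == 0 {
		return fmt.Errorf("shard factors must be positive: schema=%d index=%d", schemaFactor, indexShards)
	}
	if schemaFactor%indexShards != 0 && indexShards%schemaFactor != 0 {
		return fmt.Errorf("%w: schema=%d index=%d", errIncompatibleFactors, schemaFactor, indexShards)
	}
	return nil
}

func main() {
	fmt.Println(checkFactors(16, 32)) // <nil>
	fmt.Println(checkFactors(3, 32))  // non-nil error wrapping errIncompatibleFactors
}
```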

pkg/ingester/index/index.go

pkg/ingester/index/index_test.go

pkg/ingester/instance.go

Co-authored-by: Owen Diehl <[email protected]>

cyriltovena · 2021-07-07T07:54:58Z

> Couple nits, then LGTM! We'll need to eventually add a way to either enforce that schema configs are multiples of indexShards or find a way to refactor the inverted index to work with any arbitrary sharding factor.

I don't want to invest too much in this yet. I'm not sure what the future of that code is.

Signed-off-by: Cyril Tovena <[email protected]>

* Shards Series API. This PR introduces sharding for series queries. It does not check the boundaries of the ingesters' data, since I'm assuming #3852 will be merged first. Signed-off-by: Cyril Tovena <[email protected]>
* Fixes tests sorting. Signed-off-by: Cyril Tovena <[email protected]>

Shard ingester queries.

70fab1c

This is still experimental, but it already yields a 2x to 6x speedup for short-period queries. I'm still playing with it but I want to share how I do it early. Signed-off-by: Cyril Tovena <[email protected]>

pull-request-size bot added the size/XL label Jun 14, 2021

Add notice of the code origin.

5943288

Signed-off-by: Cyril Tovena <[email protected]>

cyriltovena mentioned this pull request Jun 15, 2021

Shards Series API. #3856

Merged

owen-d reviewed Jun 15, 2021

pkg/ingester/index/index_test.go

pkg/ingester/index/index.go

cyriltovena and others added 4 commits July 1, 2021 08:26

Update pkg/ingester/index/index.go

5f0e245

Co-authored-by: Owen Diehl <[email protected]>

Update pkg/ingester/index/index_test.go

3d10b11

Co-authored-by: Owen Diehl <[email protected]>

Align shards from ingester and storage.

2dffd5a

Ultimately we should have a storage that relies on fingerprint, but that's harder to change. Signed-off-by: Cyril Tovena <[email protected]>

Remove comment

c8a76d1

Signed-off-by: Cyril Tovena <[email protected]>

cyriltovena marked this pull request as ready for review July 1, 2021 08:53

Merge remote-tracking branch 'upstream/main' into shard-ingester

e999e5a

cyriltovena added 3 commits July 1, 2021 13:53

Fixes delete index func

2340488

Signed-off-by: Cyril Tovena <[email protected]>

Test reverting.

71d9168

Signed-off-by: Cyril Tovena <[email protected]>

Fixes a bug causing a non-deterministic hash.

4a81a15

Signed-off-by: Cyril Tovena <[email protected]>

owen-d approved these changes Jul 6, 2021

cyriltovena and others added 4 commits July 7, 2021 09:51

Update pkg/ingester/index/index.go

1c172cb

Co-authored-by: Owen Diehl <[email protected]>

Update pkg/ingester/index/index.go

79ea3bc

Co-authored-by: Owen Diehl <[email protected]>

Update pkg/ingester/index/index.go

38c01ac

Co-authored-by: Owen Diehl <[email protected]>

Update pkg/ingester/index/index_test.go

d1817b5

Co-authored-by: Owen Diehl <[email protected]>

cyriltovena added 2 commits July 7, 2021 10:02

Fixes build.

757bb6a

Signed-off-by: Cyril Tovena <[email protected]>

got linted :(

95e6f79

Signed-off-by: Cyril Tovena <[email protected]>

cyriltovena merged commit 97912b6 into grafana:main Jul 7, 2021

james-callahan mentioned this pull request Aug 7, 2021

Upgrade to 2.3.0 BitGo/kustomize-loki#11

Closed
