
Loki Query Frontend #1442

Merged
merged 21 commits into from
Jan 7, 2020


cyriltovena
Contributor

@cyriltovena cyriltovena commented Dec 19, 2019

After many refactoring PRs in Cortex, this PR introduces a query-frontend component in Loki.
The query-frontend sits in front of the queriers and dispatches work to them, which allows queries to be parallelized and sped up.

I have been running the frontend in dev for a month and it shows promising results. However, it definitely requires a bigger pool of queriers than Cortex does.

The current implementation specifically targets slow queries.

Metrics queries

As in Cortex, metrics queries are aligned, split into 4-hour intervals by default, and their results are cached. In fact, we are using the same set of middlewares. For some high-throughput streams a 4-hour block can still be too much, but I expect sharding parallelization (future work) to split each 4-hour block again into 16. In the meantime you can lower the interval.
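To make the align-then-split step concrete, here is a minimal sketch in Go. It is not the middleware code from this PR: `rangeQuery`, `alignToStep` and `splitByInterval` are hypothetical names, timestamps are assumed to be non-negative Unix milliseconds, and each sub-query is treated as a half-open range, which glosses over how the real middleware handles the step that falls on a boundary.

```go
package main

import "fmt"

const hourMs int64 = 60 * 60 * 1000

// rangeQuery is a hypothetical stand-in for a range-query request.
// Start and End are Unix timestamps in milliseconds, Step is a duration in
// milliseconds. Each query covers the half-open range [Start, End).
type rangeQuery struct {
	Start, End, Step int64
}

// alignToStep rounds Start and End down to a multiple of Step, so that
// repeated queries over a moving time range map onto the same sub-queries
// and can share cached results. Step must be positive.
func alignToStep(q rangeQuery) rangeQuery {
	q.Start = (q.Start / q.Step) * q.Step
	q.End = (q.End / q.Step) * q.Step
	return q
}

// splitByInterval cuts q into consecutive sub-queries so that no sub-query
// crosses a multiple of interval (4 hours by default in this PR).
func splitByInterval(q rangeQuery, interval int64) []rangeQuery {
	if interval <= 0 {
		return []rangeQuery{q} // nothing sensible to split by
	}
	var parts []rangeQuery
	for start := q.Start; start < q.End; {
		// Next interval boundary strictly after start.
		next := (start/interval + 1) * interval
		end := next
		if end > q.End {
			end = q.End
		}
		parts = append(parts, rangeQuery{Start: start, End: end, Step: q.Step})
		start = end
	}
	return parts
}

func main() {
	q := alignToStep(rangeQuery{Start: 3*hourMs + 1234, End: 13*hourMs + 5678, Step: 60_000})
	for _, p := range splitByInterval(q, 4*hourMs) {
		fmt.Printf("sub-query from %d to %d (step %d)\n", p.Start, p.End, p.Step)
	}
}
```

Because every sub-query ends on a fixed interval boundary, the same sub-query is produced no matter when the user asks, which is what makes caching by interval worthwhile.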

Log Regex queries

Log regex queries are also split into 4-hour intervals. However, unlike in Cortex, the intervals are run in sequence, and each block is split into 16 parts (configurable) that run in parallel. This lets us check after every 4-hour block whether we already have enough results before moving on to the next block.
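The control flow described here (blocks in sequence, parts of a block in parallel, stop once the limit is reached) can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `window`, `fetchFunc`, `subdivide` and `runBlocks` are hypothetical names, `fetch` stands in for a round trip to a querier, and the caller is assumed to pass the blocks in the order the query direction requires.

```go
package main

import (
	"fmt"
	"sync"
)

// window is a half-open time range [Start, End) in any consistent unit.
type window struct{ Start, End int64 }

// fetchFunc is a hypothetical stand-in for sending one sub-query to a
// querier and getting back the matching log lines.
type fetchFunc func(w window) []string

// subdivide splits w into at most n consecutive parts of roughly equal size.
func subdivide(w window, n int64) []window {
	if n < 1 {
		n = 1
	}
	size := (w.End - w.Start + n - 1) / n // ceiling division
	if size < 1 {
		size = 1
	}
	var parts []window
	for s := w.Start; s < w.End; s += size {
		e := s + size
		if e > w.End {
			e = w.End
		}
		parts = append(parts, window{Start: s, End: e})
	}
	return parts
}

// runBlocks processes blocks one after another. Within a block, the parts
// are fetched in parallel. After each block it checks whether limit lines
// have been collected and, if so, skips the remaining blocks.
// A limit of zero or less means "no limit" in this sketch.
func runBlocks(blocks []window, parallelism int64, limit int, fetch fetchFunc) []string {
	var out []string
	for _, b := range blocks {
		parts := subdivide(b, parallelism)
		results := make([][]string, len(parts))
		var wg sync.WaitGroup
		for i, p := range parts {
			wg.Add(1)
			go func(i int, p window) {
				defer wg.Done()
				results[i] = fetch(p) // each goroutine writes its own slot
			}(i, p)
		}
		wg.Wait()
		for _, r := range results {
			out = append(out, r...)
		}
		if limit > 0 && len(out) >= limit {
			return out[:limit] // enough results, later blocks are never queried
		}
	}
	return out
}

func main() {
	blocks := []window{{0, 4}, {4, 8}, {8, 12}} // e.g. three 4-hour blocks
	fetch := func(w window) []string {
		return []string{fmt.Sprintf("line from [%d, %d)", w.Start, w.End)}
	}
	for _, line := range runBlocks(blocks, 2, 3, fetch) {
		fmt.Println(line)
	}
}
```

The trade-off is that a query whose limit is satisfied by the first block costs only one block's worth of work, at the price of not overlapping the blocks in time.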

Anything else (non-regex log queries, labels, etc.) runs as before but still goes through the frontend. Apart from label name queries, I don't plan any work on these since they are already very fast (500~700ms).

The loki jsonnet library has been updated.

PS: I had to update to the latest Cortex, which brought some changes related to the ring and some vendoring issues.

Next steps:

  • Add some documentation.
  • Implement a log result cache (tricky, but it should really improve resource consumption).
  • Add parallelization via sharding. /cc @owen-d

/cc @slim-bean @tomwilkie @joe-elliott @rfratto

Here is a screenshot of a trace for a regex query on a high-throughput stream:
image

This is not yet part of the single binary. I still need to think about how we want to design this; it could be very nice to swap gRPC for direct querier library calls.

@cyriltovena cyriltovena force-pushed the frontend-cleaned branch 3 times, most recently from 53170ee to 66dda4f Compare December 19, 2019 13:22
Member

@owen-d owen-d left a comment


I have a few questions, but it's looking great!

pkg/loki/loki.go (resolved)
pkg/loki/modules.go (resolved)
pkg/querier/queryrange/codec.go (outdated, resolved)
pkg/querier/queryrange/codec.go (outdated, resolved)
pkg/querier/queryrange/roundtrip.go (resolved)
pkg/querier/queryrange/split_by_interval.go (outdated, resolved)
production/ksonnet/loki/config.libsonnet (outdated, resolved)
Contributor

@sandeepsukhani sandeepsukhani left a comment


It would be good to make the small change that Owen already pointed out; otherwise it LGTM.

pkg/querier/queryrange/codec.go (outdated, resolved)
cyriltovena and others added 17 commits January 6, 2020 16:17
Signed-off-by: Cyril Tovena <[email protected]>
* frontend codec merging optimizations

* codec benchmarks

* removes unused bounds code in queryrange ordering

* [wip] splitby uses channels instead of sub batching intervals

* splitBy channel limit test

* single allocation for merging entries from a single stream

* skip merging loki responses when limit is already hit

* removes checks for unlimited queries in queryrange

* removes splitByInterval{,.interval} spans

* removes interval_batch_size from jsonnet lib

* moves benchmark utils to own file

* renames markers -> entries

* priority queue comments
Member

@rfratto rfratto left a comment


Let's Get This Merged!

@cyriltovena
Contributor Author

dam windows

@cyriltovena cyriltovena merged commit 4153740 into grafana:master Jan 7, 2020
@cyriltovena cyriltovena deleted the frontend-cleaned branch January 7, 2020 14:33
cyriltovena pushed a commit to cyriltovena/loki that referenced this pull request Jun 11, 2021
Fix the version of protobuf and pin tools in the build image.