-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
querier: Execute Selects concurrently per query #2657
Conversation
ae8a8b5
to
2e06c69
Compare
5242e45
to
031b017
Compare
c01bc9b
to
c566595
Compare
d6eeef0
to
3439ad6
Compare
d34cbbc
to
2e0a6cd
Compare
Just a flaky test. Needs a re-run. |
A test re-run would be awesome. |
done, reloader is really flaky nowadays |
13f4ed1
to
c611490
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one question, other than that LGTM
@@ -67,6 +67,9 @@ func registerQuery(m map[string]setupFunc, app *kingpin.Application) { | |||
maxConcurrentQueries := cmd.Flag("query.max-concurrent", "Maximum number of queries processed concurrently by query node."). | |||
Default("20").Int() | |||
|
|||
maxConcurrentSelects := cmd.Flag("query.max-concurrent-select", "Maximum number of select requests made concurrently per a query."). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default 4 feels like a bit arbitrary? maybe default to 1? as current default of maxConcurentQueries is 20, this might significantly increase respurce usage?
Maybe we can read cpu limits etc and autoconfigure those values based on available cpu / or cpu assigned to cgroups?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now 1 and experimenting on our prods to figure out better default is the way to go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's an arbitrary number. I've actually left a comment on the PR, probably got removed between force pushes :D That's actually a point I want to discuss.
Autoconfigure sounds good, we can try that.
However, I also suspect this will be I/O bounded and having 20x4 at worst case shouldn't hurt that much. That being said, maybe we can try to craft a benchmark to find the magic number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks solid, LGTM! 💪
Just I wish we had some micro benchmarks for this path.
I started some work on this here #2305 but actually for proxy part.
What we need here is some benchmark for Multiple selects on large dataset returned by underlying Store API.
If you are bored @kakkoyun or @krasi-georgiev this is some work that has to be done at some point, in separate PR to this I think (:
Signed-off-by: Kemal Akkoyun <[email protected]>
Signed-off-by: Kemal Akkoyun <[email protected]>
Signed-off-by: Kemal Akkoyun <[email protected]>
Signed-off-by: Kemal Akkoyun <[email protected]>
c611490
to
0c88266
Compare
Signed-off-by: Kemal Akkoyun <[email protected]>
0c88266
to
1c93494
Compare
Enable Querier to dissect a query into pieces and execute concurrently.
xref: prometheus/prometheus#7251
depends: #2748
Signed-off-by: Kemal Akkoyun [email protected]
Changes
Verification
maka test-local
make test-e2e
Tracing