-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[receive] Export metrics about remote write requests per tenant #5424
Merged
bwplotka
merged 20 commits into
thanos-io:main
from
douglascamata:dcamata/thanos-receive-write-metrics
Jun 23, 2022
Merged
Changes from 18 commits
Commits
Show all changes
20 commits
Select commit
Hold shift + click to select a range
4e0a874
Add write metrics to Thanos Receive
douglascamata 81d03c3
Let the middleware count inflight HTTP requests
douglascamata 8878786
Update Receive write metrics type & definition
douglascamata 48d7ef0
Put option back in its place to avoid big diff
douglascamata d33957f
Fetch tenant from headers instead of context
douglascamata 33f4c6a
Delete unnecessary tenant parser middleware
douglascamata 3254efa
Refactor & reuse code for HTTP instrumentation
douglascamata 7f1cecb
Add missing copyright to some files
douglascamata 51b6d21
Add changelog entry for Receive & new HTTP metrics
douglascamata fe750d1
Merge branch 'main' of https://github.com/thanos-io/thanos into dcama…
douglascamata 643ffb9
Remove TODO added by accident
douglascamata 607c4e9
Make error handling code shorter
douglascamata 95a7a37
Make switch statement simpler
douglascamata 3b3592b
Remove method label from timeseries' metrics
douglascamata 2dd4592
Count samples of all series instead of each
douglascamata 0141ea9
Remove in-flight requests metric
douglascamata 493ede8
Change timeseries/samples metrics to histograms
douglascamata 26600c1
Merge branch 'main' of https://github.com/thanos-io/thanos into dcama…
douglascamata c4662b5
Fix Prometheus registry for histograms
douglascamata 77f7bf7
Fix comment in NewHandler functions
douglascamata File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
// Copyright (c) The Thanos Authors. | ||
// Licensed under the Apache License 2.0. | ||
|
||
package http | ||
|
||
import ( | ||
"net/http" | ||
|
||
"github.com/prometheus/client_golang/prometheus" | ||
) | ||
|
||
type tenantInstrumentationMiddleware struct { | ||
metrics *defaultMetrics | ||
tenantHeaderName string | ||
} | ||
|
||
// NewTenantInstrumentationMiddleware provides the same instrumentation as defaultInstrumentationMiddleware, | ||
// but with a tenant label fetched from the given tenantHeaderName header. | ||
// Passing nil as buckets uses the default buckets. | ||
func NewTenantInstrumentationMiddleware(tenantHeaderName string, reg prometheus.Registerer, buckets []float64) InstrumentationMiddleware { | ||
return &tenantInstrumentationMiddleware{ | ||
tenantHeaderName: tenantHeaderName, | ||
metrics: newDefaultMetrics(reg, buckets, []string{"tenant"}), | ||
} | ||
} | ||
|
||
// NewHandler wraps the given HTTP handler for instrumentation. It | ||
// registers five metric collectors (if not already done) and reports HTTP | ||
// metrics to the (newly or already) registered collectors: http_requests_total | ||
// (CounterVec), http_request_duration_seconds (Histogram), | ||
// http_request_size_bytes (Summary), http_response_size_bytes (Summary), | ||
// http_inflight_requests (Gauge). Each has a constant label named "handler" | ||
// with the provided handlerName as value. | ||
func (ins *tenantInstrumentationMiddleware) NewHandler(handlerName string, next http.Handler) http.HandlerFunc { | ||
tenantWrapper := func(w http.ResponseWriter, r *http.Request) { | ||
tenant := r.Header.Get(ins.tenantHeaderName) | ||
baseLabels := prometheus.Labels{"handler": handlerName, "tenant": tenant} | ||
handlerStack := httpInstrumentationHandler(baseLabels, ins.metrics, next) | ||
handlerStack.ServeHTTP(w, r) | ||
} | ||
return tenantWrapper | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
// Copyright (c) The Thanos Authors. | ||
// Licensed under the Apache License 2.0. | ||
|
||
package http | ||
|
||
import ( | ||
"github.com/prometheus/client_golang/prometheus" | ||
"github.com/prometheus/client_golang/prometheus/promauto" | ||
) | ||
|
||
type defaultMetrics struct { | ||
requestDuration *prometheus.HistogramVec | ||
requestSize *prometheus.SummaryVec | ||
requestsTotal *prometheus.CounterVec | ||
responseSize *prometheus.SummaryVec | ||
} | ||
|
||
func newDefaultMetrics(reg prometheus.Registerer, buckets []float64, extraLabels []string) *defaultMetrics { | ||
if buckets == nil { | ||
buckets = []float64{0.001, 0.01, 0.1, 0.3, 0.6, 1, 3, 6, 9, 20, 30, 60, 90, 120, 240, 360, 720} | ||
} | ||
|
||
return &defaultMetrics{ | ||
requestDuration: promauto.With(reg).NewHistogramVec( | ||
prometheus.HistogramOpts{ | ||
Name: "http_request_duration_seconds", | ||
Help: "Tracks the latencies for HTTP requests.", | ||
Buckets: buckets, | ||
}, | ||
append([]string{"code", "handler", "method"}, extraLabels...), | ||
), | ||
requestSize: promauto.With(reg).NewSummaryVec( | ||
prometheus.SummaryOpts{ | ||
Name: "http_request_size_bytes", | ||
Help: "Tracks the size of HTTP requests.", | ||
}, | ||
append([]string{"code", "handler", "method"}, extraLabels...), | ||
), | ||
requestsTotal: promauto.With(reg).NewCounterVec( | ||
prometheus.CounterOpts{ | ||
Name: "http_requests_total", | ||
Help: "Tracks the number of HTTP requests.", | ||
}, | ||
append([]string{"code", "handler", "method"}, extraLabels...), | ||
), | ||
responseSize: promauto.With(reg).NewSummaryVec( | ||
prometheus.SummaryOpts{ | ||
Name: "http_response_size_bytes", | ||
Help: "Tracks the size of HTTP responses.", | ||
}, | ||
append([]string{"code", "handler", "method"}, extraLabels...), | ||
), | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason behind in-flight? It's usually misleading metrics given 15s interval and requests finishing sooner.. but maybe we can try having it to double check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bwplotka my attention adding the in-flight is to allow users of Thanos Receive to get better insights in situations of high load, where the number of concurrent requests will be relevant. Also, later on there will be a "metered gate" to limit the amount of in-flight requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually quite useful to know given atm there is no notion of enqueued requests in receiver. This is particularly useful for troubleshooting issues with load/receiver getting requests that it is not completing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@moadz FYI I am leaving the in-flight requests metric for a follow up PR to limit the scope of this one to Thanos Receive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split PR with in-flight requests metrics: #5440 .