Exemplar: Can not set partial response. #4676
If I add a flag and pass it to NewExemplarsHandler (https://github.com/thanos-io/thanos/blob/main/pkg/api/query/v1.go#L798), thanos-query runs out of memory (OOM) when queried. Maybe there are too many exemplars? Could we set a limit on their number? The pprof heap profile is here.
So actually partial response in the Exemplars API works, right? @hanjm Can you help confirm this? If the flag works, then let's rename the issue to discuss the exemplars limit.
@yeya24 I found that the exemplar.partial-response flag is missing, so I added it in hanjm@681608c. Partial response works in my branch.
I see. @hanjm Looks like we have this config in the struct but forgot to add a flag for it. Would you like to open a PR for it?
OK.
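As an illustration of the exchange above (a config field that exists but has no flag wired to it), here is a minimal sketch using Go's standard `flag` package. Only the flag name `exemplar.partial-response` comes from this thread; `handlerConfig`, `parseFlags`, the default value, and the help text are hypothetical, and the standard library stands in for whatever flag library Thanos actually uses.

```go
package main

import (
	"flag"
	"fmt"
)

// handlerConfig is a hypothetical stand-in for the query API configuration
// struct; it is not the real Thanos type.
type handlerConfig struct {
	enableExemplarPartialResponse bool
}

// parseFlags registers the exemplar.partial-response flag and parses args.
// The default of true is an assumption made for this sketch.
func parseFlags(args []string) (handlerConfig, error) {
	var cfg handlerConfig
	fs := flag.NewFlagSet("query", flag.ContinueOnError)
	fs.BoolVar(&cfg.enableExemplarPartialResponse, "exemplar.partial-response", true,
		"Enable partial response for the exemplars endpoint.")
	if err := fs.Parse(args); err != nil {
		return handlerConfig{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseFlags([]string{"--exemplar.partial-response=false"})
	fmt.Println(cfg.enableExemplarPartialResponse, err)
}
```

The parsed value would then be passed to the handler constructor so that the existing struct field is actually set by the operator.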
I am investigating it. I added a debug log to (*exemplarsServer).Send (https://github.com/thanos-io/thanos/blob/main/pkg/exemplars/exemplars.go#L42):
    func (srv *exemplarsServer) Send(res *exemplarspb.ExemplarsResponse) error {
        if res.GetWarning() != "" {
            err := errors.New(res.GetWarning())
            // Debug log: size of this warning and number of warnings collected so far.
            log.Printf("err message size: errors:%d, srv.warnings:%d, res.GetWarning():%d",
                len(err.Error()),
                len(srv.warnings),
                len(res.GetWarning()))
            srv.warnings = append(srv.warnings, err)
            return nil
        }
        // ... rest of Send unchanged
    }
It prints a lot of log lines.
Then I printed the first ten messages in srv.warnings once it holds 100 entries:
    func (srv *exemplarsServer) Send(res *exemplarspb.ExemplarsResponse) error {
        if res.GetWarning() != "" {
            err := errors.New(res.GetWarning())
            // Debug log: dump the first ten warnings when exactly 100 have accumulated.
            if len(srv.warnings) == 100 {
                log.Printf("err message size: errors:%d, srv.warnings:%d, res.GetWarning():%d, srv.warnings:%+v",
                    len(err.Error()),
                    len(srv.warnings),
                    len(res.GetWarning()),
                    srv.warnings[:10],
                )
            }
            srv.warnings = append(srv.warnings, err)
            return nil
        }
        // ... rest of Send unchanged
    }
It then prints a log line with those warnings.
It seems better to keep only one.
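One possible reading of "keep only one" is to record each distinct warning message once instead of appending every copy. The sketch below shows that idea under stated assumptions: `warningSet` and its methods are hypothetical and not part of Thanos, and the thread does not confirm that this is the approach the maintainers had in mind.

```go
package main

import (
	"errors"
	"fmt"
)

// warningSet is a hypothetical helper (not part of Thanos) that stores each
// distinct warning message once, in first-seen order.
type warningSet struct {
	seen     map[string]struct{}
	warnings []error
}

func newWarningSet() *warningSet {
	return &warningSet{seen: make(map[string]struct{})}
}

// Add records msg only if the same text has not been recorded before.
// It reports whether the warning was newly added.
func (w *warningSet) Add(msg string) bool {
	if _, ok := w.seen[msg]; ok {
		return false
	}
	w.seen[msg] = struct{}{}
	w.warnings = append(w.warnings, errors.New(msg))
	return true
}

func main() {
	ws := newWarningSet()
	// The same warning arriving many times is stored only once.
	for i := 0; i < 1000; i++ {
		ws.Add("rpc error: code = Unimplemented desc = unknown service thanos.Exemplars")
	}
	fmt.Println(len(ws.warnings))
}
```

Deduplication bounds memory only when the set of distinct messages is small; it does not address a receive loop that never terminates.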
Hi. Thanos has been rolled back to 0.22. In addition, in the current version, --query-frontend.downstream-url goes through a load-balancing mechanism that uses weighted round-robin. No matter how the client's IP changes, requests to the backend upstream are always routed to the same IP. thanos-query-frontend host resource
I found the root cause: if the store API target does not implement the Exemplars API, err is an Unimplemented error here, and receive loops forever:
https://github.com/thanos-io/thanos/blob/main/pkg/exemplars/proxy.go#L195
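The failure mode described here can be sketched in isolation. The code below is not the Thanos implementation: `recvStream`, `unimplementedStream`, and `receive` are hypothetical names, and a plain error value stands in for the gRPC Unimplemented status. It illustrates why a receive loop has to stop after recording a warning in partial-response mode: a target that does not serve the service returns the same error on every Recv call and never returns io.EOF, so a loop that continues would append warnings without bound.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// recvStream is a hypothetical stand-in for a gRPC client stream.
type recvStream interface {
	Recv() (string, error)
}

// errUnimplemented imitates the error text reported in this issue.
var errUnimplemented = errors.New("rpc error: code = Unimplemented desc = unknown service thanos.Exemplars")

// unimplementedStream mimics a target that does not serve the Exemplars
// service: every Recv call fails and io.EOF is never returned.
type unimplementedStream struct{ calls int }

func (s *unimplementedStream) Recv() (string, error) {
	s.calls++
	return "", errUnimplemented
}

// receive drains the stream. In partial-response mode a stream error is
// recorded as a single warning and the loop returns; using `continue` there
// would spin forever on a stream that keeps failing.
func receive(s recvStream, partialResponse bool) (data []string, warnings []error, err error) {
	for {
		item, rerr := s.Recv()
		if errors.Is(rerr, io.EOF) {
			return data, warnings, nil
		}
		if rerr != nil {
			if !partialResponse {
				return data, warnings, rerr
			}
			warnings = append(warnings, rerr)
			return data, warnings, nil
		}
		data = append(data, item)
	}
}

func main() {
	s := &unimplementedStream{}
	_, warnings, err := receive(s, true)
	fmt.Println(len(warnings), err, s.calls)
}
```

With partial response disabled, the same sketch returns the error to the caller instead of converting it to a warning.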
…/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (thanos-io#4676) Signed-off-by: hanjm
@MrYueQ That does not seem relevant to this issue. Please feel free to open another issue.
…/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (#4676) (#4681) Signed-off-by: hanjm
* Sidecar: Fix process external label on promethues v2.28+ use units.Bytes config type (#4657)
* Sidecar: Fix process external label when promethues v2.28+ use units.Bytes config type (#4656)
Signed-off-by: hanjm
* E2E: Upgrade prometheus image version
Signed-off-by: hanjm
* upgrade Prometheus dependency version to v2.30.0 (#4669)
* upgrade Prometheus dependency version to v2.30.0
Signed-off-by: Ben Ye
* fix unit test
Signed-off-by: Ben Ye
# Conflicts:
# go.mod
# go.sum
* Query: Fix (*exemplarsStream).receive/(*metricMetadataStream).receive/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (#4676) (#4681)
Signed-off-by: hanjm
* Cut 0.23.0-rc.1
Signed-off-by: Bartlomiej Plotka
Co-authored-by: Jimmiehan
Co-authored-by: Ben Ye
* Cut release 0.23.0-rc.0 (#4625)
Signed-off-by: Bartlomiej Plotka
* Updated version.
Signed-off-by: Bartlomiej Plotka
* Cut 0.23.0-rc.1 and cherry picked 3 critical commits from main. (#4684)
* Sidecar: Fix process external label on promethues v2.28+ use units.Bytes config type (#4657)
* Sidecar: Fix process external label when promethues v2.28+ use units.Bytes config type (#4656)
Signed-off-by: hanjm
* E2E: Upgrade prometheus image version
Signed-off-by: hanjm
* upgrade Prometheus dependency version to v2.30.0 (#4669)
* upgrade Prometheus dependency version to v2.30.0
Signed-off-by: Ben Ye
* fix unit test
Signed-off-by: Ben Ye
# Conflicts:
# go.mod
# go.sum
* Query: Fix (*exemplarsStream).receive/(*metricMetadataStream).receive/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (#4676) (#4681)
Signed-off-by: hanjm
* Cut 0.23.0-rc.1
Signed-off-by: Bartlomiej Plotka
Co-authored-by: Jimmiehan
Co-authored-by: Ben Ye
* Cut 0.23.0 release. (#4697)
* Endpointset: Do not use info client to obtain metadata (for now) (#4714)
* Do not use info client to obtain metadata
Signed-off-by: Matej Gera
* Update CHANGELOG.
Signed-off-by: Matej Gera
* Comment out client.info usage
Signed-off-by: Matej Gera
* Fix lint error
Signed-off-by: Matej Gera
* Cutting 0.23.1 (#4718)
Signed-off-by: Bartlomiej Plotka
* Moved tutorials Thanos versions to 0.23.1
Signed-off-by: Bartlomiej Plotka
* Added volounteer for shepharding, fixed VERSION.
Signed-off-by: Bartlomiej Plotka
Co-authored-by: Jimmiehan
Co-authored-by: Ben Ye
Co-authored-by: Matej Gera
This cherry-picks upstream patch that fixes the bug
Query: Fix (*exemplarsStream).receive/(*metricMetadataStream).receive/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (thanos-io#4676) (thanos-io#4681)
See:
- thanos-io#4676 (comment)
- thanos-io#4681
Signed-off-by: hanjm
(cherry picked from commit 2d4d140)
Signed-off-by: Sunil Thaha
Thanos, Prometheus and Golang version used:
Object Storage Provider:
COS
What happened:
query_exemplar response error:
error: "retrieving exemplars: proxy Exemplars: receiving exemplars from exemplars client &{0xc000b2a000}: rpc error: code = Unimplemented desc = unknown service thanos.Exemplars"
What you expected to happen:
A partial response with a warning.
How to reproduce it (as minimally and precisely as possible):
Thanos Query 0.23 beta plus an old sidecar version that does not support exemplars.
Full logs to relevant components:
Anything else we need to know:
There seems to be no flag to control exemplar partial response.