-
Notifications
You must be signed in to change notification settings - Fork 569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
querier: fix remote read error translation #7487
querier: fix remote read error translation #7487
Conversation
I'm investigating a case where the remote-read errors for query length don't get correctly translated. I tried adding an integration test which surprisingly passed. This means I still haven't found the problem but decided to submit the integration test anyway. Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
this PR started just as added integration tests, but I ended up finding the bug and fixing it. The description and title are up to date now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't really understand the Prometheus docs in terms of the expected return codes and content type. The docs give a JSON example as the structure for the API, but the remote read API is based on protocol buffers. At the same time the protocol buffers do not have fields for errors. So I'm still not sure if we're compliant with the docs page.
Looking at the remote read handler implementation in Prometheus looks like they're just sending errors as plain text messages (see code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job spotting the bug and fixing it! 👏
Co-authored-by: Marco Pracucci <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Dimitar Dimitrov <[email protected]>
thanks for the review @pracucci! |
i'll leave this open in case anyone else wants to give it a review. Hope to merge it before Friday |
Problem
Currently remote-read has conflicting behaviours:
PR description
This PR unifies error handling such that:
It also adds tests for this in two places
The PR also moves the content type string to pkg/querier/api since it's a complicated string that was repeated in a few places.
Note to reviewers
I couldn't really understand the Prometheus docs in terms of the expected return codes and content type. The docs give a JSON example as the structure for the API, but the remote read API is based on protocol buffers. At the same time the protocol buffers do not have fields for errors. So I'm still not sure if we're compliant with the docs page.
Because of the same there are some discrepancies with HTTP status codes on the regular query and query_range APIs - they return HTTP 422 for hit limits whereas remote read returns HTTP 400