-
Notifications
You must be signed in to change notification settings - Fork 569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
All errors should be safe to be wrapped and retained #6008
Comments
We have made the first steps towards creating safe errors: this PR introduces the following interface, belonging to the
as well as a type implementing that interface:
Most of the errors returned by ingester are created by calling |
This comment describes the current status of this issue. So far error handling has been improved in the following components: ingester and distributor. Main Achievements
Report gRPC codes as labels in cortex and ingester_client request duration metricsBefore these improvements, possible values of
It wasn't therefore possible to use any gRPC code as a
Additional error detailsAdditional details about the gRPC errors returned by ingester and distributor are modelled by the following protobuf messages belonging to the
IngesterIngester is divided in 2 parts: Ingester Client (IC) and Ingester Business Logic (IS). Ingester-related errors occurring on the IS side implement the internal
IS and IC communicate via gRPC, and the errors exchanged between them should also be errors containing gRPC status codes. Previously, it wasn't always the case. For example, gRPC error corresponding to In order to switch to the new way of creating gRPC errors within IS, it is necessary to set the newly introduced experimental CLI flag
DistributorDistributor is divided in 2 parts: Distributor Client (DC) and Distributor Business Logic (DS). Distributor-related errors occurring on the DS side implement the internal
DS and DC communicate via gRPC, and the errors exchanged between them should also be errors containing gRPC status codes. Previously, it wasn't always the case. For example, gRPC error corresponding to On the other hand, there is distributor's push handler, which translates DS's errors into HTTP errors. The semantics of the push handler has not been changed by this improvement. No distributor-related configuration is needed in order to enable the new distributor error handling. It is enabled by default, meaning that all the errors between DS and DC are gRPC errors with gRPC error codes and an appropriate error details. It is worth noting that if
|
Migration planCurrent status
ProposalIf no problem is faced during the rolling out of the newly introduced CLI flags to the prod cells, the migration plan would be the following: Mimir 2.12.0
Mimir 2.14.0
Compatibility issuesCLI flag -ingester.return-only-grpc-errorsThe following table shows the incompatibility issues in the systems running different versions of distributor and ingester. The abbreviations used to denote different versions are:
|
Is your feature request related to a problem? Please describe.
Some of our errors retain references to unsafe-ly built things like labels:
mimir/pkg/ingester/errors.go
Lines 77 to 83 in e928a4b
@pracucci said:
And I think that day has come and it's time to do it. Errors are something meant to escape the execution of a function, and there's no way to ensure they won't escape the caller. It's just a matter of time for this to cause corrupted error messages or even something worse.
Describe the solution you'd like
Any error should be safe to be stored for any time, even if that means a more expensive error creation/formatting.
Describe alternatives you've considered
Just ignore the problem.
The text was updated successfully, but these errors were encountered: