Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow ingester's read path to return gRPC errors #6680

Merged
merged 7 commits into from
Nov 24, 2023
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,9 +26,10 @@
* `http.StatusAccepted` (202) code is replaced with `codes.AlreadyExists`.
* `http.BadRequest` (400) code is replaced with `codes.FailedPrecondition`.
* `http.StatusTooManyRequests` (429) and the non-standard `529` (The service is overloaded) codes are replaced with `codes.ResourceExhausted`.
* [CHANGE] Ingester: by setting the newly introduced experimental CLI flag `-ingester.return-only-grpc-errors` to true, `Push()` will return only gRPC errors. This feature changes the following status codes: #6443 #6723
* `http.StatusBadRequest` (400) is replaced with `codes.FailedPrecondition`.
* `http.StatusServiceUnavailable` (503) and `codes.Unknown` are replaced with `codes.Internal`.
* [CHANGE] Ingester: by setting the newly introduced experimental CLI flag `-ingester.return-only-grpc-errors` to true, ingester will return only gRPC errors. This feature changes the following status codes: #6443 #6680 #6723
* `http.StatusBadRequest` (400) is replaced with `codes.FailedPrecondition` on the write path.
* `http.StatusServiceUnavailable` (503) is replaced with `codes.Internal` on the write path, and with `codes.ResourceExhausted` on the read path.
* `codes.Unknown` is replaced with `codes.Internal` on both write and read path.
* [CHANGE] Upgrade Node.js to v20. #6540
* [CHANGE] Querier: `cortex_querier_blocks_consistency_checks_failed_total` is now incremented when a block couldn't be queried from any attempted store-gateway as opposed to incremented after each attempt. Also `cortex_querier_blocks_consistency_checks_total` is incremented once per query as opposed to once per attempt (with 3 attempts). #6590
* [CHANGE] Ingester: Modify utilization based read path limiter to base memory usage on Go heap size. #6584
Expand Down
2 changes: 1 addition & 1 deletion pkg/distributor/distributor.go
Original file line number Diff line number Diff line change
Expand Up @@ -1247,7 +1247,7 @@ func (d *Distributor) handlePushError(ctx context.Context, pushErr error) error

// TODO This code is needed for backwards compatibility, since ingesters may still return
// errors created by httpgrpc.Errorf(). If pushErr is one of those errors, we just propagate
// it. This code should be removed in mimir 2.12.0.
// it. This code should be removed together with the removal of `-ingester.return-only-grpc-errors`.
_, ok := httpgrpc.HTTPResponseFromError(pushErr)
if ok {
return pushErr
Expand Down
12 changes: 5 additions & 7 deletions pkg/distributor/errors.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ import (
"google.golang.org/grpc/codes"

"github.com/grafana/mimir/pkg/mimirpb"
"github.com/grafana/mimir/pkg/util"
"github.com/grafana/mimir/pkg/util/globalerror"
"github.com/grafana/mimir/pkg/util/validation"
)
Expand Down Expand Up @@ -239,21 +240,17 @@ func handleIngesterPushError(err error) error {
return errors.Wrap(err, failedPushingToIngesterMessage)
}
statusCode := stat.Code()
if isHTTPStatusCode(statusCode) {
if util.IsHTTPStatusCode(statusCode) {
// TODO This code is needed for backwards compatibility, since ingesters may still return
// errors created by httpgrpc.Errorf(). If pushErr is one of those errors, we just propagate
// it. This code should be removed in mimir 2.12.0.
// it. This code should be removed together with the removal of `-ingester.return-only-grpc-errors`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't be sure that this code should be removed when the flag is removed. That is a conclusion that should come from a migration plan, which we don't have yet.

Please review this comment once you have the migration plan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed all related TODOs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

// Wrap HTTP gRPC error with more explanatory message.
return httpgrpc.Errorf(int(statusCode), "%s: %s", failedPushingToIngesterMessage, stat.Message())
}

return newIngesterPushError(stat)
}

func isHTTPStatusCode(statusCode codes.Code) bool {
return int(statusCode) >= 100 && int(statusCode) < 600
}

func isClientError(err error) bool {
var ingesterPushErr ingesterPushError
if errors.As(err, &ingesterPushErr) {
Expand All @@ -262,7 +259,8 @@ func isClientError(err error) bool {

// TODO This code is needed for backwards compatibility, since ingesters may still return
// errors with HTTP status code created by httpgrpc.Errorf(). If err is one of those errors,
// we treat 4xx errors as client errors. This code should be removed in mimir 2.12.0.
// we treat 4xx errors as client errors. This code should be removed together with the removal
// of `-ingester.return-only-grpc-errors`.

if code := grpcutil.ErrorToStatusCode(err); code/100 == 4 {
return true
Expand Down
4 changes: 2 additions & 2 deletions pkg/distributor/errors_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -357,8 +357,8 @@ func TestHandleIngesterPushError(t *testing.T) {
// Ensure that the errors created by httpgrpc get translated into
// other errors created by httpgrpc with the same code, and with
// a more explanatory message.
// TODO: this is needed for backwards compatibility and will be removed
// in mimir 2.12.0.
// TODO this method is needed only for the backwards compatibility, and should be removed
// together with the removal of `-ingester.return-only-grpc-errors`.
httpgrpcTests := map[string]struct {
ingesterPushError error
expectedStatus int32
Expand Down
5 changes: 3 additions & 2 deletions pkg/ingester/active_series.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,9 @@ const activeSeriesMaxSizeBytes = 1 * 1024 * 1024

// ActiveSeries implements the ActiveSeries RPC. It returns a stream of active
// series that match the given matchers.
func (i *Ingester) ActiveSeries(request *client.ActiveSeriesRequest, stream client.Ingester_ActiveSeriesServer) error {
if err := i.checkRunning(); err != nil {
func (i *Ingester) ActiveSeries(request *client.ActiveSeriesRequest, stream client.Ingester_ActiveSeriesServer) (err error) {
defer func() { err = i.mapReadErrorToErrorWithStatus(err) }()
if err := i.checkAvailable(); err != nil {
return err
}
if err := i.checkReadOverloaded(); err != nil {
Expand Down
69 changes: 62 additions & 7 deletions pkg/ingester/errors.go
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,11 @@ import (

const (
integerUnavailableMsgFormat = "ingester is unavailable (current state: %s)"
tooBusyErrorMsg = "the ingester is currently too busy to process queries, try again later"
)

var (
tooBusyError = ingesterTooBusyError{}
)

// errorWithStatus is used for wrapping errors returned by ingester.
Expand Down Expand Up @@ -74,8 +79,8 @@ func newErrorWithStatus(originalErr error, code codes.Code) errorWithStatus {

// newErrorWithHTTPStatus creates a new errorWithStatus backed by the given error,
// and containing the given HTTP status code.
// TODO this is needed for backwards compatibility only and should be removed
// in mimir 2.12.0.
// TODO this method is needed only for the backwards compatibility, and should be removed
// together with the removal of `-ingester.return-only-grpc-errors`.
func newErrorWithHTTPStatus(err error, code int) errorWithStatus {
errWithHTTPStatus := httpgrpc.Errorf(code, err.Error())
stat, _ := status.FromError(errWithHTTPStatus)
Expand Down Expand Up @@ -509,6 +514,19 @@ func (e tsdbUnavailableError) errorCause() mimirpb.ErrorCause {
// Ensure that tsdbUnavailableError is an ingesterError.
var _ ingesterError = tsdbUnavailableError{}

type ingesterTooBusyError struct{}

func (e ingesterTooBusyError) Error() string {
return tooBusyErrorMsg
}

func (e ingesterTooBusyError) errorCause() mimirpb.ErrorCause {
return mimirpb.TOO_BUSY
}

// Ensure that ingesterTooBusyError is an ingesterError.
var _ ingesterError = ingesterTooBusyError{}

type ingesterErrSamplers struct {
sampleTimestampTooOld *log.Sampler
sampleTimestampTooOldOOOEnabled *log.Sampler
Expand All @@ -535,7 +553,8 @@ func newIngesterErrSamplers(freq int64) ingesterErrSamplers {
}
}

func handlePushErrorWithGRPC(err error) error {
// mapPushErrorToErrorWithStatus maps the given error to the corresponding error of type errorWithStatus.
func mapPushErrorToErrorWithStatus(err error) error {
var (
ingesterErr ingesterError
errCode = codes.Internal
Expand All @@ -557,11 +576,11 @@ func handlePushErrorWithGRPC(err error) error {
return newErrorWithStatus(wrappedErr, errCode)
}

// handlePushErrorWithHTTPGRPC maps ingesterError objects to an appropriate
// mapPushErrorToErrorWithHTTPOrGRPCStatus maps ingesterError objects to an appropriate
// errorWithStatus, which may contain both HTTP and gRPC error codes.
// TODO this method is needed only for the backwards compatibility,
// and should be removed in mimir 2.12.0.
func handlePushErrorWithHTTPGRPC(err error) error {
// TODO this method is needed only for the backwards compatibility, and should be removed
// together with the removal of `-ingester.return-only-grpc-errors`.
func mapPushErrorToErrorWithHTTPOrGRPCStatus(err error) error {
var ingesterErr ingesterError
if errors.As(err, &ingesterErr) {
switch ingesterErr.errorCause() {
Expand All @@ -577,3 +596,39 @@ func handlePushErrorWithHTTPGRPC(err error) error {
}
return err
}

// mapReadErrorToErrorWithStatus maps the given error to the corresponding error of type errorWithStatus.
func mapReadErrorToErrorWithStatus(err error) error {
var (
ingesterErr ingesterError
errCode = codes.Internal
)
if errors.As(err, &ingesterErr) {
switch ingesterErr.errorCause() {
case mimirpb.TOO_BUSY:
errCode = codes.ResourceExhausted
case mimirpb.SERVICE_UNAVAILABLE:
errCode = codes.Unavailable
}
}
return newErrorWithStatus(err, errCode)
}

// mapReadErrorToErrorWithHTTPOrGRPCStatus maps ingesterError objects to an appropriate
// errorWithStatus, which may contain both HTTP and gRPC error codes.
// TODO this method is needed only for the backwards compatibility, and should be removed
// together with the removal of `-ingester.return-only-grpc-errors`.
func mapReadErrorToErrorWithHTTPOrGRPCStatus(err error) error {
var (
ingesterErr ingesterError
)
if errors.As(err, &ingesterErr) {
switch ingesterErr.errorCause() {
case mimirpb.TOO_BUSY:
return newErrorWithHTTPStatus(err, http.StatusServiceUnavailable)
case mimirpb.SERVICE_UNAVAILABLE:
return newErrorWithStatus(err, codes.Unavailable)
}
}
return err
}
Loading