Draft release notes for 2.11 #6702
- The CLI flag `querier.prefer-streaming-chunks-from-ingesters`.
- The CLI flag `querier.minimize-ingester-requests`.
These flags are still considered experimental - the default value has changed to enable these features by default in this release though.
The plan is to remove the flags entirely in a future release, so these features are just always enabled by default.
Got it, I'll remove them from here.
- **Sampled logging of errors in the ingester.** A high-traffic Mimir cluster can occasionally become bogged down logging high volumes of repeated errors. You can now reduce the amount of errors outputted to logs by setting a sample rate via the `-ingester.error-sample-rate` CLI flag.
- **Add total request size instance limit for ingesters.** This limit protects the ingesters against requests that together may cause an OOM. Enable this feature by setting the `-ingester.instance-limits.max-inflight-push-requests-bytes` CLI flag.
- **Reduce the resolution of incoming native histograms samples** if the incoming sample has too many buckets compared to `-validation.max-native-histogram-buckets`. This is enabled by default but can be turned off by setting the `-validation.reduce-native-histogram-over-max-buckets` CLI flag to `false`.
- **Include a `Retry-After` header in recoverable error responses from the distributor.** This can protect your Mimir cluster from clients that default to retrying very quickly. Enable this feature by setting the `-distributor.retry-after-header.enabled` CLI flag.
Is it worth mentioning some examples of clients that respect this header?

In Grafana Mimir 2.11 the following behavior has changed:

- The distributor `Push()` endpoint will now return the following gRPC codes instead of HTTP status codes:
As a user, I don't know what the distributor's `Push()` endpoint is. Is that where I send my HTTP requests?
I think I was unintentionally a little vague because I wasn't 100% sure, but this is the distributor's gRPC endpoint, right?
Hmm, I think it's only misleading here. The distributor doesn't have a documented gRPC API as far as I can see. This change is only internal, as this endpoint is called from the HTTP wrapper, and these errors are translated to the same responses as before.
I would just remove this section.
cc @duricanikolic for more context.
Co-authored-by: Charles Korn <[email protected]>
## Features and enhancements

- **Sampled logging of errors in the ingester.** A high-traffic Mimir cluster can occasionally become bogged down logging high volumes of repeated errors. You can now reduce the amount of errors outputted to logs by setting a sample rate via the `-ingester.error-sample-rate` CLI flag.
- **Add total request size instance limit for ingesters.** This limit protects the ingesters against requests that together may cause an OOM. Enable this feature by setting the `-ingester.instance-limits.max-inflight-push-requests-bytes` CLI flag.
To make this work efficiently (without reading request to memory first), this needs to be used with `-ingester.limit-inflight-requests-using-grpc-method-limiter`.
Got it, thanks!
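To make the idea behind a total in-flight request size limit concrete, here is a rough Go sketch. It is not Mimir's implementation: the `inflightBytes` type, its methods, and the convention that a zero limit disables the check are all assumptions made for illustration. The point is that the size is reserved before the request is processed and returned afterwards, so the check never depends on buffering the request first.

```go
package main

import (
	"fmt"
	"sync"
)

// inflightBytes tracks the combined size of requests currently being
// processed. Hypothetical type for illustration only.
type inflightBytes struct {
	mu    sync.Mutex
	limit int64 // assumption in this sketch: 0 means "no limit"
	used  int64
}

// tryAcquire reserves size bytes and reports whether the reservation
// fits under the limit. Nothing is reserved when it returns false.
func (l *inflightBytes) tryAcquire(size int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.limit > 0 && l.used+size > l.limit {
		return false
	}
	l.used += size
	return true
}

// release gives back size bytes once the request has finished.
func (l *inflightBytes) release(size int64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.used -= size
}

func main() {
	l := &inflightBytes{limit: 100}
	fmt.Println(l.tryAcquire(60)) // fits under the limit
	fmt.Println(l.tryAcquire(60)) // would exceed the limit, rejected
	l.release(60)
	fmt.Println(l.tryAcquire(60)) // fits again after release
}
```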
- The distributor gRPC push endpoint will now return the following gRPC codes instead of HTTP status codes:
  - 202 (accepted) code is replaced with 6 (`ALREADY_EXISTS`).
  - 400 (bad request) code is replaced with 9 (`FAILED_PRECONDITION`).
  - 429 (too many requests) and the non-standard 529 (service is overloaded) codes are replaced with 8 (`RESOURCE_EXHAUSTED`).
Distributor gRPC push endpoints are not part of public API. I don't think this needs to be mentioned in release notes.
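For readers following the thread, the mapping listed in the draft entry above can be written out as a small lookup. This is only a sketch of the list as drafted, using plain integers rather than any gRPC library; `grpcCodeFor` and the constant names are hypothetical, and it says nothing about how Mimir implements the translation internally.

```go
package main

import "fmt"

// gRPC status code numbers named in the draft release note entry.
const (
	codeAlreadyExists      = 6
	codeResourceExhausted  = 8
	codeFailedPrecondition = 9
)

// grpcCodeFor returns the gRPC code that the draft entry lists for the
// given HTTP status, and false for statuses the entry does not mention.
// Hypothetical helper for illustration only.
func grpcCodeFor(httpStatus int) (int, bool) {
	switch httpStatus {
	case 202:
		return codeAlreadyExists, true
	case 400:
		return codeFailedPrecondition, true
	case 429, 529:
		return codeResourceExhausted, true
	default:
		return 0, false
	}
}

func main() {
	for _, status := range []int{202, 400, 429, 529, 500} {
		code, ok := grpcCodeFor(status)
		fmt.Println(status, code, ok)
	}
}
```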
thanks for writing this. I see you've gone through a lot of PRs to get this done 😅
- **Reduce the resolution of incoming native histograms samples** if the incoming sample has too many buckets compared to `-validation.max-native-histogram-buckets`. This is enabled by default but can be turned off by setting the `-validation.reduce-native-histogram-over-max-buckets` CLI flag to `false`.
- **Include a `Retry-After` header in recoverable error responses from the distributor.** This can protect your Mimir cluster from clients including Prometheus that default to retrying very quickly. Enable this feature by setting the `-distributor.retry-after-header.enabled` CLI flag.
- **Improved query-scheduler performance under load.** This is particularly apparent for clusters with large numbers of queriers.
- **Ingester to querier chunks streaming** reduces the memory utilization of queriers and reduces the likelihood of OOMs.
is this a new feature or was it just enabled by default now?
It's enabled by default now.
we can mention it if you want
Co-authored-by: Dimitar Dimitrov <[email protected]>
LGTM, thanks for writing this!
Thanks for all the input, everybody!
* Draft release notes for 2.11
* Apply suggestions from code review
  Co-authored-by: Charles Korn <[email protected]>
* More code review responses
* More mode code review responses
* Update docs/sources/mimir/release-notes/v2.11.md
  Co-authored-by: Dimitar Dimitrov <[email protected]>
* More more more code review responses

---------

Co-authored-by: Charles Korn <[email protected]>
Co-authored-by: Dimitar Dimitrov <[email protected]>
What this PR does
This PR adds release notes for Mimir 2.11.0.
Which issue(s) this PR fixes or relates to
#6670
Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`.
- `about-versioning.md` updated with experimental features.