Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Co-authored-by: Mike Palmiotto <[email protected]>
1 parent 1e5763c, commit 0627bb9
Showing 14 changed files with 220 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
--- | ||
layout: docs | ||
page_title: 'Request Limiter' | ||
description: >- | ||
Vault provides an adaptive concurrency limiter to protect the Vault server | ||
from overload. | ||
--- | ||
|
||
# Request Limiter | ||
|
||
@include 'alerts/enterprise-only.mdx' | ||
|
||
@include 'alerts/beta.mdx' | ||
|
||
This document contains conceptual information about the **Request Limiter** and | ||
its user-facing effects. | ||
|
||
## Preventing overload | ||
|
||
The Request Limiter aims to prevent overload by proactively detecting latency | ||
deviation from a baseline and adapting the number of allowed in-flight requests. | ||
|
||
This is done in two phases at the beginning of an HTTP request: | ||
|
||
1. Consult the current number of allowed in-flight requests. If the new request | ||
would exceed this limit, immediately reject it, indicating that the client | ||
should retry later. | ||
|
||
2. If the request is allowed, begin a measurement of its latency, allowing the | ||
Request Limiter to calculate a new limit. | ||
|
||
## Resource constraints | ||
|
||
The Request Limiter intentionally focuses on preventing overload derived from | ||
resource-constrained operations on the Vault server. Vault focuses on two | ||
specific types of resource constraints which commonly cause issues in production | ||
workloads: | ||
|
||
1. Write latency in the storage backend, resulting in a growing queue of updates | ||
to be flushed. These writes originate primarily from `Write`-based HTTP methods. | ||
|
||
2. CPU utilization caused by computationally expensive PKI issue requests | ||
(generally for RSA keys). Large numbers of these requests can consume all CPU | ||
resources, preventing timely processing of other requests such as heartbeats and | ||
health checks. | ||
|
||
Storage constraints can be accounted for by limiting logical requests according | ||
to their `http.Method`. We only measure and limit requests with `Write`-based | ||
HTTP methods. Read requests do not generally cause storage updates, meaning that | ||
their latencies are unlikely to be correlated with storage constraints. | ||
|
||
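As a rough sketch of method-based selection, the helper below decides whether a request would be measured and limited. The function name `isWriteMethod` and the exact set of methods treated as write-based are assumptions for illustration; the source only states that `Write`-based methods are limited and that `GET` and `LIST` are not.

```go
package main

import (
	"fmt"
	"net/http"
)

// isWriteMethod reports whether this sketch treats an HTTP method as
// write-based. The chosen set (POST, PUT, PATCH, DELETE) is an assumption.
func isWriteMethod(method string) bool {
	switch method {
	case http.MethodPost, http.MethodPut, http.MethodPatch, http.MethodDelete:
		return true
	default:
		// GET, HEAD, and Vault's LIST fall through and are not limited.
		return false
	}
}

func main() {
	for _, m := range []string{"GET", "LIST", "PUT", "POST", "DELETE"} {
		fmt.Printf("%-6s limited=%v\n", m, isWriteMethod(m))
	}
}
```
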
CPU constraints are accounted for using the same underlying library and | ||
technique; however, they require special treatment. The maximum number of | ||
concurrent `pki/issue` requests found in testing (again, specifically for RSA | ||
keys) is far lower than the minimum tolerable write request rate. | ||
|
||
In both cases, utilization will be effectively throttled before Vault reaches | ||
any degraded state. The resulting `503 - Service Unavailable` is a retryable | ||
HTTP response code, which can be handled to gracefully retry and eventually | ||
succeed. Clients should handle this by retrying with jitter and exponential | ||
backoff. Vault's API `Client` implementation already does this, using the | ||
go-retryablehttp library. | ||
|
||
## Read requests | ||
|
||
HTTP methods such as `GET` and `LIST` are not subject to write request | ||
limiting. This allows operators to continue querying server state without | ||
needing to retry. | ||
|
||
## Vault server overloaded | ||
|
||
When Vault has reached capacity, new requests will be immediately rejected with a | ||
retryable `503 - Service Unavailable` | ||
[error](/vault/docs/concepts/request-limiter/vault-server-temporarily-overloaded). |
33 changes: 33 additions & 0 deletions
...e/content/docs/concepts/request-limiter/vault-server-temporarily-overloaded.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
--- | ||
layout: docs | ||
page_title: Vault server temporarily overloaded | ||
description: |- | ||
Vault Enterprise error when the request limiter is at capacity. | ||
--- | ||
|
||
# Vault server temporarily overloaded | ||
|
||
Vault returns a `503 - Service Unavailable` response to indicate that a request | ||
was rejected because Vault has reached its in-flight request capacity: | ||
|
||
``` | ||
Error making API request. | ||
URL: PUT https://127.0.0.1:61555/v1/auth/userpass/login/foo | ||
Code: 503. Errors: | ||
* 1 error occurred: | ||
* Vault server temporarily overloaded | ||
``` | ||
|
||
`503 - Service Unavailable` is a retryable HTTP error, which is handled by the | ||
Vault API `Client` implementation. | ||
|
||
~> **NOTE**: `429 - Too Many Requests` is typically used to indicate that a | ||
specific client is issuing too many requests. The choice of `503 - Service | ||
Unavailable` for request rejection emphasizes that the server is | ||
temporarily under excess load, which may not be related to the behavior of a | ||
specific client. | ||
|
||
For more information on request rejection, refer to the [Request | ||
Limiter](/vault/docs/concepts/request-limiter) documentation. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
--- | ||
layout: docs | ||
page_title: Request Limiter - Configuration | ||
description: |- | ||
The Request Limiter mitigates overload scenarios in Vault by adaptively | ||
limiting in-flight requests based on latency measurements. | ||
--- | ||
|
||
# `request_limiter` | ||
|
||
@include 'alerts/enterprise-only.mdx' | ||
|
||
@include 'alerts/beta.mdx' | ||
|
||
The `request_limiter` stanza allows operators to turn on the adaptive | ||
concurrency limiter, which is off by default. The stanza is reloadable, so | ||
changes take effect on a configuration reload. | ||
|
||
```hcl | ||
request_limiter { | ||
disable = false | ||
} | ||
``` | ||
|
||
~> **Warning** This feature is still in Tech Preview. Turning the Request | ||
Limiter *on* may have negative effects on request success rates. Please test | ||
your workloads before turning this on in production. |
2 changes: 2 additions & 0 deletions
website/content/partials/telemetry-metrics/request-limiter-intro.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
Request Limiter metrics relate to request success signals observed by the | ||
request limiter and its current state. |
5 changes: 5 additions & 0 deletions
website/content/partials/telemetry-metrics/vault/core/request-limiter/dropped.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
### vault.core.limits.concurrency.dropped ((#vault-core-limits-concurrency-dropped)) | ||
|
||
Metric type | Value | Description | ||
----------- | ------- | ----------- | ||
counter | number | Number of significant request errors observed by the request limiter |
9 changes: 9 additions & 0 deletions
website/content/partials/telemetry-metrics/vault/core/request-limiter/ignored.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
### vault.core.limits.concurrency.ignored ((#vault-core-limits-concurrency-ignored)) | ||
|
||
Metric type | Value | Description | ||
----------- | ------- | ----------- | ||
counter | number | Number of ignored request errors observed by the request limiter | ||
|
||
Ignored request errors result from early request cancellation. These errors are | ||
discarded from request limiter measurements to prevent skewing of latency | ||
measurements. |
5 changes: 5 additions & 0 deletions
...t/partials/telemetry-metrics/vault/core/request-limiter/service_unavailable.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
### vault.core.limits.concurrency.service_unavailable ((#vault-core-limits-concurrency-service-unavailable)) | ||
|
||
Metric type | Value | Description | ||
----------- | ------- | ----------- | ||
counter | number | Number of requests rejected by the request limiter |
5 changes: 5 additions & 0 deletions
.../content/partials/telemetry-metrics/vault/core/request-limiter/special_path.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
### vault.core.limits.concurrency.special_path ((#vault-core-limits-concurrency-special-path)) | ||
|
||
Metric type | Value | Description | ||
----------- | ------- | ----------- | ||
gauge | number | Current number of allowed in-flight special-path requests |
5 changes: 5 additions & 0 deletions
website/content/partials/telemetry-metrics/vault/core/request-limiter/success.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
### vault.core.limits.concurrency.success ((#vault-core-limits-concurrency-success)) | ||
|
||
Metric type | Value | Description | ||
----------- | ------- | ----------- | ||
counter | number | Number of successful requests observed by the request limiter |
5 changes: 5 additions & 0 deletions
website/content/partials/telemetry-metrics/vault/core/request-limiter/write.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
### vault.core.limits.concurrency.write ((#vault-core-limits-concurrency-write)) | ||
|
||
Metric type | Value | Description | ||
----------- | ------- | ----------- | ||
gauge | number | Current number of allowed in-flight write requests |