backport of commit 9ddd23c (#25679)
Co-authored-by: Mike Palmiotto <[email protected]>
hc-github-team-secure-vault-core and mpalmi authored Feb 27, 2024
1 parent 1e5763c commit 0627bb9
Showing 14 changed files with 220 additions and 0 deletions.
74 changes: 74 additions & 0 deletions website/content/docs/concepts/request-limiter/index.mdx
@@ -0,0 +1,74 @@
---
layout: docs
page_title: 'Request Limiter'
description: >-
  Vault provides an adaptive concurrency limiter to protect the Vault server
  from overload.
---

# Request Limiter

@include 'alerts/enterprise-only.mdx'

@include 'alerts/beta.mdx'

This document contains conceptual information about the **Request Limiter** and
its user-facing effects.

## Preventing overload

The Request Limiter aims to prevent overload by proactively detecting latency
deviation from a baseline and adapting the number of allowed in-flight requests.

This is done in two phases at the beginning of an HTTP request:

1. Consult the current number of allowed in-flight requests. If the new request
would exceed this limit, immediately reject it, indicating that the client
should retry later.

2. If the request is allowed, begin a measurement of its latency, allowing the
Request Limiter to calculate a new limit.
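
The two phases can be sketched roughly as follows. This is a minimal illustration, not Vault's implementation: the class name `AdaptiveLimiter`, the `handle` helper, the halve-or-increment adjustment rule, and every default value are assumptions chosen to keep the example short.

```python
import threading
import time


class AdaptiveLimiter:
    """Toy adaptive concurrency limiter (illustrative; not Vault's implementation)."""

    def __init__(self, initial_limit=10, min_limit=1, max_limit=100, tolerance=2.0):
        self.limit = initial_limit
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.tolerance = tolerance  # allowed latency multiple over the baseline
        self.baseline = None        # lowest latency seen so far, in seconds
        self.in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        """Phase 1: admit the request only if it fits under the current limit."""
        with self._lock:
            if self.in_flight >= self.limit:
                return False  # the caller should answer 503 so the client retries
            self.in_flight += 1
            return True

    def release(self, latency):
        """Phase 2: record the measured latency and recompute the limit."""
        with self._lock:
            self.in_flight -= 1
            if self.baseline is None or latency < self.baseline:
                self.baseline = latency
            if latency > self.baseline * self.tolerance:
                # Latency deviates from the baseline: halve the limit.
                self.limit = max(self.min_limit, self.limit // 2)
            else:
                # Latency looks healthy: probe for one more slot.
                self.limit = min(self.max_limit, self.limit + 1)


def handle(limiter, work, clock=time.monotonic):
    """Run `work` under the limiter and return an HTTP-style status code."""
    if not limiter.try_acquire():
        return 503
    start = clock()
    try:
        work()
        return 200
    finally:
        limiter.release(clock() - start)
```

A rejected request costs only a counter comparison under a lock, which is why the rejection can be immediate. The adjustment rule here is deliberately simple; a production limiter would likely smooth its latency samples rather than react to a single measurement.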

## Resource constraints

The Request Limiter intentionally focuses on preventing overload derived from
resource-constrained operations on the Vault server. Vault focuses on two
specific types of resource constraints which commonly cause issues in production
workloads:

1. Write latency in the storage backend, resulting in a growing queue of updates
to be flushed. These writes originate primarily from `Write`-based HTTP methods.

2. CPU utilization caused by computationally expensive PKI issue requests
(generally for RSA keys). Large numbers of these requests can consume all CPU
resources, preventing timely processing of other requests such as heartbeats and
health checks.

Storage constraints can be accounted for by limiting logical requests according
to their `http.Method`. Vault measures and limits only requests with
`Write`-based HTTP methods. Read requests do not generally cause storage
updates, so their latencies are unlikely to be correlated with storage
constraints.

CPU constraints are accounted for using the same underlying library and
technique; however, they require special treatment. The maximum number of
concurrent `pki/issue` requests found in testing (again, specifically for RSA
keys) is far lower than the minimum tolerable write request rate, so these
requests are tracked against their own limit.
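
As a rough sketch of how requests could be sorted into these two categories: the pool names below borrow from the `write` and `special_path` metric names in this change, but the set of verbs treated as write-based and the `<mount>/issue/<role>` path check are assumptions made for illustration, and `limiter_pool` is a hypothetical helper, not a Vault function.

```python
from typing import Optional

# Assumption: the text does not list which verbs count as "Write"-based.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def limiter_pool(method: str, path: str) -> Optional[str]:
    """Return the limiter pool for a request, or None if it is not limited."""
    if method.upper() not in WRITE_METHODS:
        return None  # reads such as GET and LIST bypass the limiter
    parts = path.strip("/").split("/")
    # Assumption: a PKI issue request ends in .../issue/<role>.
    if len(parts) >= 3 and parts[-2] == "issue":
        return "special_path"
    return "write"
```

Keeping the CPU-bound requests in their own pool means a burst of certificate issuance exhausts only its own, much smaller limit instead of the limit shared by all other writes.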

In both cases, the Request Limiter throttles utilization before Vault reaches
a degraded state. The resulting `503 - Service Unavailable` is a retryable
HTTP response code, so a rejected request can be retried and eventually
succeed. Clients should retry with jitter and exponential backoff. Vault's API
`Client` implementation already does this, using the go-retryablehttp library.
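
For clients that do not use Vault's API `Client`, retrying with jitter and exponential backoff might look like the sketch below. The function names, the `send` callable, and the base, cap, and retry-count values are all hypothetical; they are not the defaults of Vault's client or of go-retryablehttp.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Full-jitter backoff: a random delay below min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, max_retries=3, sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-503 status or retries run out."""
    for attempt in range(max_retries + 1):
        status = send()
        if status != 503:
            return status
        if attempt < max_retries:
            # Wait longer after each rejection, with jitter to spread clients out.
            sleep(backoff_delay(attempt, rng=rng))
    return status
```

The jitter matters as much as the backoff: if every rejected client waited the same interval, they would all return at once and hit the limiter together again.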

## Read requests

HTTP methods such as `GET` and `LIST` are not subject to write request
limiting. This allows operators to continue querying server state without
needing to retry.

## Vault server overloaded

When Vault has reached capacity, new requests will be immediately rejected with a
retryable `503 - Service Unavailable`
[error](/vault/docs/concepts/request-limiter/vault-server-temporarily-overloaded).
@@ -0,0 +1,33 @@
---
layout: docs
page_title: Vault server temporarily overloaded
description: |-
  Vault Enterprise error when the request limiter is at capacity.
---

# Vault server temporarily overloaded

Vault returns a `503 - Service Unavailable` response to indicate that a request
was rejected after Vault has reached its in-flight request capacity:

```
Error making API request.
URL: PUT https://127.0.0.1:61555/v1/auth/userpass/login/foo
Code: 503. Errors:
* 1 error occurred:
* Vault server temporarily overloaded
```

`503 - Service Unavailable` is a retryable HTTP error, which is handled by the
Vault API `Client` implementation.

~> **NOTE**: `429 - Too Many Requests` is typically used to indicate that a
specific client is issuing too many requests. The choice of `503 - Service
Unavailable` for request rejection emphasizes that the server is
temporarily under excess load, which may not be related to the behavior of a
specific client.

For more information on request rejection, refer to the [Request
Limiter](/vault/docs/concepts/request-limiter) documentation.
4 changes: 4 additions & 0 deletions website/content/docs/configuration/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -236,6 +236,9 @@ can have a negative effect on performance due to the tracking of each lock attem
When `imprecise_lease_role_tracking` is set to true and a new role-based quota is enabled, subsequent lease counts start from 0.
`imprecise_lease_role_tracking` affects role-based lease count quotas, but reduces latencies when not using role based quotas.

- `request_limiter` `([Request Limiter][request-limiter]: <none>)` – Allows
operators to enable Vault's Request Limiter functionality.

### High availability parameters

The following parameters are used on backends that support [high availability][high-availability].
@@ -288,3 +291,4 @@ The following parameters are only used with Vault Enterprise
[sentinel]: /vault/docs/configuration/sentinel
[high-availability]: /vault/docs/concepts/ha
[plugins]: /vault/docs/plugins
[request-limiter]: /vault/docs/concepts/request-limiter
4 changes: 4 additions & 0 deletions website/content/docs/configuration/listener/tcp.mdx
@@ -237,6 +237,10 @@ default value in the `"/sys/config/ui"` [API endpoint](/vault/api-docs/system/co
- `disable_replication_status_endpoints` `(bool: false)` - Disables replication
status endpoints for the configured listener when set to `true`.

- `disable_request_limiter` `(bool: false)` - Disables the request limiter for
this listener when set to `true`. By default, the listener honors the global
[configuration](/vault/docs/configuration/request-limiter).

### `telemetry` parameters

- `unauthenticated_metrics_access` `(bool: false)` - If set to true, allows
26 changes: 26 additions & 0 deletions website/content/docs/configuration/request-limiter.mdx
@@ -0,0 +1,26 @@
---
layout: docs
page_title: Request Limiter - Configuration
description: |-
  The Request Limiter mitigates overload scenarios in Vault by adaptively
  limiting in-flight requests based on latency measurements.
---

# `request_limiter`

@include 'alerts/enterprise-only.mdx'

@include 'alerts/beta.mdx'

The `request_limiter` stanza allows operators to turn on the adaptive
concurrency limiter, which is off by default. This is a reloadable config.

```hcl
request_limiter {
  disable = false
}
```

~> **Warning** This feature is still in Tech Preview. Turning the Request
Limiter *on* may have negative effects on request success rates. Please test
your workloads before turning this on in production.
16 changes: 16 additions & 0 deletions website/content/docs/internals/telemetry/metrics/core-system.mdx
@@ -94,6 +94,22 @@ Vault instance.

@include 'telemetry-metrics/vault/quota/rate_limit/violation.mdx'

## Request limiter metrics

@include 'telemetry-metrics/request-limiter-intro.mdx'

@include 'telemetry-metrics/vault/core/request-limiter/write.mdx'

@include 'telemetry-metrics/vault/core/request-limiter/special_path.mdx'

@include 'telemetry-metrics/vault/core/request-limiter/service_unavailable.mdx'

@include 'telemetry-metrics/vault/core/request-limiter/success.mdx'

@include 'telemetry-metrics/vault/core/request-limiter/dropped.mdx'

@include 'telemetry-metrics/vault/core/request-limiter/ignored.mdx'

## Rollback metrics

@include 'telemetry-metrics/rollback-intro.mdx'
@@ -0,0 +1,2 @@
Request Limiter metrics relate to request success signals observed by the
request limiter and its current state.
@@ -0,0 +1,5 @@
### vault.core.limits.concurrency.dropped ((#vault-core-limits-concurrency-dropped))

Metric type | Value | Description
----------- | ------- | -----------
counter | number | Number of significant request errors observed by the request limiter
@@ -0,0 +1,9 @@
### vault.core.limits.concurrency.ignored ((#vault-core-limits-concurrency-ignored))

Metric type | Value | Description
----------- | ------- | -----------
counter | number | Number of ignored request errors observed by the request limiter

Ignored request errors result from early request cancellation. These errors are
discarded from request limiter measurements to prevent skewing of latency
measurements.
@@ -0,0 +1,5 @@
### vault.core.limits.concurrency.service_unavailable ((#vault-core-limits-concurrency-service-unavailable))

Metric type | Value | Description
----------- | ------- | -----------
counter | number | Number of requests rejected by the request limiter
@@ -0,0 +1,5 @@
### vault.core.limits.concurrency.special_path ((#vault-core-limits-concurrency-special-path))

Metric type | Value | Description
----------- | ------- | -----------
gauge | number | Current number of allowed in-flight special-path requests
@@ -0,0 +1,5 @@
### vault.core.limits.concurrency.success ((#vault-core-limits-concurrency-success))

Metric type | Value | Description
----------- | ------- | -----------
counter | number | Number of successful requests observed by the request limiter
@@ -0,0 +1,5 @@
### vault.core.limits.concurrency.write ((#vault-core-limits-concurrency-write))

Metric type | Value | Description
----------- | ------- | -----------
gauge | number | Current number of allowed in-flight write requests
27 changes: 27 additions & 0 deletions website/data/docs-nav-data.json
@@ -300,6 +300,29 @@
"path": "concepts/filtering/audit"
}
]
},
{
"title": "Request Limiter",
"badge": {
"text": "ENTERPRISE",
"type": "outlined",
"color": "neutral"
},
"routes": [
{
"title": "Overview",
"path": "concepts/request-limiter",
"badge": {
"text": "BETA",
"type": "outlined",
"color": "highlight"
}
},
{
"title": "Vault server temporarily overloaded",
"path": "concepts/request-limiter/vault-server-temporarily-overloaded"
}
]
}
]
},
@@ -508,6 +531,10 @@
"title": "<code>telemetry</code>",
"path": "configuration/telemetry"
},
{
"title": "<code>Request Limiter</code>",
"path": "configuration/request-limiter"
},
{
"title": "<code>ui</code>",
"path": "configuration/ui"