ResourceManager: Adjust inbound connection limits depending on memory. #9593

Merged
merged 3 commits into from
Jan 30, 2023
71 changes: 30 additions & 41 deletions core/node/libp2p/rcmgr_defaults.go
@@ -44,17 +44,18 @@ var noLimitIncrease = rcmgr.BaseLimitIncrease{
// This file defines implicit limit defaults used when Swarm.ResourceMgr.Enabled

// createDefaultLimitConfig creates LimitConfig to pass to libp2p's resource manager.
// The defaults follow the documentation in docs/config.md.
// The defaults follow the documentation in docs/libp2p-resource-management.md.
// Any changes in the logic here should be reflected there.
func createDefaultLimitConfig(cfg config.SwarmConfig) (rcmgr.LimitConfig, error) {
maxMemoryDefaultString := humanize.Bytes(uint64(memory.TotalMemory()) / 4)
maxMemoryDefaultString := humanize.Bytes(uint64(memory.TotalMemory()) / 2)
maxMemoryString := cfg.ResourceMgr.MaxMemory.WithDefault(maxMemoryDefaultString)
maxMemory, err := humanize.ParseBytes(maxMemoryString)
if err != nil {
return rcmgr.LimitConfig{}, err
}

numFD := cfg.ResourceMgr.MaxFileDescriptors.WithDefault(int64(fd.GetNumFDs()) / 2)
maxMemoryMB := maxMemory / (1024 * 1024)
maxFD := int(cfg.ResourceMgr.MaxFileDescriptors.WithDefault(int64(fd.GetNumFDs()) / 2))

// We want to see this message on startup, that's why we are using fmt instead of log.
fmt.Printf(`
@@ -65,65 +66,53 @@ Computing default go-libp2p Resource Manager limits based on:
Applying any user-supplied overrides on top.
Run 'ipfs swarm limit all' to see the resulting limits.

`, maxMemoryString, numFD)
`, maxMemoryString, maxFD)

// At least as of 2023-01-25, it's possible to open a connection that
// doesn't ask for any memory usage with the libp2p Resource Manager/Accountant
// (see https://github.com/libp2p/go-libp2p/issues/2010#issuecomment-1404280736).
// As a result, we can't currently rely on Memory limits to fully protect us.
// Until https://github.com/libp2p/go-libp2p/issues/2010 is addressed,
// we use a proxy for now: restricting to 1 inbound connection per MB.
// Note: this is more generous than go-libp2p's default autoscaled limits which do
// 64 connections per 1GB
// (see https://github.com/libp2p/go-libp2p/blob/master/p2p/host/resource-manager/limit_defaults.go#L357 ).
systemConnsInbound := int(1 * maxMemoryMB)

scalingLimitConfig := rcmgr.ScalingLimitConfig{
SystemBaseLimit: rcmgr.BaseLimit{
Memory: int64(maxMemory),
FD: int(numFD),
FD: maxFD,

// By default, we just limit connections on the inbound side.
Conns: bigEnough,
ConnsInbound: rcmgr.DefaultLimits.SystemBaseLimit.ConnsInbound, // same as libp2p default
ConnsInbound: systemConnsInbound,
ConnsOutbound: bigEnough,

// We limit streams since they take up both memory and CPU.
// The Memory limit protects us on the memory side,
// but a StreamsInbound limit helps protect against unbound CPU consumption from stream processing.
Streams: bigEnough,
StreamsInbound: rcmgr.DefaultLimits.SystemBaseLimit.StreamsInbound,
StreamsInbound: bigEnough,
StreamsOutbound: bigEnough,
},
// Most limits don't see an increase because they're already infinite/bigEnough or at their max value.
// The values that should scale based on the amount of memory allocated to libp2p need to increase accordingly.
SystemLimitIncrease: rcmgr.BaseLimitIncrease{
Memory: 0,
FDFraction: 0,

Conns: 0,
ConnsInbound: rcmgr.DefaultLimits.SystemLimitIncrease.ConnsInbound,
ConnsOutbound: 0,

Streams: 0,
StreamsInbound: rcmgr.DefaultLimits.SystemLimitIncrease.StreamsInbound,
StreamsOutbound: 0,
},
SystemLimitIncrease: noLimitIncrease,

// Transient connections won't cause any memory to be accounted for by the resource manager.
// Only established connections do.
// As a result, we can't rely on System.Memory to protect us from a bunch of transient connections being opened.
// We limit the same values as the System scope, but only allow the Transient scope to take 25% of what is allowed for the System scope.
TransientBaseLimit: rcmgr.BaseLimit{
Memory: rcmgr.DefaultLimits.TransientBaseLimit.Memory,
FD: rcmgr.DefaultLimits.TransientBaseLimit.FD,
Memory: int64(maxMemory / 4),
FD: maxFD / 4,

Conns: bigEnough,
ConnsInbound: rcmgr.DefaultLimits.TransientBaseLimit.ConnsInbound,
ConnsInbound: systemConnsInbound / 4,
ConnsOutbound: bigEnough,

Streams: bigEnough,
StreamsInbound: rcmgr.DefaultLimits.TransientBaseLimit.StreamsInbound,
StreamsInbound: bigEnough,
StreamsOutbound: bigEnough,
},

TransientLimitIncrease: rcmgr.BaseLimitIncrease{
Memory: rcmgr.DefaultLimits.TransientLimitIncrease.Memory,
FDFraction: rcmgr.DefaultLimits.TransientLimitIncrease.FDFraction,

Conns: 0,
ConnsInbound: rcmgr.DefaultLimits.TransientLimitIncrease.ConnsInbound,
ConnsOutbound: 0,

Streams: 0,
StreamsInbound: rcmgr.DefaultLimits.TransientLimitIncrease.StreamsInbound,
StreamsOutbound: 0,
},
TransientLimitIncrease: noLimitIncrease,

// Let's get out of the way of the allow list functionality.
// If someone specified "Swarm.ResourceMgr.Allowlist" we should let it go through.
@@ -184,7 +173,7 @@ Run 'ipfs swarm limit all' to see the resulting limits.
// Whatever limits libp2p has specifically tuned for its protocols/services we'll apply.
libp2p.SetDefaultServiceLimits(&scalingLimitConfig)

defaultLimitConfig := scalingLimitConfig.Scale(int64(maxMemory), int(numFD))
defaultLimitConfig := scalingLimitConfig.Scale(int64(maxMemory), maxFD)

// Simple checks to override autoscaling, ensuring limits make sense versus the connmgr values.
// There are ways to break this, but this should catch most problems already.
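The "1 inbound connection per MB" proxy described in the hunk above can be illustrated with a minimal, standalone sketch. The function names here are hypothetical and not part of Kubo or go-libp2p. The second function only applies the "64 connections per 1GB" ratio quoted in the code comment and ignores any base limit go-libp2p adds, so treat it as a rough comparison rather than the library's actual scaling logic.

```go
package main

import "fmt"

// connsInboundPerMB mirrors the proxy from the diff: one inbound
// connection per MB of memory granted to libp2p.
// The name is hypothetical; Kubo computes this inline.
func connsInboundPerMB(maxMemoryBytes uint64) int {
	maxMemoryMB := maxMemoryBytes / (1024 * 1024)
	return int(1 * maxMemoryMB)
}

// connsInboundAt64PerGB applies only the "64 connections per 1GB" ratio
// quoted in the code comment. This is an assumption-level simplification:
// it leaves out any base limit that go-libp2p's autoscaling may add.
func connsInboundAt64PerGB(maxMemoryBytes uint64) int {
	return int(maxMemoryBytes * 64 / (1 << 30))
}

func main() {
	for _, gib := range []uint64{1, 2, 4} {
		mem := gib << 30
		fmt.Printf("%d GiB: 1 per MB = %d, 64 per GB = %d\n",
			gib, connsInboundPerMB(mem), connsInboundAt64PerGB(mem))
	}
}
```

Under these assumptions the per-MB proxy allows 16 times as many inbound connections as the quoted ratio, which matches the "more generous" remark in the comment.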
37 changes: 25 additions & 12 deletions docs/changelogs/v0.18.md
@@ -2,14 +2,14 @@

## v0.18.1

This release includes improvements around Pubsub message deduplication, and more.

This release includes improvements around Pubsub message deduplication, libp2p resource management, and more.

<!-- TOC depthfrom:3 -->

- [Overview](#overview)
- [🔦 Highlights](#-highlights)
- [New default Pubsub.SeenMessagesStrategy](#new-default-pubsubseenmessagesstrategy)
- [Improving libp2p resource management integration](#improving-libp2p-resource-management-integration)
- [📝 Changelog](#-changelog)
- [👨‍👩‍👧‍👦 Contributors](#-contributors)

@@ -33,11 +33,24 @@ If you prefer the old behavior, which calculates the TTL countdown based on the
first time a message is seen, you can set `Pubsub.SeenMessagesStrategy` to
`first-seen`.

#### Improving libp2p resource management integration

This builds on the default protection nodes get against DoS (resource exhaustion) and eclipse attacks
with the [go-libp2p Network Resource Manager/Accountant](https://github.com/ipfs/kubo/blob/master/docs/libp2p-resource-management.md)
that was fine-tuned in [Kubo 0.18](https://github.com/ipfs/kubo/blob/biglep/resource-manager-example-of-what-want/docs/changelogs/v0.18.md#improving-libp2p-resource-management-integration).

Adding default hard limits from the Resource Manager/Accountant after the fact is tricky,
and this release further refines the [computed defaults](https://github.com/ipfs/kubo/blob/master/docs/libp2p-resource-management.md#computed-default-limits).
As much as possible, the aim is for a user to only think about how much memory they want to bound libp2p to,
and not need to think about translating that to hard numbers for connections, streams, etc.
More updates are likely in future Kubo releases, but with this release:
1. ``System.StreamsInbound`` is no longer bounded directly.
2. ``System.ConnsInbound``, ``Transient.Memory``, and ``Transient.ConnsInbound`` have higher default computed values.
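
As a rough illustration of how these defaults follow from a single memory budget, here is a minimal sketch based on the changes to `core/node/libp2p/rcmgr_defaults.go` in this PR. The `derivedLimits` struct and `deriveLimits` function are hypothetical names used only for this example; Kubo fills in `rcmgr.BaseLimit` values instead, and the later adjustments against the connmgr values are not modelled.

```go
package main

import "fmt"

// derivedLimits is a hypothetical container for the values this PR
// derives from the memory budget; it is not a Kubo or go-libp2p type.
type derivedLimits struct {
	SystemMemory          int64
	SystemConnsInbound    int
	TransientMemory       int64
	TransientConnsInbound int
}

// deriveLimits follows the arithmetic shown in the diff:
// one inbound connection per MB at the System scope, and the
// Transient scope gets 25% of the System scope's values.
func deriveLimits(maxMemory uint64) derivedLimits {
	maxMemoryMB := maxMemory / (1024 * 1024)
	systemConnsInbound := int(1 * maxMemoryMB)
	return derivedLimits{
		SystemMemory:          int64(maxMemory),
		SystemConnsInbound:    systemConnsInbound,
		TransientMemory:       int64(maxMemory / 4),
		TransientConnsInbound: systemConnsInbound / 4,
	}
}

func main() {
	// Example budget of 4 GiB, chosen only for illustration.
	fmt.Printf("%+v\n", deriveLimits(4<<30))
}
```

With a 4 GiB budget this sketch yields 4096 inbound connections at the System scope and 1024 at the Transient scope, so a user only has to pick the memory figure.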

### 📝 Changelog

### 👨‍👩‍👧‍👦 Contributors


## v0.18.0

### Overview
@@ -46,22 +59,22 @@ Below is an outline of all that is in this release, so you get a sense of all th

<!-- TOC depthfrom:3 -->

- [Overview](#overview)
- [🔦 Highlights](#-highlights)
- [Content routing](#content-routing)
- [Overview](#overview)
- [🔦 Highlights](#-highlights)
- [Content routing](#content-routing)
- [Default InterPlanetary Network Indexer](#default-interplanetary-network-indexer)
- [Increase provider record republish interval and expiration](#increase-provider-record-republish-interval-and-expiration)
- [Gateways](#gateways)
- [DAG-JSON and DAG-CBOR response formats](#dag-json-and-dag-cbor-response-formats)
- [Gateways](#gateways)
- [(DAG-)JSON and (DAG-)CBOR response formats](#dag-json-and-dag-cbor-response-formats)
- [🐎 Fast directory listings with DAG sizes](#-fast-directory-listings-with-dag-sizes)
- [QUIC and WebTransport](#quic-and-webtransport)
- [QUIC and WebTransport](#quic-and-webtransport)
- [WebTransport enabled by default](#webtransport-enabled-by-default)
- [QUIC and WebTransport share a single port](#quic-and-webtransport-share-a-single-port)
- [Differentiating QUIC versions](#differentiating-quic-versions)
- [QUICv1 and WebTransport config migration](#quicv1-and-webtransport-config-migration)
- [Improving libp2p resource management integration](#improving-libp2p-resource-management-integration)
- [📝 Changelog](#-changelog)
- [👨‍👩‍👧‍👦 Contributors](#-contributors)
- [Improving libp2p resource management integration](#improving-libp2p-resource-management-integration)
- [📝 Changelog](#-changelog)
- [👨‍👩‍👧‍👦 Contributors](#-contributors)

<!-- /TOC -->

2 changes: 1 addition & 1 deletion docs/config.md
@@ -1843,7 +1843,7 @@ This value is also used to scale the limit on various resources at various scope
when the default limits (discussed in [libp2p resource management](./libp2p-resource-management.md)) are used.
For example, increasing this value will increase the default limit for incoming connections.

Default: `[TOTAL_SYSTEM_MEMORY]/4`
Default: `[TOTAL_SYSTEM_MEMORY]/2`
@lidel (Member) commented on Jan 27, 2023:

💭 (not familiar with prior discussions, someone else needs to review, but adding comment in favor of this change)

👍 I know @2color ran into issues when he was running on a https://fly.io/ box that had 2 GB RAM, and his libp2p stack only got 1/4 (512MB).

This should improve default behavior in such setups (allocating 1GB).

Type: `optionalBytes`
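
To make the effect of the new default concrete, the following minimal sketch compares the old (`/4`) and new (`/2`) divisors for a machine with 2 GiB of RAM, the case raised in the review comment. The `defaultMaxMemory` helper is hypothetical, and binary units (1 GiB = 1024 MiB) are assumed; Kubo itself derives the value from `memory.TotalMemory()`.

```go
package main

import "fmt"

// defaultMaxMemory is a hypothetical helper that returns the share of
// total system memory granted to libp2p for a given divisor.
func defaultMaxMemory(totalMemory, divisor uint64) uint64 {
	return totalMemory / divisor
}

func main() {
	const mib = 1 << 20
	total := uint64(2) << 30 // assumed 2 GiB machine

	oldDefault := defaultMaxMemory(total, 4)
	newDefault := defaultMaxMemory(total, 2)

	fmt.Printf("old default: %d MiB\n", oldDefault/mib)
	fmt.Printf("new default: %d MiB\n", newDefault/mib)
}
```

Under these assumptions the budget doubles from 512 MiB to 1024 MiB, which in turn doubles any limit computed from `Swarm.ResourceMgr.MaxMemory`.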

#### `Swarm.ResourceMgr.MaxFileDescriptors`
4 changes: 1 addition & 3 deletions docs/libp2p-resource-management.md
@@ -70,8 +70,7 @@ The reason these scopes are chosen is because:

Within these scopes, limits are just set on
[memory](https://github.com/libp2p/go-libp2p/tree/master/p2p/host/resource-manager#memory),
[file descriptors (FD)](https://github.com/libp2p/go-libp2p/tree/master/p2p/host/resource-manager#file-descriptors), [*inbound* connections](https://github.com/libp2p/go-libp2p/tree/master/p2p/host/resource-manager#connections),
and [*inbound* streams](https://github.com/libp2p/go-libp2p/tree/master/p2p/host/resource-manager#streams).
[file descriptors (FD)](https://github.com/libp2p/go-libp2p/tree/master/p2p/host/resource-manager#file-descriptors), and [*inbound* connections](https://github.com/libp2p/go-libp2p/tree/master/p2p/host/resource-manager#connections).
Limits are set based on the `Swarm.ResourceMgr.MaxMemory` and `Swarm.ResourceMgr.MaxFileDescriptors` inputs above.

There are also some special cases where minimum values are enforced.
@@ -89,7 +88,6 @@ These become the [active limits](#how-does-one-see-the-active-limits).

While `Swarm.ResourceMgr.Limits` can be edited directly, it is also possible to use the `ipfs swarm limit` command to inspect and tweak specific limits at runtime.


To see all resources that are close to hitting their respective limit:

```console