-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplifying rcmgr_defaults.go #9351
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The goal of this PR is to show an easier set of defaults for resource manager to reason about. This is an attempt to address #9322 The basic idea is: 1. Use these inputs: - maxMemory: can be set by user or default 1/8th the system memory - maxFD: can be set by user or default 1/2 the system limit - maxConns: can be set as the connection manager high water mark or defaults to infinity 2. Only set limits at the system and transient scope, and even there, mostly just focus on memory, FD, and inbound connections. Ingore outbout connections and stream limits. 3. Apply any limits that libp2p has for its protocols/services. This PR is not intended to be merged as is. It's not complete, undoubtedly has syntax errors, I haven't run tests, etc. It was done as a starting point to communicate specifically on how I think we can simplify the default story.
Closed
Closing this since the ideas are being incorporated into #9338 |
ajnavarro
added a commit
that referenced
this pull request
Nov 10, 2022
This PR adds several new functionalities to make easier the usage of ResourceManager: - Now resource manager logs when resources are exceeded are on ERROR instead of warning. - The resources exceeded error now shows what kind of limit was reached and the scope. - When there was no limit exceeded, we print a message for the user saying that limits are not exceeded anymore. - Added `swarm limit all` command to show all set limits with the same format as `swarm stats all` - Added `min-used-limit-perc` option to `swarm stats all` to only show stats that are above a specific percentage - Simplify a lot default values. - **Enable ResourceManager by default.** Output example: ``` 2022-11-09T10:51:40.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:51:50.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 483095 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:51:50.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:52:00.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 455294 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:00.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:52:10.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 471384 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:10.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:52:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 8 times with error "peer:12D3KooWKqcaBtcmZKLKCCoDPBuA6AXGJMNrLQUPPMsA5Q6D1eG6: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 192 times with error "peer:12D3KooWPjetWPGQUih9LZTGHdyAM9fKaXtUxDyBhA93E3JAWCXj: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 469746 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:52:30.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 484137 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:30.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 29 times with error "peer:12D3KooWPjetWPGQUih9LZTGHdyAM9fKaXtUxDyBhA93E3JAWCXj: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:30.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:52:40.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 468843 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:40.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:52:50.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 366638 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:52:50.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:53:00.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 405526 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:53:00.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 107 times with error "peer:12D3KooWQZQCwevTDGhkE9iGYk5sBzWRDUSX68oyrcfM9tXyrs2Q: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:53:00.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:53:10.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 336923 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:53:10.566+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:53:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:55 Resource limits were exceeded 71 times with error "transient: cannot reserve inbound stream: resource limit exceeded". 2022-11-09T10:53:20.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:59 Consider inspecting logs and raising the resource manager limits. Documentation: https://github.com/ipfs/kubo/blob/master/docs/config.md#swarmresourcemgr 2022-11-09T10:53:30.565+0100 ERROR resourcemanager libp2p/rcmgr_logging.go:64 Resrouce limits are no longer being exceeded. ``` ## Validation tests - Accelerated DHT client runs with no errors when ResourceManager is active. No problems were observed. - Running an attack with 200 connections and 1M streams using yamux protocol. Node was usable during the attack. With ResourceManager deactivated, the node was killed by the OS because of the amount of memory consumed. - Actions done when the attack was active: - Add files - Force a reprovide - Use the gateway to resolve an IPNS address. It closes #9001 It closes #9351 It closes #9322
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The goal of this PR is to show an easier set of defaults for resource manager to reason about. This is an attempt help address #9322 . This PR is not intended to be merged as is. It's not complete, undoubtedly has syntax errors, I haven't run tests, etc. It was done as a starting point to communicate specifically on how I think we can simplify the default story.
The basic idea is:
Only set limits at the system and transient scope, and even there, mostly just focus on memory, FD, and inbound connections. Limits are set based on the inputs. We generally ignore outbound connections and stream limits.
Apply any limits that libp2p has for its protocols/services.