Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: adjust rcmgr limits for accelerated DHT client rt refresh #8982

Merged
merged 3 commits into from
Jun 2, 2022

Conversation

guseggert
Copy link
Contributor

The Accelerated DHT client periodically refreshes its routing table,
including at startup, and if Resource Manager throttling causes the
client's routing table to be incomplete, then content routing may be
degraded or broken for users.

This adjusts the default limits to a level that empirically doesn't
cause Resource Manager throttling during initial DHT client
bootstrapping. Ideally the Accelerated DHT client would handle this
scenario more gracefully, but this works for now to unblock the 0.13
release.

@guseggert guseggert requested a review from aschmahmann May 18, 2022 20:42
@guseggert guseggert added this to the go-ipfs 0.13 milestone May 18, 2022
Comment on lines 44 to 45
defaultLimits.SystemBaseLimit.ConnsOutbound = logScale(4 * maxconns)
defaultLimits.SystemBaseLimit.Conns = logScale(8 * maxconns)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that only doing this in the context of 2*HighWater > defaultLimits.SystemBaseLimit.ConnsInbound makes sense. Even if your HighWater is lower (e.g. 100 connections) you might still want to allow for spikiness when your application demands it (e.g. for the accelerated DHT client routing table refreshes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a suggestion here? I'm mostly focused on making sure this works for the default scenario, not for folks tweaking connmgr or rcmgr settings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the downside of having the default outbound limits just be hardcoded here rather than a function of highwater?

To some extent I guess the question is how much "magic" is worthwhile here? If we just hard code static numbers then it's easier for folks to understand at the cost of us not being able to give users a "reasonable" default config according to parameters they've already set (i.e. highwater)

@BigLep
Copy link
Contributor

BigLep commented May 19, 2022

2022-05-19 conversation:

  1. Set outbound limits really high (e.g., 64k) (we're not touching inbound limits)
  2. @guseggert Smoke test that we no longer get the error
    We're not going to do wider scenario testing.

Pros:
This lets us ship
We at least get DoS protection on the inbound side

@guseggert guseggert force-pushed the feat/rcmgr-dht-defaults branch from cdb2a48 to 7520778 Compare May 19, 2022 17:47
@guseggert guseggert requested a review from aschmahmann May 19, 2022 19:35
@BigLep
Copy link
Contributor

BigLep commented May 23, 2022

I saw this comment coment come through:

Local debugging still shows errors due to a cap of ~2k connections, I'm not sure which limit this is coming from though.

I'm concerned about the "I'm not sure which limit this is coming from though". Is this made clear in the logs?

The Accelerated DHT client periodically refreshes its routing table,
including at startup, and if Resource Manager throttling causes the
client's routing table to be incomplete, then content routing may be
degraded or broken for users.

This adjusts the default limits to a level that empirically doesn't
cause Resource Manager throttling during initial DHT client
bootstrapping. Ideally the Accelerated DHT client would handle this
scenario more gracefully, but this works for now to unblock the 0.13
release.
@guseggert guseggert force-pushed the feat/rcmgr-dht-defaults branch from 6381d0e to 25372b0 Compare May 27, 2022 14:15
This also sets the default overall conns as a function of the outbound
and inbound conns, since they are adjusted dynamically, and it makes
the intention of the value clear.
@guseggert guseggert force-pushed the feat/rcmgr-dht-defaults branch from 25372b0 to 5c7c88f Compare June 1, 2022 19:58
@guseggert guseggert force-pushed the feat/rcmgr-dht-defaults branch from ca4dd40 to f26c96c Compare June 2, 2022 14:09
@guseggert guseggert merged commit b8617b9 into master Jun 2, 2022
@guseggert guseggert deleted the feat/rcmgr-dht-defaults branch June 2, 2022 14:23
guseggert added a commit that referenced this pull request Jun 8, 2022
* fix: adjust rcmgr limits for accelerated DHT client rt refresh

The Accelerated DHT client periodically refreshes its routing table,
including at startup, and if Resource Manager throttling causes the
client's routing table to be incomplete, then content routing may be
degraded or broken for users.

This adjusts the default limits to a level that empirically doesn't
cause Resource Manager throttling during initial DHT client
bootstrapping. Ideally the Accelerated DHT client would handle this
scenario more gracefully, but this works for now to unblock the 0.13
release.

* Set default outbound conns unconditionally

This also sets the default overall conns as a function of the outbound
and inbound conns, since they are adjusted dynamically, and it makes
the intention of the value clear.

* increase min FD limit
guseggert added a commit that referenced this pull request Jun 8, 2022
* fix: adjust rcmgr limits for accelerated DHT client rt refresh

The Accelerated DHT client periodically refreshes its routing table,
including at startup, and if Resource Manager throttling causes the
client's routing table to be incomplete, then content routing may be
degraded or broken for users.

This adjusts the default limits to a level that empirically doesn't
cause Resource Manager throttling during initial DHT client
bootstrapping. Ideally the Accelerated DHT client would handle this
scenario more gracefully, but this works for now to unblock the 0.13
release.

* Set default outbound conns unconditionally

This also sets the default overall conns as a function of the outbound
and inbound conns, since they are adjusted dynamically, and it makes
the intention of the value clear.

* increase min FD limit

(cherry picked from commit b8617b9)
guseggert added a commit that referenced this pull request Jun 8, 2022
* fix: adjust rcmgr limits for accelerated DHT client rt refresh

The Accelerated DHT client periodically refreshes its routing table,
including at startup, and if Resource Manager throttling causes the
client's routing table to be incomplete, then content routing may be
degraded or broken for users.

This adjusts the default limits to a level that empirically doesn't
cause Resource Manager throttling during initial DHT client
bootstrapping. Ideally the Accelerated DHT client would handle this
scenario more gracefully, but this works for now to unblock the 0.13
release.

* Set default outbound conns unconditionally

This also sets the default overall conns as a function of the outbound
and inbound conns, since they are adjusted dynamically, and it makes
the intention of the value clear.

* increase min FD limit

(cherry picked from commit b8617b9)
guseggert added a commit that referenced this pull request Jun 8, 2022
* fix: adjust rcmgr limits for accelerated DHT client rt refresh

The Accelerated DHT client periodically refreshes its routing table,
including at startup, and if Resource Manager throttling causes the
client's routing table to be incomplete, then content routing may be
degraded or broken for users.

This adjusts the default limits to a level that empirically doesn't
cause Resource Manager throttling during initial DHT client
bootstrapping. Ideally the Accelerated DHT client would handle this
scenario more gracefully, but this works for now to unblock the 0.13
release.

* Set default outbound conns unconditionally

This also sets the default overall conns as a function of the outbound
and inbound conns, since they are adjusted dynamically, and it makes
the intention of the value clear.

* increase min FD limit

(cherry picked from commit b8617b9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants