Fix calculation and implementation of dial ratio and the max dialed connections limit #1670
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Fixes two behaviors in
p2p/Server
which cause it to get more dialed connections than it should based on the dial ratio (which in turn impairs the ability of new nodes joining the network to find peers with open incoming connection slots):celo-blockchain/p2p/dial.go
Line 381 in 147571e
Initially I was planning to just fix #1669. But then when I tested it out and saw that it still ends up with more dialed peers than it should, I remembered this
* 2
and realized that it would have to be removed in order to get the desired behavior.This
* 2
was introduced upstream in ethereum/go-ethereum#20592. Prior to that, it would attempt one dial for every additional dialed peer that it wanted: https://github.com/ethereum/go-ethereum/blob/58cf5686eab9019cc01e202e846a6bbc70a3301d/p2p/dial.go#L149-L160The new behavior was introduced in that PR, but was not mentioned in the PR description, commented on in the code, nor commented on during PR review. But, seeing as it impairs the functioning of the dial ratio and geth was doing fine without it up to 1.9.10, I don't see a problem with removing it. It means that you will be slower to dial peers initially (though there is at any rate the
maxActiveDials
limit which defaults to 50) and if a dialed peer connection drops you will try to refill it one at a time instead of two at a time (if that makes sense), but that seems reasonable.Other changes
Updated a test to work with the new behavior (the test assumed two dial slots for every dialed peer we want).
Tested
Verified that the dial ratio is respected by connecting a node with low limits to Alfajores and seeing that it only connects to the right number of dials peers, whether or not it's running the LES server.
Related issues