Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Increase default target-peer count to 80 #3005

Closed
wants to merge 3 commits into from
Closed

[Merged by Bors] - Increase default target-peer count to 80 #3005

wants to merge 3 commits into from

Conversation

AgeManning
Copy link
Member

@AgeManning AgeManning commented Feb 7, 2022

Increase the default peer count from 50 to 80

@AgeManning
Copy link
Member Author

Potentially lets hold off on this. It may be more efficient to increase the default peer count that this PR.

Further analysis will steer the course of action here.

@paulhauner paulhauner added the work-in-progress PR is a work-in-progress label Feb 8, 2022
@AgeManning
Copy link
Member Author

I have been spending a bit of time trying to quantify the bandwidth use and performance based on some of our default parameters.

This is quite difficult to do, because there is a lot of random fluctuation in bandwidth between each beacon node restart. To get good statistical results, we need to adjust the parameters and let them run over a few days for a number of nodes to get a single reliable data point.

The fluctuation comes because each restart gives a random selection of nodes and a random number of gossipsub mesh peers that lie within the min/max counts for each topic. Its only after a long period of time, once our pruning system has a chance to steadly select out all the peers we want that we can properly measure the effect of these parameters. The effect will also differ (I expect the variance of the change to grow) based on validator count. Nodes that are subscribed to all subnets should have a greater benefit to fine-tuned parameters than nodes without validators at all. All this is to say, is that choosing parameters requires a lot of data points which will take a lot of time.

We can make educated guesses however, with fewer samples based on what theoretically we expect to occur.

The parameters we want to tune are:

  • Target peer count
  • Network Load

In theory, the bandwidth should be directly proportional to the number of mesh peers in gossipsub. Gossipsub is by far the most network heavy protocol and data we send/recv is mostly due to mesh peers. The mesh peer count is solely governed by network-load and not the peer count. So in theory we can increase our target peer count, with minimal increase in bandwidth. There will be an increase in the overhead of IWANT/IHAVE and flood publishing of messages to a larger peer set along with some extra RPC overhead such as STATUS and an increase in probability of getting out-of-sync peers which we have to send blocks to.

The benefit of a higher peer count however is that we have a greater selection of peers to collect and store and spread across all our subnets. Collecting peers across all subnets prevents us from having to perform any discovery requests and gives us a greater chance of not missing any attestations due to lack of peers on a subnet. It also allows us to find more useful peers faster because our target overhead of peer count (10%) is larger.

With all this aside I've done some simple tests with a single node (for consistency). This node is subscribed to all subnets. I found no statistical difference (there was a large variance in the bandwidth) between a peer count of 50 and 80. I saw a 20% increase in bandwidth between a network load of 3 to 4 and a 10% decrease from 3 to 2. I noticed no significant difference in block delay times.

These seem consistent with the theory. So given all of the above, and the countless hours I've spent looking at Lighthouse network metrics, I'd propose the following:

  • Leave network-load as is, untouched
  • Change our default peer count to 80 <- I think we'll gain a lot with minimal downside.

Of course if someone wants to do some more rigourous analysis, i'm all for it. What we think: @divagant-martian @pawanjay176 @michaelsproul @paulhauner ?

@AgeManning AgeManning added ready-for-review The code is ready for review and removed work-in-progress PR is a work-in-progress labels Feb 28, 2022
@divagant-martian
Copy link
Collaborator

Metrics seem fine, the only difference I see is the memory usage, by around +500M. But it looks stable

@michaelsproul
Copy link
Member

Can we change the title and description of this PR to accurately reflect what it does? i.e. increase default peer count to 80

@AgeManning AgeManning changed the title Revert network changes Increase default target-peer count to 80 Mar 1, 2022
Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I edited the PR description too (as it becomes the commit message)

@AgeManning
Copy link
Member Author

bors r+

@AgeManning AgeManning added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Mar 2, 2022
bors bot pushed a commit that referenced this pull request Mar 2, 2022
Increase the default peer count from 50 to 80
@bors bors bot changed the title Increase default target-peer count to 80 [Merged by Bors] - Increase default target-peer count to 80 Mar 2, 2022
@bors bors bot closed this Mar 2, 2022
pawanjay176 added a commit to pawanjay176/lighthouse that referenced this pull request Mar 2, 2022
commit 6370edd
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 16:31:18 2022 +1100

    Update testing/execution_engine_integration/src/build_geth.rs

    Co-authored-by: Michael Sproul <[email protected]>

commit f5a1dee
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 16:24:03 2022 +1100

    Fix proto-array bug

commit 4f7610f
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 16:23:52 2022 +1100

    Fix payload tests

commit cd7a0f5
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 15:22:54 2022 +1100

    Implement consensus-specs/2844

commit 5313364
Merge: d5e84a5 668115a
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 14:23:19 2022 +1100

    Merge branch 'unstable' into prev-randao-rename

commit d5e84a5
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 13:31:45 2022 +1100

    Fix windows test errors

commit 668115a
Author: Akihito Nakano <[email protected]>
Date:   Wed Mar 2 01:05:08 2022 +0000

    Rename Eth1/Eth2 in documents (sigp#3021)

    ## Issue Addressed

    Resolves sigp#3019

    ## Proposed Changes

    - Eth2 Eth2.0 Ethereum 2.0 -> Ethereum consensus
    - Eth2 network -> consensus layer
    - Ethereum 2.0 specification -> Ethereum proof-of-stake consensus specification
    - Eth2 deposit contract -> Staking deposit contract
    - Eth1 -> execution client

    ## Additional Info

    The description needs to be updated by someone who has permission to do. 📝

    <img width="487" alt="image" src="https://user-images.githubusercontent.com/1885716/153995211-816d9561-751e-4810-abb9-83d979379783.png">

commit e34524b
Author: Age Manning <[email protected]>
Date:   Wed Mar 2 01:05:07 2022 +0000

    Increase default target-peer count to 80 (sigp#3005)

    Increase the default peer count from 50 to 80

commit db2e3ae
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 11:08:40 2022 +1100

    Fix yaml indenting

commit a2f99e5
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 11:05:57 2022 +1100

    Specify go version in github actions

commit 0c5e9c6
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 11:04:15 2022 +1100

    Switch EE tests to be a binary

commit 7f07fa5
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 10:22:32 2022 +1100

    Print output of failed make command

commit fa20696
Author: Paul Hauner <[email protected]>
Date:   Wed Mar 2 10:01:00 2022 +1100

    Bump spec tests version

commit c2d7677
Author: Paul Hauner <[email protected]>
Date:   Sat Feb 26 10:17:14 2022 +1100

    Add auth port

commit 540337a
Author: Paul Hauner <[email protected]>
Date:   Sat Feb 26 10:10:06 2022 +1100

    Use `merge-kiln-v2` branch for geth

commit 1a3d32e
Author: Paul Hauner <[email protected]>
Date:   Fri Feb 25 15:08:38 2022 +1100

    Revert "Add serde "random" alias"

    This reverts commit bb02c2f.

commit 63376bd
Author: Paul Hauner <[email protected]>
Date:   Fri Feb 25 14:59:26 2022 +1100

    Add serde "random" alias

commit 375aa32
Author: Paul Hauner <[email protected]>
Date:   Fri Feb 25 10:09:38 2022 +1100

    Rename random to prev_randao

commit b6493d5
Author: Paul Hauner <[email protected]>
Date:   Tue Mar 1 22:56:47 2022 +0000

    Enforce Optimistic Sync Conditions & CLI Tests (v2) (sigp#3050)

    ## Description

    This PR adds a single, trivial commit (f5d2b27) atop sigp#2986 to resolve a tests compile error. The original author (@ethDreamer) is AFK so I'm getting this one merged ☺️

    Please see sigp#2986 for more information about the other, significant changes in this PR.

    Co-authored-by: Mark Mackey <[email protected]>
    Co-authored-by: ethDreamer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-for-merge This PR is ready to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants