This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Improved gossip network topology #3239

Closed
ordian opened this issue Jun 14, 2021 · 4 comments · Fixed by #3270
Assignees
Labels
J0-enhancement An additional feature request. U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible.

Comments

@ordian
Member

ordian commented Jun 14, 2021

Problem

Right now, we have a fully connected graph of validators and a (very naïve) messaging overlay on top (gossip-support). The problem with the current overlay is that it is ephemeral (changes during a session) and doesn't provide any guarantees about the number of hops a message will take. This increases load on all nodes (gossip/network is our current CPU bottleneck) and potentially wastes bandwidth.

Solution

We could partition the sorted list of all authorities() into sqrt(len) groups of sqrt(len) size and form a matrix where each validator is connected to all validators in its row and column. This is similar to the topology proposed by web3 research, except that the groups are not parachain groups (because not all validators are parachain validators and the group size is small), but are formed randomly, e.g. via BABE randomness from an epoch. This would limit the number of gossip peers to 2 * sqrt(len) and ensure a diameter of 2.
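A minimal sketch of the grid described above, in Rust. Everything here is illustrative: the function names are hypothetical, validators are plain indices rather than authority keys, and the shuffle is driven by a toy splitmix64 generator standing in for BABE epoch randomness. It only shows how row and column neighbours fall out of a position in a square grid.

```rust
/// Toy deterministic generator (splitmix64). ASSUMPTION: this is a stand-in
/// for epoch randomness, not the BABE VRF output used on the relay chain.
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Fisher-Yates shuffle of the validator indices `0..len`, seeded by `seed`.
/// The modulo introduces a slight bias, which is acceptable for a sketch.
fn shuffled_indices(len: usize, seed: u64) -> Vec<usize> {
    let mut order: Vec<usize> = (0..len).collect();
    let mut state = seed;
    for i in (1..len).rev() {
        let j = (next_u64(&mut state) % (i as u64 + 1)) as usize;
        order.swap(i, j);
    }
    order
}

/// Smallest width `w` such that `w * w >= len`.
fn grid_width(len: usize) -> usize {
    let mut w = 0;
    while w * w < len {
        w += 1;
    }
    w
}

/// All validators that share a row or a column with `validator` when `order`
/// is laid out row by row in a grid of width `grid_width(order.len())`.
/// Returns an empty list if `validator` is not part of `order`.
fn grid_neighbors(order: &[usize], validator: usize) -> Vec<usize> {
    let width = grid_width(order.len());
    let pos = match order.iter().position(|&v| v == validator) {
        Some(p) => p,
        None => return Vec::new(),
    };
    let (row, col) = (pos / width, pos % width);
    let mut out: Vec<usize> = order
        .iter()
        .enumerate()
        .filter(|&(p, _)| p != pos && (p / width == row || p % width == col))
        .map(|(_, &v)| v)
        .collect();
    out.sort_unstable();
    out
}
```

For nine validators in identity order, `grid_neighbors(&order, 4)` gives `[1, 3, 5, 7]`. In a full square grid each validator has exactly 2 * (sqrt(len) - 1) neighbours, and any two validators that are not direct neighbours share one: the validator in the first one's row and the second one's column.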

In terms of API, we could bake the graph-construction logic into the gossip-support subsystem, as it already tracks session changes, and issue GossipPeerActivated/GossipPeerDeactivated to the gossip subsystems on every session change. These messages can include the PeerId and AuthorityDiscoveryId. We should also provide a way to map these two consistently in the presence of key rotations.
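One way the session-change notifications could look, as a hedged sketch only. The `GossipPeer` struct, the `GossipTopologyEvent` enum and the `on_session_change` function are hypothetical, and the identifiers are plain strings standing in for the real `PeerId` and `AuthorityDiscoveryId` types. The sketch also assumes that only the difference between the old and new neighbour sets is announced, which the proposal does not specify.

```rust
use std::collections::BTreeSet;

/// Hypothetical pairing of the two identities. Real code would use the
/// network `PeerId` and `AuthorityDiscoveryId` types instead of strings.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct GossipPeer {
    peer_id: String,
    authority_id: String,
}

/// Hypothetical event type named after the messages in the proposal.
#[derive(Debug, PartialEq, Eq)]
enum GossipTopologyEvent {
    GossipPeerActivated(GossipPeer),
    GossipPeerDeactivated(GossipPeer),
}

/// Compare the neighbour sets of two consecutive sessions and emit one event
/// per peer that left or joined: deactivations first, then activations.
fn on_session_change(
    old: &BTreeSet<GossipPeer>,
    new: &BTreeSet<GossipPeer>,
) -> Vec<GossipTopologyEvent> {
    let mut events = Vec::new();
    for peer in old.difference(new) {
        events.push(GossipTopologyEvent::GossipPeerDeactivated(peer.clone()));
    }
    for peer in new.difference(old) {
        events.push(GossipTopologyEvent::GossipPeerActivated(peer.clone()));
    }
    events
}
```

Because a peer is keyed on both identifiers here, a key rotation that changes the authority ID of an otherwise unchanged peer shows up as a deactivation followed by an activation. Whether that is the desired behaviour is exactly the mapping question raised above.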

@ordian ordian added J0-enhancement An additional feature request. U2-some_time_soon Issue is worth doing soon. labels Jun 14, 2021
@ordian ordian self-assigned this Jun 14, 2021
@ordian ordian changed the title Improved validator network topology Improved gossip network topology Jun 14, 2021
@ordian ordian added U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible. and removed U2-some_time_soon Issue is worth doing soon. labels Jun 14, 2021
@burdges
Contributor

burdges commented Jun 16, 2021

All our old grid discussion should be considered obsolete. We later decided availability needs direct connections.

I do think doing a gossip topology helps, and a grid works nicely, maybe SlimFly works too.

Approval assignments are quite weird gossip messages. They're signed by a VRF, not a regular signature. We'd maybe use a ring VRF one day. I've previously claimed approval assignments should be sent to random validators, not obey some fixed topology. I'm actually not sure how true this is. In fact, a grid or SlimFly sounds far better for security than any reputation-based priority system! Yet any topology imposed upon approval assignments technically changes our analysis, so maybe I'll develop stronger opinions in future. I forget how much a randomized gossip costs compared with a grid, etc., but likely only the originator benefits from randomized gossip, so it need not be any worse than adding one hop. I'm happy doing a grid even for approval assignments short term, but I wanted to highlight that approval assignments are more subtle, especially if we ever feel ring VRFs help much.

@ordian
Member Author

ordian commented Jun 16, 2021

Thanks, the fact is that the grid will work better than what we currently have and is easily implementable (PR coming soon).
It will be used for bitfield distribution, small statement distribution and approval assignment for now (we can always add some random peers into the mix).

I feel that until we have ring VRFs, there is no point in not doing that.

@burdges
Contributor

burdges commented Jun 16, 2021

Cool. You'd use the relay chain's randomness from two epochs ago to define the grid placement?

@ordian ordian mentioned this issue Jun 16, 2021
3 tasks
@rphmeier
Contributor

rphmeier commented Jun 16, 2021

It's worth noting that the grid is constructed over the union of

  1. The last few sessions' parachain validators
  2. The current session's parachain validators
  3. The current session's total validator set

The intersection of 1 and 2 should be quite large.


3 participants