From 075c675b79417505b7bbd9d48bba32ba328f718c Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Wed, 22 Jul 2020 00:20:42 -0400 Subject: [PATCH 01/15] Do a small write-up on collation-generation --- .../node/collators/collation-generation.md | 28 +++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/roadmap/implementers-guide/src/node/collators/collation-generation.md b/roadmap/implementers-guide/src/node/collators/collation-generation.md index 1c828b3b0f9c..81fd07f05c93 100644 --- a/roadmap/implementers-guide/src/node/collators/collation-generation.md +++ b/roadmap/implementers-guide/src/node/collators/collation-generation.md @@ -1,9 +1,33 @@ # Collation Generation -> TODO +The collation generation subsystem is executed on collator nodes and produces candidates to be distributed to validators. If configured to produce collations for a para, it produces collations and then feeds them to the collation distribution subsystem to be distributed to validators. ## Protocol +Input: None + +Output: CollationDistributionMessage + ## Functionality -## Jobs, if any +The process of generating a collation for a parachain is very parachain-specific. As such, the details of how to do so are left beyond the scope of this description. The subsystem should be implemented as an abstract wrapper, which is aware of this configuration: + +```rust +struct CollationGenerationConfig { + key: CollatorPair, + collation_producer: Fn(params) -> async (HeadData, Vec, PoV), +} +``` + +The configuration should be optional, to allow for the case where the node is not run with the capability to collate. + +On `ActiveLeavesUpdate`: + * If there is no collation generation config, ignore. + * Otherwise, for each `activated` head in the update: + * Determine if the para is scheduled or is next up on any occupied core by fetching the `availability_cores` Runtime API. + * Determine an occupied core assumption to make about the para. 
The simplest thing to do is to always assume that if the para occupies a core, that the candidate will become available. Further on, this might be determined based on bitfields seen or validator requests. + * Use the Runtime API subsystem to fetch the global validation data and local validation data. + * Construct validation function params based on validation data. + * Invoke the `collation_producer`. + * Construct a `CommittedCandidateReceipt` using the outputs of the `collation_producer` and signing with the `key`. + * Dispatch a `CollationDistributionMessage::DistributeCollation(key, receipt, pov)`. From 0e9dec187f35f36ca57b45f9333af486e9295d4c Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Wed, 22 Jul 2020 00:34:39 -0400 Subject: [PATCH 02/15] preamble to collator protocol --- roadmap/implementers-guide/src/SUMMARY.md | 2 +- .../node/collators/collation-distribution.md | 9 --------- .../src/node/collators/collator-protocol.md | 17 +++++++++++++++++ .../src/parachains-overview.md | 2 +- 4 files changed, 19 insertions(+), 11 deletions(-) delete mode 100644 roadmap/implementers-guide/src/node/collators/collation-distribution.md create mode 100644 roadmap/implementers-guide/src/node/collators/collator-protocol.md diff --git a/roadmap/implementers-guide/src/SUMMARY.md b/roadmap/implementers-guide/src/SUMMARY.md index 0a7fa21d2a31..86ddabe45ba7 100644 --- a/roadmap/implementers-guide/src/SUMMARY.md +++ b/roadmap/implementers-guide/src/SUMMARY.md @@ -29,7 +29,7 @@ - [Bitfield Signing](node/availability/bitfield-signing.md) - [Collators](node/collators/README.md) - [Collation Generation](node/collators/collation-generation.md) - - [Collation Distribution](node/collators/collation-distribution.md) + - [Collator Protocol](node/collators/collator-protocol.md) - [Validity](node/validity/README.md) - [Utility Subsystems](node/utility/README.md) - [Availability Store](node/utility/availability-store.md) diff --git 
a/roadmap/implementers-guide/src/node/collators/collation-distribution.md b/roadmap/implementers-guide/src/node/collators/collation-distribution.md deleted file mode 100644 index 0b24ce47ca56..000000000000 --- a/roadmap/implementers-guide/src/node/collators/collation-distribution.md +++ /dev/null @@ -1,9 +0,0 @@ -# Collation Distribution - -> TODO - -## Protocol - -## Functionality - -## Jobs, if any diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md new file mode 100644 index 000000000000..d0a1d614f6a7 --- /dev/null +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -0,0 +1,17 @@ +# Collator Protocol + +The Collator Protocol implements the network protocol by which collators and validators communicate. It is used by collators to distribute collations to validators and used by validators to accept collations by collators. + +Collator-to-Validator networking is more difficult than Validator-to-Validator networking because the set of possible collators for any given para is unbounded, unlike the validator set. Validator-to-Validator networking protocols can easily be implemented as gossip because the data can be bounded, and validators can authenticate each other by their `PeerId`s for the purposes of instantiating and accepting connections. + +Since, at least at the level of the para abstraction, the collator-set for any given para is unbounded, validators need to make sure that they are receiving connections from capable and honest collators and that their bandwidth and time are not being wasted by attackers. + +Validation of candidates is a heavy task, and furthermore, the [`PoV`][PoV] itself is a large piece of data. Empirically, `PoV`s are on the order of 10MB. 
+ +> TODO: note the incremental validation function Ximin proposes at https://github.com/paritytech/polkadot/issues/1348 + +## Protocol + +## Functionality + +[PoV]: ../../types/availability.md#proofofvalidity diff --git a/roadmap/implementers-guide/src/parachains-overview.md b/roadmap/implementers-guide/src/parachains-overview.md index 8eff66622529..71d966f6c017 100644 --- a/roadmap/implementers-guide/src/parachains-overview.md +++ b/roadmap/implementers-guide/src/parachains-overview.md @@ -18,7 +18,7 @@ Here is a description of the Inclusion Pipeline: the path a parachain block (or 1. Validators are selected and assigned to parachains by the Validator Assignment routine. 1. A collator produces the parachain block, which is known as a parachain candidate or candidate, along with a PoV for the candidate. -1. The collator forwards the candidate and PoV to validators assigned to the same parachain via the [Collation Distribution subsystem](node/collators/collation-distribution.md). +1. The collator forwards the candidate and PoV to validators assigned to the same parachain via the [Collator Protocol](node/collators/collator-protocol.md). 1. The validators assigned to a parachain at a given point in time participate in the [Candidate Backing subsystem](node/backing/candidate-backing.md) to validate candidates that were put forward for validation. Candidates which gather enough signed validity statements from validators are considered "backable". Their backing is the set of signed validity statements. 1. A relay-chain block author, selected by BABE, can note up to one (1) backable candidate for each parachain to include in the relay-chain block alongside its backing. A backable candidate once included in the relay-chain is considered backed in that fork of the relay-chain. 1. Once backed in the relay-chain, the parachain candidate is considered to be "pending availability". It is not considered to be included as part of the parachain until it is proven available. 
From 0017dbe240463870395101d90999b294b3970ca7 Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Wed, 22 Jul 2020 01:05:57 -0400 Subject: [PATCH 03/15] notes on protocol --- .../src/node/collators/collator-protocol.md | 18 +++++++++++++++++- .../src/types/overseer-protocol.md | 18 ++++++++++++++++++ 2 files changed, 35 insertions(+), 1 deletion(-) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index d0a1d614f6a7..5067f3e68e12 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -4,14 +4,30 @@ The Collator Protocol implements the network protocol by which collators and val Collator-to-Validator networking is more difficult than Validator-to-Validator networking because the set of possible collators for any given para is unbounded, unlike the validator set. Validator-to-Validator networking protocols can easily be implemented as gossip because the data can be bounded, and validators can authenticate each other by their `PeerId`s for the purposes of instantiating and accepting connections. -Since, at least at the level of the para abstraction, the collator-set for any given para is unbounded, validators need to make sure that they are receiving connections from capable and honest collators and that their bandwidth and time are not being wasted by attackers. +Since, at least at the level of the para abstraction, the collator-set for any given para is unbounded, validators need to make sure that they are receiving connections from capable and honest collators and that their bandwidth and time are not being wasted by attackers. Communicating across this trust-boundary is the most difficult part of this subsystem. Validation of candidates is a heavy task, and furthermore, the [`PoV`][PoV] itself is a large piece of data. Empirically, `PoV`s are on the order of 10MB. 
> TODO: note the incremental validation function Ximin proposes at https://github.com/paritytech/polkadot/issues/1348 +As this network protocol serves as a bridge between collators and validators, it communicates primarily with one subsystem on behalf of each. As a collator, this will receive messages from the [`CollationGeneration`][CG] subsystem. As a validator, this will communicate with the [`CandidateBacking`][CB] subsystem. + ## Protocol +Input: [`CollatorProtocolMessage`][CPM] + +Output: + - [`RuntimeApiMessage`][RAM] + - [`CandidateBackingMessage`][CBM]`::Second` + - [`NetworkBridgeMessage`][NBM] + ## Functionality [PoV]: ../../types/availability.md#proofofvalidity +[CPM]: ../../types/overseer-protocol.md#collatorprotocolmessage +[CG]: collation-generation.md +[CB]: ../backing/candidate-backing.md +[CBM]: ../../types/overseer-protocol.md#candidatebackingmesage +[RAM]: ../../types/overseer-protocol.md#runtimeapimessage +[NBM]: ../../types/overseer-protocol.md#networkbridgemessage + diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md index d5d5509dd461..8b3c8d55af89 100644 --- a/roadmap/implementers-guide/src/types/overseer-protocol.md +++ b/roadmap/implementers-guide/src/types/overseer-protocol.md @@ -110,6 +110,24 @@ enum CandidateSelectionMessage { } ``` +## Collator Protocol Message + +Messages received by the [Collator Protocol subsystem](../node/collators/collator-protocol.md) + +```rust +enum CollatorProtocolMessage { + /// Signal to the collator protocol that it should connect to validators with the expectation + /// of collating on the given para. This is only expected to be called once, early on, if at all, + /// and only by the Collation Generation subsystem. As such, it will overwrite the value of + /// the previous signal. + CollateOn(ParaId), + /// Provide a collation to distribute to validators. 
+ DistributeCollation(CommittedCandidateReceipt, PoV), + /// Report a collator as having provided an invalid collation. This should lead to disconnect + /// and blacklist of the collator. + ReportCollator(CollatorId), +``` + ## Network Bridge Message Messages received by the network bridge. This subsystem is invoked by others to manipulate access From 54cc7654479723fb45e72974053ffb3299550a5f Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Wed, 22 Jul 2020 01:29:37 -0400 Subject: [PATCH 04/15] collation-generation: point to collator protocol --- .../src/node/collators/collation-generation.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/roadmap/implementers-guide/src/node/collators/collation-generation.md b/roadmap/implementers-guide/src/node/collators/collation-generation.md index 81fd07f05c93..a182fbadd307 100644 --- a/roadmap/implementers-guide/src/node/collators/collation-generation.md +++ b/roadmap/implementers-guide/src/node/collators/collation-generation.md @@ -1,6 +1,6 @@ # Collation Generation -The collation generation subsystem is executed on collator nodes and produces candidates to be distributed to validators. If configured to produce collations for a para, it produces collations and then feeds them to the collation distribution subsystem to be distributed to validators. +The collation generation subsystem is executed on collator nodes and produces candidates to be distributed to validators. If configured to produce collations for a para, it produces collations and then feeds them to the [Collator Protocol][CP] subsystem, which handles the networking. ## Protocol @@ -30,4 +30,7 @@ On `ActiveLeavesUpdate`: * Construct validation function params based on validation data. * Invoke the `collation_producer`. * Construct a `CommittedCandidateReceipt` using the outputs of the `collation_producer` and signing with the `key`. - * Dispatch a `CollationDistributionMessage::DistributeCollation(key, receipt, pov)`. 
+ * Dispatch a [`CollatorProtocolMessage`][CPM]`::DistributeCollation(receipt, pov)`. + +[CP]: collator-protocol.md +[CPM]: ../../types/overseer-protocol.md#collatorprotocolmessage From 2b079abdb171673ceb9b3301919b947734adaf6b Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Wed, 22 Jul 2020 01:31:10 -0400 Subject: [PATCH 05/15] fix missing bracket --- roadmap/implementers-guide/src/types/overseer-protocol.md | 1 + 1 file changed, 1 insertion(+) diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md index 8b3c8d55af89..991aed8b5936 100644 --- a/roadmap/implementers-guide/src/types/overseer-protocol.md +++ b/roadmap/implementers-guide/src/types/overseer-protocol.md @@ -126,6 +126,7 @@ enum CollatorProtocolMessage { /// Report a collator as having provided an invalid collation. This should lead to disconnect /// and blacklist of the collator. ReportCollator(CollatorId), +} ``` ## Network Bridge Message From 31092a12cbca4ed9ac7c54eea330017a537db80c Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Wed, 22 Jul 2020 19:53:33 -0400 Subject: [PATCH 06/15] expand on collator protocol wire protocol --- .../src/node/collators/collator-protocol.md | 127 +++++++++++++++++- .../src/types/overseer-protocol.md | 2 + 2 files changed, 128 insertions(+), 1 deletion(-) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index 5067f3e68e12..9c9ab710b2e5 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -23,11 +23,136 @@ Output: ## Functionality +```rust +type RequestId = u64; + +/// A message to our guarded validator, when acting as a sentry node. +enum ToOurValidatorMessage { + /// Forward an advertised collation to our validator. 
+ AdvertisedCollation(Hash, CollatorId, ParaId), + /// A requested collation. `None` if the collator didn't provide it. + RequestedCollation(RequestId, Hash, Option<(CandidateReceipt, PoV)>), +} + +/// A message to our sentry node, when a validator. +enum ToOurSentryMessage { + /// Request a collation of the specific collator/validator pair via the + /// sentry. + RequestCollation(RequestId, Hash, CollatorId, ParaId), + /// Blacklist a collator and the peer representing it. + BlacklistCollator(CollatorId), + /// Note a good collation from a collator. + NoteGoodCollation(CollatorId), +} + +enum WireMessage { + /// A wire message to our validator. + ToOurValidator(ToOurValidatorMessage), + /// A wire message to our sentry. + ToOurSentry(ToOurSentryMessage), + + /// Advertise a collation to a validator. + AdvertiseCollation(Hash, CollatorId, ParaId), + /// Request the advertised collation at that relay-parent. + RequestCollation(RequestId, Hash, ParaId), + /// A requested collation. + Collation(RequestId, CandidateReceipt, PoV), +} +``` + +One of the main necessities of this protocol is to deal with the validator/sentry node duality. Validators aren't expected to expose their node to public connections and instead expose sentry nodes on their behalf. These sentry nodes serve as relays, with protocol-level validation being done to prevent spam. + +Since this protocol functions both for validators and collators, it is easiest to go through the protocol actions for each of them separately. + +Validators, Collators, and sentry nodes: +```dot process +digraph { + c1 [shape=MSquare, label="Collator 1"]; + c2 [shape=MSquare, label="Collator 2"]; + + s1 [label = "Sentry Node"]; + + v1 [shape=MSquare, label="Validator 1"]; + v2 [shape=MSquare, label="Validator 2"]; + + c1 -> s1 -> v1; + c2 -> s1; + c1 -> v2; +} +``` + +### Collators + +It is assumed that collators are only collating on a single parachain. Collations are generated by the [Collation Generation][CG] subsystem. 
We will keep up to one local collation per relay-parent, based on `DistributeCollation` messages. If the para is not scheduled or next up on any core, at the relay-parent, or the relay-parent isn't in the active-leaves set, we ignore the message as it must be invalid in that case - although this indicates a logic error elsewhere in the node. + +We keep track of the Para ID we are collating on as a collator. This starts as `None`, and is updated with each `CollateOn` message received. If the `ParaId` of a collation requested to be distributed does not match the one we expect, we ignore the message. + +As with most other subsystems, we track the active leaves set by following `ActiveLeavesUpdate` signals. + +For the purposes of actually distributing a collation, we need to be connected to the validators who are interested in collations on that `ParaId` at this point in time or sentry nodes that represent them. We assume that there is a discovery API for connecting to a set of validators. + +> TODO: design & expose the discovery API not just for connecting to such peers but also to determine which of our current peers are validators. + +As seen in the [Scheduler Module][SCH] of the runtime, validator groups are fixed for an entire session and their rotations across cores are predictable. Collators will want to do these things when attempting to distribute collations at a given relay-parent: + * Determine which core the para collated-on is assigned to. + * Determine the group on that core and the next group on that core. + * Issue a discovery request for the validators of the current group and the next group. + +Once connected to the relevant peers for the current group assigned to the core (transitively, the para), advertise the collation to any of them which advertise the relay-parent in their view (as provided by the [Network Bridge][NB]). If any respond with a request for the full collation, provide it. 
Upon receiving a view update from any of these peers which includes a relay-parent for which we have a collation that they will find relevant, advertise the collation to them if we haven't already. + +### Validators and Sentry Nodes + +Validators are not required to run with sentry nodes, so the code here needs to handle both the case where we run with and without. + +One of the main challenges of running with a sentry node is making sure that the state of the sentry is synchronized with the state of the validator node. We use `View` updates for that purpose. Sentry nodes are responsible for forwarding advertisements, requests, and responses to peers. + +```dot process +digraph G { + label = "Providing Collation via Sentry Node"; + labelloc = "t"; + rankdir = LR; + + subgraph cluster_collator { + rank = min; + label = "Collator"; + graph[style = border, rank = min]; + + c1, c2 [label = ""]; + } + + subgraph cluster_sentry { + rank = same; + label = "Sentry"; + graph[style = border]; + + s1, s2, s3, s4 [rank = same, label = ""]; + } + + subgraph cluster_validator { + rank = same; + label = "Validator"; + graph[style = border]; + + v1, v2 [label = ""]; + } + + c1 -> s1 -> v1 [label = "Advertise"]; + + v1 -> s2 -> c2 [label = "Request"]; + + c2 -> s3 [xlabel = "Provide"]; + s3 -> v2 [label = "Provide"]; + + v2 -> s4 [xlabel = "Note Good/Bad"]; +} +``` + [PoV]: ../../types/availability.md#proofofvalidity [CPM]: ../../types/overseer-protocol.md#collatorprotocolmessage [CG]: collation-generation.md [CB]: ../backing/candidate-backing.md +[NB]: ../utility/network-bridge.md [CBM]: ../../types/overseer-protocol.md#candidatebackingmesage [RAM]: ../../types/overseer-protocol.md#runtimeapimessage [NBM]: ../../types/overseer-protocol.md#networkbridgemessage - +[SCH]: ../../runtime/scheduler.md diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md index 6f304a54472a..b48828879f25 100644 --- 
a/roadmap/implementers-guide/src/types/overseer-protocol.md +++ b/roadmap/implementers-guide/src/types/overseer-protocol.md @@ -131,6 +131,8 @@ enum CollatorProtocolMessage { /// of collating on the given para. This is only expected to be called once, early on, if at all, /// and only by the Collation Generation subsystem. As such, it will overwrite the value of /// the previous signal. + /// + /// This should be sent before any `DistributeCollation` message. CollateOn(ParaId), /// Provide a collation to distribute to validators. DistributeCollation(CommittedCandidateReceipt, PoV), From 4bbc358eed652d640caf93a98de0e51251acc015 Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Thu, 23 Jul 2020 12:06:34 -0400 Subject: [PATCH 07/15] add a couple more sentences --- .../src/node/collators/collator-protocol.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index 9c9ab710b2e5..11213eb7c861 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -147,6 +147,10 @@ digraph G { } ``` +At any step in the above diagram, the sentry node will be doing validation of the network statements sent by the collator and can report or disconnect the collator. + +The protocol tracks advertisements received and the source of the advertisement. The advertisement source is either `Direct(PeerId)` or `Sentry(PeerId)`. We accept one advertisement per collator per source per relay-parent. 
+ [PoV]: ../../types/availability.md#proofofvalidity [CPM]: ../../types/overseer-protocol.md#collatorprotocolmessage [CG]: collation-generation.md From 17d999c1b723ce9a9e8d2843549b538783b18df4 Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Thu, 23 Jul 2020 20:05:02 -0400 Subject: [PATCH 08/15] expand on requests some more --- .../src/node/collators/collator-protocol.md | 94 ++++++++++++++++++- 1 file changed, 91 insertions(+), 3 deletions(-) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index 11213eb7c861..9a07151ca4b9 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -18,7 +18,6 @@ Input: [`CollatorProtocolMessage`][CPM] Output: - [`RuntimeApiMessage`][RAM] - - [`CandidateBackingMessage`][CBM]`::Second` - [`NetworkBridgeMessage`][NBM] ## Functionality @@ -51,8 +50,11 @@ enum WireMessage { /// A wire message to our sentry. ToOurSentry(ToOurSentryMessage), - /// Advertise a collation to a validator. - AdvertiseCollation(Hash, CollatorId, ParaId), + /// Declare the intent to advertise collations under a collator ID. + Declare(CollatorId), + /// Advertise a collation to a validator. Can only be sent once the peer has declared + /// that they are a collator with given ID. + AdvertiseCollation(Hash, ParaId), /// Request the advertised collation at that relay-parent. RequestCollation(RequestId, Hash, ParaId), /// A requested collation. @@ -151,6 +153,92 @@ At any step in the above diagram, the sentry node will be doing validation of th The protocol tracks advertisements received and the source of the advertisement. The advertisement source is either `Direct(PeerId)` or `Sentry(PeerId)`. We accept one advertisement per collator per source per relay-parent. 
+We use the `NetworkBridgeUpdate::OurViewChange` to track which heads we consider in our active leaves set and have communicated to peers. We ensure we keep records for all active leaves, mutating based on incoming `ActiveLeavesUpdate`s. The records have this form: + +```rust +struct CollationAdvertisements { + received_advertisements: Set<(Collator, Source)>, + advertisements: Map>>, +} +``` + +We also keep a record of what we are fetching for each leaf/relay-parent: + +```rust +struct CollationFetch { + unserved_fetch: Map, + fetch_inflight: Map, + fetched_candidates: Map, +} + +struct FetchMetadata { + Collator, + Source, + ParaId, +} + +enum FetchDest { + Subsystem(fn(CandidateReceipt, PoV) -> AllMessages), + Validator(RequestId, PeerId), +} +``` + +We also keep a record per-peer: + +```rust +struct PeerData { + role: ObservedRole, + live_requests: Map, // relay-parent the request is under. + collator_id: Option, + view: View, +} +``` + +On receiving a `NetworkBridgeUpdate::PeerViewChange`: + * If the peer is our guarded validator, for each new leaf in their view that is also in our view, send a `ToOurValidator::AdvertisedCollation` for each advertisement with source `Direct` to each peer who is our validator. + +On receiving a `NetworkBridgeUpdate::PeerDisconnected`: + * Remove all received advertisements with source being this peer. If this peer is one of our sentry nodes, this includes advertisements with the peer being the sentry. + * Call `request_timed_out` for each `(request_id, hash)` in the peer's `live_requests` map. + +On receiving a `WireMessage::AdvertiseCollation(relay-parent, collator, para)`: + * `receive_advertisement(relay_parent, Direct(sender), collator, para)` + +`WireMessage::RequestCollation` handling logic is addressed in the section on the collator side of the protocol. + +On receiving a `WireMessage::Collation(request_id, candidate_receipt, pov)`: + * If the sender doesn't have `request_id` in its `PeerData`: report and ignore. 
Remove the `request_id` and take the relay-parent associated. + * `receive_collation(request_id, relay-parent, candidate_receipt, pov)` + +On receiving a `ToOurSentryMessage::RequestCollation(request_id, relay_parent, collator_id, para_id)`: + * If the sender is not our guarded validator, report & ignore. + * If we have no record for that collator and para under the relay parent, respond with `ToOurValidatorMessage::RequestedCollation(request_id, hash, None)` and return. + * Otherwise, generate a new request ID r and issue a `WireMessage::RequestCollation(r, hash, para_id)` to the source. Note the request as in-flight with `FetchDest::Validator(sender)` and note the request-id under the peer data. + * Start a time-out for the request, which will call `request_timed_out(r, relay_parent)` if too long is taken. + +On receiving `ToOurSentryMessage::BlacklistCollator(collator)`: + * Report & ignore unless coming from our validator + * Issue necessary reputational changes. If any peer advertises as the given collator, disconnect the peer and ignore any advertisements from that collator from that point onwards. + +On receiving `ToOurSentryMessage::NoteGoodCollation(collator)`: + * Report & ignore unless coming from our validator. + * Issue positive reputational change for all peers advertising as the collator. + +`receive_advertisement(relay_parent, source, collator, para)`: + * If we have no record for the relay-parent, report and ignore. + * If we have already received an advertisement from this collator from the source, report and ignore. + * Make a record in `received_advertisements` and `advertisements` for the relay-parent. + * If `unserved_fetch` contains the para ID, attempt to initiate a new fetch request. + +`receive_collation(request_id, relay_parent, candidate_receipt, pov)`: + * If there is not an entry in `fetch_inflight` for the request ID under that relay-parent, report sender & ignore. 
+ * Check that the receipt's collator and para_id match the metadata of the fetch. + * Notify each `FetchDest` of the collation: for `FetchDest::Subsystem`, this sends a message to the subsystem. For `FetchDest::Validator(request_id, peer_id)`, this sends a `ToOurValidator::RequestedCollation(request_id, relay_parent, relay_parent, Some((candidate_receipt, pov)))` to the `peer_id`. + +`request_timed_out(request_id, relay_parent)`: + * Remove the request from tracking under `relay_parent` and add the `ParaId` back to `unserved_fetch` with all `FetchDest::Subsystem` but removing all `FetchDest::Validator`. For all `FetchDest::Validator`, send the validator a `ToValidatorMessage::RequestedCollation(request_id, relay_parent, None)`. + * Attempt to initiate a new fetch request, if we restored an entry in `unserved_fetch`. + [PoV]: ../../types/availability.md#proofofvalidity [CPM]: ../../types/overseer-protocol.md#collatorprotocolmessage [CG]: collation-generation.md From 1d529d5312c2b24d00c568783763fa8744596184 Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Thu, 23 Jul 2020 20:47:39 -0400 Subject: [PATCH 09/15] go higher level --- .../src/node/collators/collator-protocol.md | 74 +++++++------------ .../src/types/overseer-protocol.md | 6 +- 2 files changed, 32 insertions(+), 48 deletions(-) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index 9a07151ca4b9..bfc8c5ae5a5b 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -151,8 +151,26 @@ digraph G { At any step in the above diagram, the sentry node will be doing validation of the network statements sent by the collator and can report or disconnect the collator. +When peers connect to us, they can `Declare` that they represent a collator with given public key. 
Once they've declared that, they can begin to send advertisements of collations. The peers should not send us any advertisements for collations that are on a relay-parent outside of our view. + The protocol tracks advertisements received and the source of the advertisement. The advertisement source is either `Direct(PeerId)` or `Sentry(PeerId)`. We accept one advertisement per collator per source per relay-parent. +If we're a sentry node, we relay advertisements to our guarded validator, but only those which reference a relay-parent in both our and our validator's view. This means that when we get a view update from our validator, we may need to forward it advertisements. + +If we're a validator, we will receive advertisements from all of our sentries. Although it's not expected, we may also receive advertisements directly from connected peers. This may occur when we are connected to a collator as a reserved peer. + +As a validator, we will handle requests from other subsystems to fetch a collation on a specific `ParaId` and relay-parent. These requests are made with the [`CollatorProtocolMessage`][CPM]`::FetchCollation`. To do so, we need to first check if we have already gathered a collation on that `ParaId` and relay-parent. If not, we need to select one of the advertisements and issue a request for it. If we've already issued a request, we shouldn't issue another one until the first has returned. + +The type of request we issue depends on the source of the advertisement. The advertisement may have a `Direct` source, in which case we issue a `WireMessage::RequestCollation`, or it may have a `Sentry` source, in which case we issue a `WireMessage::ToOurSentry::RequestCollation`. If the request times out, we need to note the collator as being unreliable and reduce its priority relative to other collators. And then make another request - until we get a response. 
+ +If we receive a `ToOurSentry::RequestCollation` as a sentry, we need to act as a proxy and issue a request to the peer who made the particular advertisement being referenced. If it times out or the peer disconnects or sends invalid data, we need to respond to our validator with `ToOurValidator::RequestedCollation(..., None)`. If the request comes from a peer who is not our validator, we need to report and disconnect the peer, as this is a clear breach of protocol. + +As a validator, once the collation has been fetched some other subsystem will inspect and do deeper validation of the collation. The subsystem will report to this subsystem with a [`CollatorProtocolMessage`][CPM]`::ReportCollator` or `NoteGoodCollation` message. In that case, if we are connected directly to the collator, we apply a cost to the `PeerId` associated with the collator. Otherwise, if we're connected to the collator via a sentry node, we issue a `ToOurSentryMesage` of the corresponding type. As the recipient of such a message on a sentry node, we carry out the requisite action. When handling a report that a collator is bad, we'd want to cancel all requests to that collator. + +### Validator and Sentry nodes: Practicalities + +> Note: everything below this point is tentative and subject to change. + We use the `NetworkBridgeUpdate::OurViewChange` to track which heads we consider in our active leaves set and have communicated to peers. We ensure we keep records for all active leaves, mutating based on incoming `ActiveLeavesUpdate`s. 
The records have this form: ```rust @@ -166,11 +184,18 @@ We also keep a record of what we are fetching for each leaf/relay-parent: ```rust struct CollationFetch { - unserved_fetch: Map, + // Fetch requests from subsystems + fetch_requests: Map, fetch_inflight: Map, fetched_candidates: Map, } +enum FetchState { + Pending([fn(CandidateReceipt, PoV)]), + Inflight(RequestId), + Completed(Collator, Source), +} + struct FetchMetadata { Collator, Source, @@ -178,7 +203,7 @@ struct FetchMetadata { } enum FetchDest { - Subsystem(fn(CandidateReceipt, PoV) -> AllMessages), + Subsystem(fn(CandidateReceipt, PoV)), Validator(RequestId, PeerId), } ``` @@ -194,51 +219,6 @@ struct PeerData { } ``` -On receiving a `NetworkBridgeUpdate::PeerViewChange`: - * If the peer is our guarded validator, for each new leaf in their view that is also in our view, send a `ToOurValidator::AdvertisedCollation` for each advertisement with source `Direct` to each peer who is our validator. - -On receiving a `NetworkBridgeUpdate::PeerDisconnected`: - * Remove all received advertisements with source being this peer. If this peer is one of our sentry nodes, this includes advertisements with the peer being the sentry. - * Call `request_timed_out` for each `(request_id, hash)` in the peer's `live_requests` map. - -On receiving a `WireMessage::AdvertiseCollation(relay-parent, collator, para)`: - * `receive_advertisement(relay_parent, Direct(sender), collator, para)` - -`WireMessage::RequestCollation` handling logic is addressed in the section on the collator side of the protocol. - -On receiving a `WireMessage::Collation(request_id, candidate_receipt, pov)`: - * If the sender doesn't have `request_id` in its `PeerData`: report and ignore. Remove the `request_id` and take the relay-parent associated. 
- * `receive_collation(request_id, relay-parent, candidate_receipt, pov)` - -On receiving a `ToOurSentryMessage::RequestCollation(request_id, relay_parent, collator_id, para_id)`: - * If the sender is not our guarded validator, report & ignore. - * If we have no record for that collator and para under the relay parent, respond with `ToOurValidatorMessage::RequestedCollation(request_id, hash, None)` and return. - * Otherwise, generate a new request ID r and issue a `WireMessage::RequestCollation(r, hash, para_id)` to the source. Note the request as in-flight with `FetchDest::Validator(sender)` and note the request-id under the peer data. - * Start a time-out for the request, which will call `request_timed_out(r, relay_parent)` if too long is taken. - -On receiving `ToOurSentryMessage::BlacklistCollator(collator)`: - * Report & ignore unless coming from our validator - * Issue necessary reputational changes. If any peer advertises as the given collator, disconnect the peer and ignore any advertisements from that collator from that point onwards. - -On receiving `ToOurSentryMessage::NoteGoodCollation(collator)`: - * Report & ignore unless coming from our validator. - * Issue positive reputational change for all peers advertising as the collator. - -`receive_advertisement(relay_parent, source, collator, para)`: - * If we have no record for the relay-parent, report and ignore. - * If we have already received an advertisement from this collator from the source, report and ignore. - * Make a record in `received_advertisements` and `advertisements` for the relay-parent. - * If `unserved_fetch` contains the para ID, attempt to initiate a new fetch request. - -`receive_collation(request_id, relay_parent, candidate_receipt, pov)`: - * If there is not an entry in `fetch_inflight` for the request ID under that relay-parent, report sender & ignore. - * Check that the receipt's collator and para_id match the metadata of the fetch. 
- * Notify each `FetchDest` of the collation: for `FetchDest::Subsystem`, this sends a message to the subsystem. For `FetchDest::Validator(request_id, peer_id)`, this sends a `ToOurValidator::RequestedCollation(request_id, relay_parent, relay_parent, Some((candidate_receipt, pov)))` to the `peer_id`. - -`request_timed_out(request_id, relay_parent)`: - * Remove the request from tracking under `relay_parent` and add the `ParaId` back to `unserved_fetch` with all `FetchDest::Subsystem` but removing all `FetchDest::Validator`. For all `FetchDest::Validator`, send the validator a `ToValidatorMessage::RequestedCollation(request_id, relay_parent, None)`. - * Attempt to initiate a new fetch request, if we restored an entry in `unserved_fetch`. - [PoV]: ../../types/availability.md#proofofvalidity [CPM]: ../../types/overseer-protocol.md#collatorprotocolmessage [CG]: collation-generation.md diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md index b48828879f25..3b425c1afa7e 100644 --- a/roadmap/implementers-guide/src/types/overseer-protocol.md +++ b/roadmap/implementers-guide/src/types/overseer-protocol.md @@ -135,10 +135,14 @@ enum CollatorProtocolMessage { /// This should be sent before any `DistributeCollation` message. CollateOn(ParaId), /// Provide a collation to distribute to validators. - DistributeCollation(CommittedCandidateReceipt, PoV), + DistributeCollation(CandidateReceipt, PoV), + /// Fetch a collation under the given relay-parent for the given ParaId. + FetchCollation(Hash, ParaId, ResponseChannel<(CandidateReceipt, PoV)>), /// Report a collator as having provided an invalid collation. This should lead to disconnect /// and blacklist of the collator. ReportCollator(CollatorId), + /// Note a collator as having provided a good collation. 
+ NoteGoodCollation(CollatorId), } ``` From 7700b25d80b2690bcd6115e61c4e412ed795915a Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Thu, 23 Jul 2020 20:59:22 -0400 Subject: [PATCH 10/15] network bridge: note peerset --- .../src/node/collators/collator-protocol.md | 2 ++ .../src/node/utility/network-bridge.md | 11 +++++++++-- .../src/types/overseer-protocol.md | 16 +++++++++++++--- 3 files changed, 24 insertions(+), 5 deletions(-) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index bfc8c5ae5a5b..d10759a273ff 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -22,6 +22,8 @@ Output: ## Functionality +This network protocol uses the `Collation` peer-set of the [`NetworkBridge`][NB]. + ```rust type RequestId = u64; diff --git a/roadmap/implementers-guide/src/node/utility/network-bridge.md b/roadmap/implementers-guide/src/node/utility/network-bridge.md index 09c7e081a6ab..766a7c4003b8 100644 --- a/roadmap/implementers-guide/src/node/utility/network-bridge.md +++ b/roadmap/implementers-guide/src/node/utility/network-bridge.md @@ -8,19 +8,24 @@ One other piece of shared state to track is peer reputation. When peers are foun So in short, this Subsystem acts as a bridge between an actual network component and a subsystem's protocol. +The other component of the network bridge is which peer-set to use. Different peer-sets can be connected for different purposes. The network bridge is not generic over peer-set, but instead exposes two peer-sets that event producers can attach to: `Validation` and `Collation`. More information can be found on the documentation of the [`NetworkBridgeMessage`][NBM]. 
+ ## Protocol -Input: [`NetworkBridgeMessage`](../../types/overseer-protocol.md#network-bridge-message) +Input: [`NetworkBridgeMessage`][NBM] Output: Varying, based on registered event producers. ## Functionality -Track a set of all Event Producers, each associated with a 4-byte protocol ID. +Track a set of all Event Producers, each associated with a 4-byte protocol ID and the `PeerSet` it is associated on. + There are two types of network messages this sends and receives: - ProtocolMessage(ProtocolId, Bytes) - ViewUpdate(View) +Each of these network messages is associated with a particular peer-set. If we are connected to the same peer on both peer-sets, we will receive two `ViewUpdate`s from them every time they change their view. + `ActiveLeavesUpdate`'s `activated` and `deactivated` lists determine the evolution of our local view over time. A `ViewUpdate` is issued to each connected peer after each update, and a `NetworkBridgeUpdate::OurViewChange` is issued for each registered event producer. On `RegisterEventProducer`: @@ -44,3 +49,5 @@ On `ReportPeer` message: On `SendMessage` message: - Issue a corresponding `ProtocolMessage` to each listed peer with given protocol ID and bytes. + +[NBM]: ../../types/overseer-protocol.md#network-bridge-message diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md index 3b425c1afa7e..2ff21dc40dee 100644 --- a/roadmap/implementers-guide/src/types/overseer-protocol.md +++ b/roadmap/implementers-guide/src/types/overseer-protocol.md @@ -152,14 +152,24 @@ Messages received by the network bridge. This subsystem is invoked by others to to the low-level networking code. ```rust +/// Peer-sets handled by the network bridge. +enum PeerSet { + /// The collation peer-set is used to distribute collations from collators to validators. 
+ Collation, + /// The validation peer-set is used to distribute information relevant to parachain + /// validation among validators. This may include nodes which are not validators, + /// as some protocols on this peer-set are expected to be gossip. + Validation, +} + enum NetworkBridgeMessage { /// Register an event producer with the network bridge. This should be done early and cannot /// be de-registered. - RegisterEventProducer(ProtocolId, Fn(NetworkBridgeEvent) -> AllMessages), + RegisterEventProducer(PeerSet, ProtocolId, Fn(NetworkBridgeEvent) -> AllMessages), /// Report a cost or benefit of a peer. Negative values are costs, positive are benefits. - ReportPeer(PeerId, cost_benefit: i32), + ReportPeer(PeerSet, PeerId, cost_benefit: i32), /// Send a message to one or more peers on the given protocol ID. - SendMessage([PeerId], ProtocolId, Bytes), + SendMessage(PeerSet, [PeerId], ProtocolId, Bytes), } ``` From 6eb751c70d8671bea49cc05ddfb1ec9906efccb1 Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Thu, 23 Jul 2020 21:01:01 -0400 Subject: [PATCH 11/15] note peer-set = validation for protocols --- .../src/node/availability/availability-distribution.md | 2 +- .../src/node/availability/bitfield-distribution.md | 2 +- .../implementers-guide/src/node/backing/pov-distribution.md | 4 ++-- .../src/node/backing/statement-distribution.md | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/roadmap/implementers-guide/src/node/availability/availability-distribution.md b/roadmap/implementers-guide/src/node/availability/availability-distribution.md index 008f3e91fbe7..3c530bde8450 100644 --- a/roadmap/implementers-guide/src/node/availability/availability-distribution.md +++ b/roadmap/implementers-guide/src/node/availability/availability-distribution.md @@ -6,7 +6,7 @@ After a candidate is backed, the availability of the PoV block must be confirmed ## Protocol -`ProtocolId`:`b"avad"` +`ProtocolId`:`b"avad"`: `PeerSet`: `Validation` Input: diff --git 
a/roadmap/implementers-guide/src/node/availability/bitfield-distribution.md b/roadmap/implementers-guide/src/node/availability/bitfield-distribution.md index 97a5c14be3da..56379e0e09f0 100644 --- a/roadmap/implementers-guide/src/node/availability/bitfield-distribution.md +++ b/roadmap/implementers-guide/src/node/availability/bitfield-distribution.md @@ -4,7 +4,7 @@ Validators vote on the availability of a backed candidate by issuing signed bitf ## Protocol -`ProtocolId`: `b"bitd"` +`ProtocolId`: `b"bitd"`: `PeerSet`: `Validation` Input: [`BitfieldDistributionMessage`](../../types/overseer-protocol.md#bitfield-distribution-message) Output: diff --git a/roadmap/implementers-guide/src/node/backing/pov-distribution.md b/roadmap/implementers-guide/src/node/backing/pov-distribution.md index 4215386d2644..486ba96fccb0 100644 --- a/roadmap/implementers-guide/src/node/backing/pov-distribution.md +++ b/roadmap/implementers-guide/src/node/backing/pov-distribution.md @@ -4,7 +4,7 @@ This subsystem is responsible for distributing PoV blocks. For now, unified with ## Protocol -`ProtocolId`: `b"povd"` +`ProtocolId`: `b"povd"`, `PeerSet`: `Validation` Input: [`PoVDistributionMessage`](../../types/overseer-protocol.md#pov-distribution-message) @@ -18,7 +18,7 @@ Output: ## Functionality -This network protocol is responsible for distributing [`PoV`s](../../types/availability.md#proof-of-validity) by gossip. Since PoVs are heavy in practice, gossip is far from the most efficient way to distribute them. In the future, this should be replaced by a better network protocol that finds validators who have validated the block and connects to them directly. This protocol is descrbied +This network protocol is responsible for distributing [`PoV`s](../../types/availability.md#proof-of-validity) by gossip. Since PoVs are heavy in practice, gossip is far from the most efficient way to distribute them. 
In the future, this should be replaced by a better network protocol that finds validators who have validated the block and connects to them directly. This protocol is described in terms of "us" and our peers, with the understanding that this is the procedure that any honest node will run. It has the following goals: - We never have to buffer an unbounded amount of data diff --git a/roadmap/implementers-guide/src/node/backing/statement-distribution.md b/roadmap/implementers-guide/src/node/backing/statement-distribution.md index d05c68f7af70..b9a8914a9963 100644 --- a/roadmap/implementers-guide/src/node/backing/statement-distribution.md +++ b/roadmap/implementers-guide/src/node/backing/statement-distribution.md @@ -4,7 +4,7 @@ The Statement Distribution Subsystem is responsible for distributing statements ## Protocol -`ProtocolId`: `b"stmd"` +`ProtocolId`: `b"stmd"`, `PeerSet`: `Validation` Input: From a4c85e73b45860cf7d411b55c85806115505cd78 Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Thu, 23 Jul 2020 21:08:19 -0400 Subject: [PATCH 12/15] add `ConnectToValidators` message --- .../implementers-guide/src/node/utility/network-bridge.md | 7 +++++++ roadmap/implementers-guide/src/types/overseer-protocol.md | 5 +++++ 2 files changed, 12 insertions(+) diff --git a/roadmap/implementers-guide/src/node/utility/network-bridge.md b/roadmap/implementers-guide/src/node/utility/network-bridge.md index 766a7c4003b8..bd4a6a3d75a0 100644 --- a/roadmap/implementers-guide/src/node/utility/network-bridge.md +++ b/roadmap/implementers-guide/src/node/utility/network-bridge.md @@ -51,3 +51,10 @@ On `SendMessage` message: - Issue a corresponding `ProtocolMessage` to each listed peer with given protocol ID and bytes. [NBM]: ../../types/overseer-protocol.md#network-bridge-message + +On `ConnectToValidators` message: + +- Determine the DHT keys to use for each validator based on the relay-chain state and Runtime API.
+- Recover the Peer IDs of the validators from the DHT. There may be more than one peer ID per validator. +- Accumulate all `(ValidatorId, PeerId)` pairs and send on the response channel. +- Feed all Peer IDs to the discovery utility the underlying network provides. diff --git a/roadmap/implementers-guide/src/types/overseer-protocol.md b/roadmap/implementers-guide/src/types/overseer-protocol.md index 2ff21dc40dee..6e8988acfbb3 100644 --- a/roadmap/implementers-guide/src/types/overseer-protocol.md +++ b/roadmap/implementers-guide/src/types/overseer-protocol.md @@ -170,6 +170,11 @@ enum NetworkBridgeMessage { ReportPeer(PeerSet, PeerId, cost_benefit: i32), /// Send a message to one or more peers on the given protocol ID. SendMessage(PeerSet, [PeerId], ProtocolId, Bytes), + /// Connect to peers who represent the given `ValidatorId`s at the given relay-parent. + /// + /// Also accepts a response channel by which the issuer can learn the `PeerId`s of those + /// validators. + ConnectToValidators(PeerSet, [ValidatorId], ResponseChannel<[(ValidatorId, PeerId)]>), } ``` From 8535f73c2adb900d864cb64cc91842608166594d Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Thu, 23 Jul 2020 21:09:03 -0400 Subject: [PATCH 13/15] use ConnectToValidators in collator protocol --- .../implementers-guide/src/node/collators/collator-protocol.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index d10759a273ff..3fb250f0bfb0 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -100,7 +100,7 @@ For the purposes of actually distributing a collation, we need to be connected t As seen in the [Scheduler Module][SCH] of the runtime, validator groups are fixed for an entire session and their rotations across cores are predictable.
Collators will want to do these things when attempting to distribute collations at a given relay-parent: * Determine which core the para collated-on is assigned to. * Determine the group on that core and the next group on that core. - * Issue a discovery request for the validators of the current group and the next group. + * Issue a discovery request for the validators of the current group and the next group with [`NetworkBridgeMessage`][NBM]`::ConnectToValidators`. Once connected to the relevant peers for the current group assigned to the core (transitively, the para), advertise the collation to any of them which advertise the relay-parent in their view (as provided by the [Network Bridge][NB]). If any respond with a request for the full collation, provide it. Upon receiving a view update from any of these peers which includes a relay-parent for which we have a collation that they will find relevant, advertise the collation to them if we haven't already. From f08fe27e9ba7af4f9926fb3c826a1305a88147b7 Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Tue, 28 Jul 2020 11:18:25 -0400 Subject: [PATCH 14/15] typo --- .../implementers-guide/src/node/collators/collator-protocol.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index 3fb250f0bfb0..c52fb11be0c2 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -167,7 +167,7 @@ The type of request we issue depends on the source of the advertisement. The adv If we receive a `ToOurSentry::RequestCollation` as a sentry, we need to act as a proxy and issue a request to the peer who made the particular advertisement being referenced.
If it times out or the peer disconnects or sends invalid data, we need to respond to our validator with `ToOurValidator::RequestedCollation(..., None)`. If the request comes from a peer who is not our validator, we need to report and disconnect the peer, as this is a clear breach of protocol. -As a validator, once the collation has been fetched some other subsystem will inspect and do deeper validation of the collation. The subsystem will report to this subsystem with a [`CollatorProtocolMessage`][CPM]`::ReportCollator` or `NoteGoodCollation` message. In that case, if we are connected directly to the collator, we apply a cost to the `PeerId` associated with the collator. Otherwise, if we're connected to the collator via a sentry node, we issue a `ToOurSentryMesage` of the corresponding type. As the recipient of such a message on a sentry node, we carry out the requisite action. When handling a report that a collator is bad, we'd want to cancel all requests to that collator. +As a validator, once the collation has been fetched some other subsystem will inspect and do deeper validation of the collation. The subsystem will report to this subsystem with a [`CollatorProtocolMessage`][CPM]`::ReportCollator` or `NoteGoodCollation` message. In that case, if we are connected directly to the collator, we apply a cost to the `PeerId` associated with the collator. Otherwise, if we're connected to the collator via a sentry node, we issue a `ToOurSentryMessage` of the corresponding type. As the recipient of such a message on a sentry node, we carry out the requisite action. When handling a report that a collator is bad, we'd want to cancel all requests to that collator. 
### Validator and Sentry nodes: Practicalities From 9880f43e469817afac5f51439f9e65f60728c15a Mon Sep 17 00:00:00 2001 From: Robert Habermeier Date: Thu, 30 Jul 2020 14:08:58 -0400 Subject: [PATCH 15/15] remove references to sentry nodes --- .../src/node/collators/collator-protocol.md | 127 ++---------------- 1 file changed, 14 insertions(+), 113 deletions(-) diff --git a/roadmap/implementers-guide/src/node/collators/collator-protocol.md b/roadmap/implementers-guide/src/node/collators/collator-protocol.md index c52fb11be0c2..526aab31aec8 100644 --- a/roadmap/implementers-guide/src/node/collators/collator-protocol.md +++ b/roadmap/implementers-guide/src/node/collators/collator-protocol.md @@ -27,31 +27,7 @@ This network protocol uses the `Collation` peer-set of the [`NetworkBridge`][NB] ```rust type RequestId = u64; -/// A message to our guarded validator, when acting as a sentry node. -enum ToOurValidatorMessage { - /// Forward an advertised collation to our validator. - AdvertisedCollation(Hash, CollatorId, ParaId), - /// A requested collation. `None` if the collator didn't provide it. - RequestedCollation(RequestId, Hash, Option<(CandidateReceipt, PoV)>), -} - -/// A message to our sentry node, when a validator. -enum ToOurSentryMessage { - /// Request a collation of the specific collator/validator pair via the - /// sentry. - RequestCollation(RequestId, Hash, CollatorId, ParaId), - /// Blacklist a collator and the peer representing it. - BlacklistCollator(CollatorId), - /// Note a good collation from a collator. - NoteGoodCollation(CollatorId), -} - enum WireMessage { - /// A wire message to our validator. - ToOurValidator(ToOurValidatorMessage), - /// A wire message to our sentry. - ToOurSentry(ToOurSentryMessage), - /// Declare the intent to advertise collations under a collator ID. Declare(CollatorId), /// Advertise a collation to a validator. 
Can only be sent once the peer has declared @@ -64,24 +40,20 @@ enum WireMessage { } ``` -One of the main necessities of this protocol is to deal with the validator/sentry node duality. Validators aren't expected to expose their node to public connections and instead expose sentry nodes on their behalf. These sentry nodes serve as relays, with protocol-level validation being done to prevent spam. - Since this protocol functions both for validators and collators, it is easiest to go through the protocol actions for each of them separately. -Validators, Collators, and sentry nodes: +Validators and collators. ```dot process digraph { c1 [shape=MSquare, label="Collator 1"]; c2 [shape=MSquare, label="Collator 2"]; - s1 [label = "Sentry Node"]; - v1 [shape=MSquare, label="Validator 1"]; v2 [shape=MSquare, label="Validator 2"]; - c1 -> s1 -> v1; - c2 -> s1; + c1 -> v1; c1 -> v2; + c2 -> v2; } ``` @@ -93,7 +65,7 @@ We keep track of the Para ID we are collating on as a collator. This starts as ` As with most other subsystems, we track the active leaves set by following `ActiveLeavesUpdate` signals. -For the purposes of actually distributing a collation, we need to be connected to the validators who are interested in collations on that `ParaId` at this point in time or sentry nodes that represent them. We assume that there is a discovery API for connecting to a set of validators. +For the purposes of actually distributing a collation, we need to be connected to the validators who are interested in collations on that `ParaId` at this point in time. We assume that there is a discovery API for connecting to a set of validators. > TODO: design & expose the discovery API not just for connecting to such peers but also to determine which of our current peers are validators. 
@@ -104,15 +76,13 @@ As seen in the [Scheduler Module][SCH] of the runtime, validator groups are fixe Once connected to the relevant peers for the current group assigned to the core (transitively, the para), advertise the collation to any of them which advertise the relay-parent in their view (as provided by the [Network Bridge][NB]). If any respond with a request for the full collation, provide it. Upon receiving a view update from any of these peers which includes a relay-parent for which we have a collation that they will find relevant, advertise the collation to them if we haven't already. -### Validators and Sentry Nodes - -Validators are not required to run with sentry nodes, so the code here needs to handle both the case where we run with and without. +### Validators -One of the main challenges of running with a sentry node is making sure that the state of the sentry is synchronized with the state of the validator node. We use `View` updates for that purpose. Sentry nodes are responsible for forwarding advertisements, requests, and responses to peers. +On the validator side of the protocol, validators need to accept incoming connections from collators. They should keep some peer slots open for accepting new speculative connections from collators and should disconnect from collators who are not relevant. 
```dot process digraph G { - label = "Providing Collation via Sentry Node"; + label = "Declaring, advertising, and providing collations"; labelloc = "t"; rankdir = LR; @@ -124,14 +94,6 @@ digraph G { c1, c2 [label = ""]; } - subgraph cluster_sentry { - rank = same; - label = "Sentry"; - graph[style = border]; - - s1, s2, s3, s4 [rank = same, label = ""]; - } - subgraph cluster_validator { rank = same; label = "Validator"; @@ -140,86 +102,25 @@ digraph G { v1, v2 [label = ""]; } - c1 -> s1 -> v1 [label = "Advertise"]; + c1 -> v1 [label = "Declare and advertise"]; - v1 -> s2 -> c2 [label = "Request"]; + v1 -> c2 [label = "Request"]; - c2 -> s3 [xlabel = "Provide"]; - s3 -> v2 [label = "Provide"]; + c2 -> v2 [label = "Provide"]; - v2 -> s4 [xlabel = "Note Good/Bad"]; + v2 -> v2 [label = "Note Good/Bad"]; } ``` -At any step in the above diagram, the sentry node will be doing validation of the network statements sent by the collator and can report or disconnect the collator. - When peers connect to us, they can `Declare` that they represent a collator with given public key. Once they've declared that, they can begin to send advertisements of collations. The peers should not send us any advertisements for collations that are on a relay-parent outside of our view. -The protocol tracks advertisements received and the source of the advertisement. The advertisement source is either `Direct(PeerId)` or `Sentry(PeerId)`. We accept one advertisement per collator per source per relay-parent. - -If we're a sentry node, we relay advertisements to our guarded validator, but only those which reference a relay-parent in both our and our validator's view. This means that when we get a view update from our validator, we may need to forward it advertisements. - -If we're a validator, we will receive advertisements from all of our sentries. Although it's not expected, we may also receive advertisements directly from connected peers. 
This may occur when we are connected to a collator as a reserved peer. +The protocol tracks advertisements received and the source of the advertisement. The advertisement source is the `PeerId` of the peer who sent the message. We accept one advertisement per collator per source per relay-parent. As a validator, we will handle requests from other subsystems to fetch a collation on a specific `ParaId` and relay-parent. These requests are made with the [`CollatorProtocolMessage`][CPM]`::FetchCollation`. To do so, we need to first check if we have already gathered a collation on that `ParaId` and relay-parent. If not, we need to select one of the advertisements and issue a request for it. If we've already issued a request, we shouldn't issue another one until the first has returned. -The type of request we issue depends on the source of the advertisement. The advertisement may have a `Direct` source, in which case we issue a `WireMessage::RequestCollation`, or it may have a `Sentry` source, in which case we issue a `WireMessage::ToOurSentry::RequestCollation`. If the request times out, we need to note the collator as being unreliable and reduce its priority relative to other collators. And then make another request - until we get a response. - -If we receive a `ToOurSentry::RequestCollation` as a sentry, we need to act as a proxy and issue a request to the peer who made the particular advertisement being referenced. If it times out or the peer disconnects or sends invalid data, we need to respond to our validator with `ToOurValidator::RequestedCollation(..., None)`. If the request comes from a peer who is not our validator, we need to report and disconnect the peer, as this is a clear breach of protocol. - -As a validator, once the collation has been fetched some other subsystem will inspect and do deeper validation of the collation. The subsystem will report to this subsystem with a [`CollatorProtocolMessage`][CPM]`::ReportCollator` or `NoteGoodCollation` message. 
In that case, if we are connected directly to the collator, we apply a cost to the `PeerId` associated with the collator. Otherwise, if we're connected to the collator via a sentry node, we issue a `ToOurSentryMessage` of the corresponding type. As the recipient of such a message on a sentry node, we carry out the requisite action. When handling a report that a collator is bad, we'd want to cancel all requests to that collator. - -### Validator and Sentry nodes: Practicalities - -> Note: everything below this point is tentative and subject to change. +When acting on an advertisement, we issue a `WireMessage::RequestCollation`. If the request times out, we need to note the collator as being unreliable and reduce its priority relative to other collators. And then make another request - repeat until we get a response or the chain has moved on. -We use the `NetworkBridgeUpdate::OurViewChange` to track which heads we consider in our active leaves set and have communicated to peers. We ensure we keep records for all active leaves, mutating based on incoming `ActiveLeavesUpdate`s. The records have this form: - -```rust -struct CollationAdvertisements { - received_advertisements: Set<(Collator, Source)>, - advertisements: Map>>, -} -``` - -We also keep a record of what we are fetching for each leaf/relay-parent: - -```rust -struct CollationFetch { - // Fetch requests from subsystems - fetch_requests: Map, - fetch_inflight: Map, - fetched_candidates: Map, -} - -enum FetchState { - Pending([fn(CandidateReceipt, PoV)]), - Inflight(RequestId), - Completed(Collator, Source), -} - -struct FetchMetadata { - Collator, - Source, - ParaId, -} - -enum FetchDest { - Subsystem(fn(CandidateReceipt, PoV)), - Validator(RequestId, PeerId), -} -``` - -We also keep a record per-peer: - -```rust -struct PeerData { - role: ObservedRole, - live_requests: Map, // relay-parent the request is under. 
- collator_id: Option, - view: View, -} -``` +As a validator, once the collation has been fetched some other subsystem will inspect and do deeper validation of the collation. The subsystem will report to this subsystem with a [`CollatorProtocolMessage`][CPM]`::ReportCollator` or `NoteGoodCollation` message. In that case, if we are connected directly to the collator, we apply a cost to the `PeerId` associated with the collator and potentially disconnect or blacklist it. [PoV]: ../../types/availability.md#proofofvalidity [CPM]: ../../types/overseer-protocol.md#collatorprotocolmessage