This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Revisit Collator Selection Mechanism #1159

Closed
joepetrowski opened this issue Apr 9, 2022 · 8 comments
Labels
J0-enhancement An additional feature request.

Comments

@joepetrowski
Contributor

The Collator Selection pallet was a v0 to use Aura in Statemine at launch, but hasn't really evolved in any way. With the move of more important functionality into common good parachains, we should re-evaluate how collators are selected, rewarded, and punished.

Notes from @burdges:

Ain't so easy to use DOT staking on a parachain because if adversaries could choose between being a validator or a collator then adversaries could monopolize being collators so they could capture the parachain.

We'd prevent permanent capture with erasure-coded state checkpoints, but even temporary capture makes essential parachains tricky.

It's avoidable if parachains have minimal state aside from their blocks, like elections.

We should consider what happens to the treasury, or even staking, if they become unavailable for an epoch or so due to capture. If that's fine, then we could randomly churn some nodes into being collators somehow, and ensure they could join using erasure-coded checkpoints.

@joepetrowski
Contributor Author

> Ain't so easy to use DOT staking on a parachain because if adversaries could choose between being a validator or a collator then adversaries could monopolize being collators so they could capture the parachain.

Right now the only response to that is via Invulnerables, collators set by governance.

@bkchr
Member

bkchr commented Apr 9, 2022

Why do we need to use Aura anyway for common good parachains? We could just use the "relay chain provided" consensus, aka free for all. Then there could not be any set of only malicious collators that isn't producing any blocks.
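
To make the contrast concrete, here is a minimal sketch, not the actual pallet or client code. It assumes Aura's usual round-robin rule (the author of a slot is the authority at index `slot % authorities.len()`) and models "free for all" in the simplest possible way, ignoring any limits on the relay chain side. The helpers `aura_slot_is_live` and `free_for_all_slot_is_live` and the collator names are hypothetical.

```rust
/// Aura-style round-robin: the author of a slot is fixed in advance by
/// the slot number and the current authority set.
fn aura_author<'a>(slot: u64, authorities: &[&'a str]) -> Option<&'a str> {
    if authorities.is_empty() {
        return None;
    }
    let idx = (slot % authorities.len() as u64) as usize;
    Some(authorities[idx])
}

/// Hypothetical helper: under Aura a slot only gets an honest block if
/// its designated author is honest.
fn aura_slot_is_live(slot: u64, authorities: &[&str], honest: &[&str]) -> bool {
    match aura_author(slot, authorities) {
        Some(author) => honest.iter().any(|h| *h == author),
        None => false,
    }
}

/// Hypothetical helper: in the simplified free-for-all model anyone may
/// propose, so one honest collator anywhere is enough for every slot.
fn free_for_all_slot_is_live(honest: &[&str]) -> bool {
    !honest.is_empty()
}
```

With an authority set of only malicious collators, `aura_slot_is_live` is false for every slot, while `free_for_all_slot_is_live` stays true as long as at least one honest collator exists outside the set.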

@burdges

burdges commented Apr 9, 2022

We obtain soundness from the relay chain, like all parachains, but we do not believe the relay chain really provides liveness. In other words, we also need collators who maintain the full parachain state, and gossip network, separately from the relay chain, and then continue making blocks that include tx honestly enough.

If a parachain rents a slot, then we deem liveness to be their problem, not ours, although we could help via various mechanisms. We're responsible for liveness of system parachains however, especially any essential ones.

We've worse yolo elsewhere in the system so we could deploy system parachains now without every desired liveness defense. We'll want different amounts or flavors of liveness defenses for different system parachains, based upon how critical they wind up being.

We've an absolutely huge design space here. We/I never charted out which defenses look stronger, orthogonal, synergistic, etc., like I've done elsewhere (sassafras, etc.). Ideas:

We've complete capture with only malicious collators vs incomplete capture in which honest collators complain quickly. We donno how much incomplete capture should be handled via automated vs manual defenses, but complete capture kinda sucks because we could lose the underlying parachain state completely after 24-48 hours.

We risk complete capture if we just let a minority choose to be collators, and drive off everyone else. We thus avoid complete capture by voting sensibly. Invulnerables being set by relay chain governance sounds helpful against complete capture, and likely this suffices right now. It's similarly helpful if all relay chain stake votes for collators, not only stake not voting for validators, but we do not want 1000 collators per system parachain, so either they elect into some pool, from which we assign them to system parachains in a safe manner, or else the election winds up fairly different.

We risk incomplete capture if malicious validators and collators prevent honest collators from maintaining the full parachain state. We know honest collators held the full parachain state recently, so they could catch up by fetching blocks from backers or approval checkers, or doing availability recovery. We must know how quickly they could catch up via these mechanisms however. We're happy if fast enough to track the chain state, but we'll need more complexity if the parachain moves more slowly.

It's likely our cleanest liveness solution for system parachains goes:

  1. Elect collators using all staked dots, initially via whatever methods, but later via voting similar to validators. We'll add some pooling design even later, which closely resembles multiple relay chains.
  2. Ensure collators track the parachain fast enough, even if they only recover from availability, approval checkers, etc. If necessary, we slow down system parachains by making them alternate slots on shared cores.
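
As a rough illustration of point 2, the "fast enough" condition can be sketched as simple rate arithmetic. This is only a sketch under stated assumptions: block production and recovery are modelled as constant rates in blocks per second, and the function names and the "two chains alternate on one core halves the rate" model are hypothetical, not taken from any Polkadot code.

```rust
/// Seconds a lagging collator needs to reach the chain tip, or `None`
/// if it never can. Assumes constant rates in blocks per second.
fn catch_up_secs(lag_blocks: f64, recovery_rate: f64, production_rate: f64) -> Option<f64> {
    if lag_blocks <= 0.0 {
        return Some(0.0); // already at the tip
    }
    // The gap only closes if recovery outpaces new block production.
    let net_rate = recovery_rate - production_rate;
    if net_rate <= 0.0 {
        None
    } else {
        Some(lag_blocks / net_rate)
    }
}

/// Hypothetical model of "alternate slots on shared cores": two chains
/// sharing one core each produce blocks at half the original rate.
fn rate_when_alternating(production_rate: f64) -> f64 {
    production_rate / 2.0
}
```

For example, a collator 100 blocks behind that recovers 0.5 blocks per second never catches a chain producing 0.5 blocks per second, but catches up in 400 seconds once that chain alternates slots and produces 0.25 blocks per second.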

@bkchr
Member

bkchr commented Apr 10, 2022

I repeat myself again, but why do we need to use any authoring logic that has a fixed set of collators assigned? We can just let anyone that is interested build blocks and then let the relay chain decide which one gets included.

@xlc
Contributor

xlc commented Apr 10, 2022

Wouldn't multiple forks cause performance issues or increase the time required to reach consensus?

@burdges

burdges commented Apr 10, 2022

We must limit the number of parablocks proposed for each slot because the relay chain has limited backer resources. I'd expect unconstrained parablock production to first become proof-of-fastest-network, and then become pure kickbacks or favoritism by backers.

As discussed in #2203 we've no efficient XCMP message transport for any proof-of-work design, including proof-of-fastest-network. It'll impact parablock transport too, even if done via availability recovery.

In principle, I think kickbacks or favoritism by backers provide acceptable properties from several perspectives, assuming (a) other collators fetch blocks via availability recovery, and (b) backing groups rotate quickly enough. Yet, favoritism reflects poorly upon Polkadot, interacts poorly with XCMP & components, encourages further attacks among collators, and overall creates a problematic mercenary attitude.


If we do expect collators to run availability recovery, then maybe we should separate the actual parablock from the PoV extensions in availability?


We'll ultimately want most parachains to produce blocks using sassafras because doing so should reduce forks and sassafras can operate without mempools, which helps parachains save bandwidth and CPU time, reduces MEV, and enables block-producer-level privacy tricks.

@the-right-joyce added the J0-enhancement (An additional feature request.) label and removed the F8-enhancement 🎁 label on Aug 12, 2022
@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/economic-model-for-system-para-collators/1010/1

@joepetrowski
Contributor Author

Addressed more concretely in the roadmap and linked issues.


6 participants