Revisit Collator Selection Mechanism #1159

joepetrowski · 2022-04-09T05:33:39Z

The Collator Selection pallet was a v0 to use Aura in Statemine at launch, but hasn't really evolved in any way. With the move of more important functionality into common good parachains, we should re-evaluate how collators are selected, rewarded, and punished.

Notes from @burdges:

Ain't so easy to use DOT staking on a parachain because if adversaries could choose between being a validator or a collator then adversaries could monopolize being collators so they could capture the parachain.

We'd prevent permanent capture with erasure-coded state checkpoints, but even temporary capture makes essential parachains tricky.

It's avoidable if parachains have minimal state aside from their blocks, like elections.

We should consider what happens to the treasury, or even staking, if they become unavailable for an epoch or so due to capture. If that's fine, then we could randomly churn some nodes into being collators somehow, and ensure they could join using erasure-coded checkpoints.
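One way to picture the random churn idea is the sketch below. It is only an illustration under assumptions: the function name `churn_collators`, the use of a per-epoch randomness value, and the BLAKE2b ranking are all hypothetical and not part of any existing pallet.

```python
import hashlib


def churn_collators(nodes, epoch_randomness: bytes, count: int):
    """Pick `count` nodes for this epoch (hypothetical helper).

    Each node gets a ticket H(epoch_randomness || node); the nodes with
    the smallest tickets become collators. Nobody can choose their own
    ticket, and every honest party computes the same result.
    """
    def ticket(node: str) -> bytes:
        data = epoch_randomness + node.encode()
        return hashlib.blake2b(data, digest_size=16).digest()

    # Sorting by ticket makes the outcome independent of input order.
    return sorted(nodes, key=ticket)[:count]


if __name__ == "__main__":
    staked_nodes = ["n1", "n2", "n3", "n4", "n5"]
    print(churn_collators(staked_nodes, b"epoch-1", 2))
```

The selected nodes would still need a way to obtain the parachain state before they can author, which is where the checkpoints come in.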

joepetrowski · 2022-04-09T05:37:31Z

> Ain't so easy to use DOT staking on a parachain because if adversaries could choose between being a validator or a collator then adversaries could monopolize being collators so they could capture the parachain.

Right now the only response to that is via Invulnerables, collators set by governance.
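As a rough sketch of how Invulnerables act as a floor under the collator set: the governance-set accounts are always included, and only the remaining seats are open to bonded candidates. The names `Candidate`, `select_collators` and `desired_candidates` are illustrative assumptions here, not the pallet's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    account: str
    bond: int  # hypothetical deposit, arbitrary units


def select_collators(invulnerables, candidates, desired_candidates):
    """Return the collator accounts for the next session (sketch).

    Invulnerables always come first. Up to `desired_candidates` extra
    seats go to the highest-bonded candidates not already selected.
    """
    selected = list(dict.fromkeys(invulnerables))  # de-duplicate, keep order
    taken = set(selected)
    # Highest bond first; ties broken by account name for determinism.
    ranked = sorted(candidates, key=lambda c: (-c.bond, c.account))
    open_seats = desired_candidates
    for cand in ranked:
        if open_seats == 0:
            break
        if cand.account in taken:
            continue
        selected.append(cand.account)
        taken.add(cand.account)
        open_seats -= 1
    return selected
```

However much an adversary bonds, it can only fill the open seats, so at least the governance-chosen collators keep holding state and producing blocks.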

bkchr · 2022-04-09T06:32:21Z

Why do we need to use Aura anyway for common good parachains? We could just use the "relay chain provided" consensus, aka free for all. Then there could not be a set of only malicious collators that isn't producing any blocks.

burdges · 2022-04-09T07:44:45Z

We obtain soundness from the relay chain, like all parachains, but we do not believe the relay chain really provides liveness. In other words, we also need collators who maintain the full parachain state, and gossip network, separately from the relay chain, and then continue making blocks that include tx honestly enough.

If a parachain rents a slot, then we deem liveness to be their problem, not ours, although we could help via various mechanisms. We're responsible for liveness of system parachains however, especially any essential ones.

We've worse yolo elsewhere in the system so we could deploy system parachains now without every desired liveness defense. We'll want different amounts or flavors of liveness defenses for different system parachains, based upon how critical they wind up being.

We've an absolutely huge design space here. We/I never charted out which defenses look stronger, orthogonal, synergistic, etc., like I've done elsewhere (sassafras, etc.). Ideas:

We've complete capture with only malicious collators vs incomplete capture in which honest collators complain quickly. We don't know how much incomplete capture should be handled via automated vs manual defenses, but complete capture kinda sucks because we could lose the underlying parachain state completely after 24-48 hours.

We risk complete capture if we just let a minority choose to be collators, and drive off everyone else. We thus avoid complete capture by voting sensibly. Invulnerables being set by relay chain governance sounds helpful against complete capture, and likely this suffices right now. It's similarly helpful if all relay chain stake votes for collators, not only stake not voting for validators, but we do not want 1000 collators per system parachain, so either they elect into some pool, from which we assign them to system parachains in a safe manner, or else the election winds up fairly different.

We risk incomplete capture if malicious validators and collators prevent honest collators from maintaining the full parachain state. We know honest collators held the full parachain state recently, so they could catch up by fetching blocks from backers or approval checkers, or doing availability recovery. We must know how quickly they could catch up via these mechanisms however. We're happy if fast enough to track the chain state, but we'll need more complexity if the parachain moves more slowly.
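To make "fast enough" concrete, here is a back-of-the-envelope sketch. It assumes the chain produces one parablock every `block_time_s` seconds and that a lagging collator can recover and import one block every `recovery_time_s` seconds; both parameters and the function name are hypothetical.

```python
def catch_up_seconds(blocks_behind, block_time_s, recovery_time_s):
    """Seconds until a lagging collator reaches the tip, or None if never.

    The backlog shrinks at (1/recovery_time_s - 1/block_time_s) blocks
    per second, so the collator only catches up when it recovers blocks
    faster than the chain produces them.
    """
    if blocks_behind <= 0:
        return 0.0
    if recovery_time_s >= block_time_s:
        return None  # the backlog never shrinks
    net_rate = 1.0 / recovery_time_s - 1.0 / block_time_s
    return blocks_behind / net_rate


if __name__ == "__main__":
    # 100 blocks behind, 12 s block time, 4 s per recovered block.
    print(catch_up_seconds(100, 12.0, 4.0))
    # Same backlog, but the parachain only gets every second slot.
    print(catch_up_seconds(100, 24.0, 4.0))
```

With these made-up numbers, halving the parachain's block rate cuts the catch-up time from about 600 s to about 480 s, and a collator that needs 12 s or more per block never catches up on a 12 s chain.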

It's likely our cleanest liveness solution for system parachains goes:

1. Elect collators using all staked DOT, initially via whatever methods, but later via voting similar to validators. We'll add some pooling design even later, which closely resembles multiple relay chains.
2. Ensure collators track the parachain fast enough, even if they only recover from availability, approval checkers, etc. If necessary, we slow down system parachains by making them alternate slots on shared cores.
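Alternating slots on a shared core could look like the round-robin sketch below. The function name and the para IDs in the example are placeholders, and a real scheduler would involve far more than a modulo.

```python
def para_for_slot(core_paras, relay_block_number):
    """Return which parachain may use the shared core at this relay block.

    `core_paras` is the ordered list of parachains sharing one core; they
    take turns, so each gets one slot every len(core_paras) relay blocks.
    """
    if not core_paras:
        raise ValueError("a shared core needs at least one parachain")
    return core_paras[relay_block_number % len(core_paras)]


if __name__ == "__main__":
    shared = [1000, 1001]  # placeholder para IDs
    for block in range(4):
        print(block, para_for_slot(shared, block))
```

With two parachains on one core, each produces at half the rate, which gives honest collators twice as long per block to recover state from availability.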

bkchr · 2022-04-10T06:40:37Z

I repeat myself again, but why do we need to use any authoring logic that has a fixed set of collators assigned? We can just let anyone that is interested build blocks and then let the relay chain decide which one gets included.

xlc · 2022-04-10T07:00:43Z

Wouldn't multiple forks cause performance issues or increase the time required to reach consensus?

burdges · 2022-04-10T10:47:19Z

We must limit the number of parablocks proposed for each slot because the relay chain has limited backer resources. I'd expect unconstrained parablock production to first become proof-of-fastest-network, and then become pure kickbacks or favoritism by backers.
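A toy sketch of such a limit, assuming a backer simply caps how many candidates it will look at per parachain per slot (the class `SlotLimiter` and its parameters are hypothetical):

```python
from collections import defaultdict


class SlotLimiter:
    """Admit at most `max_per_slot` candidates per (para_id, slot)."""

    def __init__(self, max_per_slot):
        self.max_per_slot = max_per_slot
        self._seen = defaultdict(int)  # (para_id, slot) -> admitted count

    def admit(self, para_id, slot):
        key = (para_id, slot)
        if self._seen[key] >= self.max_per_slot:
            return False  # budget for this slot is spent
        self._seen[key] += 1
        return True


if __name__ == "__main__":
    limiter = SlotLimiter(max_per_slot=1)
    print(limiter.admit(1000, 7))  # first arrival is admitted
    print(limiter.admit(1000, 7))  # later arrivals in the same slot are not
```

Note that a first-come-first-served cap like this is exactly what rewards whoever reaches the backer first, which is the proof-of-fastest-network outcome.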

As discussed in #2203, we've no efficient XCMP message transport for any proof-of-work design, including proof-of-fastest-network. It'll impact parablock transport too, even if done via availability recovery.

In principle, I think kickbacks or favoritism by backers provide acceptable properties from several perspectives, assuming (a) other collators fetch blocks via availability recovery, and (b) backing groups rotate quickly enough. Yet, favoritism reflects poorly upon polkadot, interacts poorly with XCMP & components, encourages further attacks among collators, and overall creates a problematic mercenary attitude.

If we do expect collators to run availability recovery, then maybe we should separate the actual parablock from the PoV extensions in availability?

We'll ultimately want most parachains to produce blocks using sassafras, because doing so should reduce forks and sassafras can operate without mempools, which helps parachains save bandwidth and CPU time, reduces MEV, and enables block-producer-level privacy tricks.

Polkadot-Forum · 2022-11-08T07:05:05Z

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/economic-model-for-system-para-collators/1010/1

joepetrowski · 2023-08-23T05:21:56Z

Addressed more concretely in the roadmap and linked issues.

joepetrowski added the F8-enhancement 🎁 label Apr 9, 2022

joepetrowski mentioned this issue Apr 9, 2022

META: Moving the Treasury Off the Relay Chain paritytech/polkadot-sdk#98

Open

the-right-joyce added the J0-enhancement (An additional feature request.) label and removed the F8-enhancement 🎁 label Aug 12, 2022

joepetrowski mentioned this issue Jun 25, 2023

System Parachain Collator Decentralization paritytech/roadmap#34

Closed

joepetrowski closed this as completed Aug 23, 2023
