-
Notifications
You must be signed in to change notification settings - Fork 379
Explicit recovery for announced candidate blocks #1891
Explicit recovery for announced candidate blocks #1891
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recover_candidate
still calls pending_candidate.remove
. This shouldn't be done.
Co-authored-by: Bastian Köcher <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All review suggestions should have been addressed.
Regarding the renaming of pending_candidates
I went for just candidates
.
The semantic here is that these are blocks announced as backed by the relay chain and that are candidates of becoming part of the (finalized) canonical blocks for the relay chain.
Candidates are removed from the list when are finalized
@@ -187,6 +214,68 @@ fn follow_new_best_works() { | |||
}); | |||
} | |||
|
|||
#[test] | |||
fn follow_new_best_with_dummy_recovery_works() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test to excercise explicit best block recovery.
Even though currently we are using as a recovery mechanism the PoVRecovery
subsystem, the motivation of this test is to also show how the recovery can be generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
/// Type of recovery to trigger. | ||
#[derive(Debug, PartialEq)] | ||
pub enum RecoveryKind { | ||
/// Single block recovery. | ||
Simple, | ||
/// Full ancestry recovery. | ||
Full, | ||
} | ||
|
||
/// Structure used to trigger an explicit recovery request via `PoVRecovery`. | ||
pub struct RecoveryRequest<Block: BlockT> { | ||
/// Hash of the last block to recover. | ||
pub hash: Block::Hash, | ||
/// Recovery delay range. Randomizing the start of the recovery within this interval | ||
/// can be used to prevent self-DOSing if the recovery request is part of a | ||
/// distributed protocol and there is the possibility that multiple actors are | ||
/// requiring to perform the recovery action at approximately the same time. | ||
pub delay: RecoveryDelay, | ||
/// Recovery type. | ||
pub kind: RecoveryKind, | ||
} | ||
|
||
/// The delay between observing an unknown block and triggering the recovery of a block. | ||
#[derive(Clone, Copy)] | ||
pub struct RecoveryDelay { | ||
/// Start recovering after `min` delay. | ||
pub min: Duration, | ||
/// Start recovering before `max` delay. | ||
pub max: Duration, | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to have them defined here? I think they should stay in pov-recovery
and then we import this crate in consensus/common
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment not really, as you correctly suggested we can just import pov-recovery
in consensus/common
.
The rationale behind moving the basic types of block recovery out of PoCRecovery was to abstract the recovery mechanism.
When the consensus code finds that there is a new best it will trigger a recovery by sending a message (RecoveryRequest
) via an mpsc channel. The recovery action can then potentially happen via any mean (dummy examples: torrent, ipfs, a distributed filesystem, local filesystem, ...).
I know, this is not the case now. But in this way we don't bound cumulus user of our consensus to one possibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I know, but I still think this is fine for now. I also know that these primitive crates are really not expressive, but this one should not host these things. However, in general I don't see any need right now to make all this stuff pluggable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I will put these structs it in the pov-recovery
crate in the other PR
candidate.waiting_recovery = true; | ||
to_recover.push(hash); | ||
}, | ||
Ok(_) => break true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok(_) => break true, | |
Ok(BlockStatus::InChain) => break true, |
Seeing this error: |
Actually tested with the base branch, not this PR. Logs here: https://grafana.parity-mgmt.parity.io/goto/2evUx3FVk?orgId=1 |
@sandreim you fucked up the master merge! 🙈 |
Damn, I'll revert this, and force push ... |
The CI pipeline was cancelled due to failure one of the required jobs. |
90b4b8e
to
fa1e0e8
Compare
This PR augments #1559 with block recovery via PoV.
(target branch has been set to #1559 branch in order to better highlight the diff and have a more focused conversation)
A recovery channel end is passed to the parachain consensus worker and it is used to trigger/stimulate block recovery whenever the best parachain block is not locally present yet.
The rx side of the channel can be whatever and not strictly necessarily the PoV recovery mechanism.
This code is very unlikely from being triggered in practice. But better safe than sorry.
The tests were not aligned yet to some of the new interfaces, so CI failure is expected. Feedback first :-)