Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Explicit recovery for announced candidate blocks #1891

Merged

Conversation

davxy
Copy link
Member

@davxy davxy commented Nov 17, 2022

This PR augments #1559 with block recovery via PoV.

(target branch has been set to #1559 branch in order to better highlight the diff and have a more focused conversation)


A recovery channel end is passed to the parachain consensus worker and it is used to trigger/stimulate block recovery whenever the best parachain block is not locally present yet.

The rx side of the channel can be whatever and not strictly necessarily the PoV recovery mechanism.

This code is very unlikely from being triggered in practice. But better safe than sorry.


The tests were not aligned yet to some of the new interfaces, so CI failure is expected. Feedback first :-)

@davxy davxy marked this pull request as ready for review November 17, 2022 17:54
@davxy davxy requested review from bkchr, skunert and a team November 17, 2022 17:54
@davxy davxy self-assigned this Nov 17, 2022
@davxy davxy added B0-silent Changes should not be mentioned in any release notes A0-please_review Pull request needs code review. C1-low PR touches the given topic and has a low impact on builders. D2-notlive 💤 PR contains changes in a runtime directory that is not deployed to a chain that requires an audit. labels Nov 17, 2022
@davxy davxy changed the title Implemented explicit recovery for announced candidate blocks Explicit recovery for announced candidate blocks Nov 17, 2022
client/pov-recovery/src/lib.rs Outdated Show resolved Hide resolved
client/pov-recovery/src/lib.rs Show resolved Hide resolved
client/pov-recovery/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Member

@bkchr bkchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recover_candidate still calls pending_candidate.remove. This shouldn't be done.

client/pov-recovery/src/lib.rs Outdated Show resolved Hide resolved
client/pov-recovery/src/lib.rs Outdated Show resolved Hide resolved
client/pov-recovery/src/lib.rs Outdated Show resolved Hide resolved
client/pov-recovery/src/lib.rs Show resolved Hide resolved
client/pov-recovery/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Member Author

@davxy davxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All review suggestions should have been addressed.

Regarding the renaming of pending_candidates I went for just candidates.

The semantic here is that these are blocks announced as backed by the relay chain and that are candidates of becoming part of the (finalized) canonical blocks for the relay chain.

Candidates are removed from the list when are finalized

@@ -187,6 +214,68 @@ fn follow_new_best_works() {
});
}

#[test]
fn follow_new_best_with_dummy_recovery_works() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test to excercise explicit best block recovery.

Even though currently we are using as a recovery mechanism the PoVRecovery subsystem, the motivation of this test is to also show how the recovery can be generic.

@davxy davxy requested review from skunert, bkchr and a team December 3, 2022 19:29
client/pov-recovery/src/lib.rs Outdated Show resolved Hide resolved
client/pov-recovery/src/lib.rs Show resolved Hide resolved
Copy link
Contributor

@skunert skunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

Comment on lines +198 to +228
/// Type of recovery to trigger.
#[derive(Debug, PartialEq)]
pub enum RecoveryKind {
/// Single block recovery.
Simple,
/// Full ancestry recovery.
Full,
}

/// Structure used to trigger an explicit recovery request via `PoVRecovery`.
pub struct RecoveryRequest<Block: BlockT> {
/// Hash of the last block to recover.
pub hash: Block::Hash,
/// Recovery delay range. Randomizing the start of the recovery within this interval
/// can be used to prevent self-DOSing if the recovery request is part of a
/// distributed protocol and there is the possibility that multiple actors are
/// requiring to perform the recovery action at approximately the same time.
pub delay: RecoveryDelay,
/// Recovery type.
pub kind: RecoveryKind,
}

/// The delay between observing an unknown block and triggering the recovery of a block.
#[derive(Clone, Copy)]
pub struct RecoveryDelay {
/// Start recovering after `min` delay.
pub min: Duration,
/// Start recovering before `max` delay.
pub max: Duration,
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to have them defined here? I think they should stay in pov-recovery and then we import this crate in consensus/common

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment not really, as you correctly suggested we can just import pov-recoveryin consensus/common.

The rationale behind moving the basic types of block recovery out of PoCRecovery was to abstract the recovery mechanism.

When the consensus code finds that there is a new best it will trigger a recovery by sending a message (RecoveryRequest) via an mpsc channel. The recovery action can then potentially happen via any mean (dummy examples: torrent, ipfs, a distributed filesystem, local filesystem, ...).

I know, this is not the case now. But in this way we don't bound cumulus user of our consensus to one possibility.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I know, but I still think this is fine for now. I also know that these primitive crates are really not expressive, but this one should not host these things. However, in general I don't see any need right now to make all this stuff pluggable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will put these structs it in the pov-recovery crate in the other PR

candidate.waiting_recovery = true;
to_recover.push(hash);
},
Ok(_) => break true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Ok(_) => break true,
Ok(BlockStatus::InChain) => break true,

client/pov-recovery/src/lib.rs Outdated Show resolved Hide resolved
client/pov-recovery/src/lib.rs Show resolved Hide resolved
@sandreim sandreim requested review from a team and chevdor as code owners December 9, 2022 16:00
@sandreim
Copy link
Contributor

sandreim commented Dec 9, 2022

Seeing this error: [Parachain] XXX Unable getting route to any leaf from 0xeec619ba0097d7aa2c1278574ac6d986cd56d1418429bd2b8bd8528e1ac4d6bf (this is a bug)

@sandreim
Copy link
Contributor

sandreim commented Dec 9, 2022

Seeing this error: [Parachain] XXX Unable getting route to any leaf from 0xeec619ba0097d7aa2c1278574ac6d986cd56d1418429bd2b8bd8528e1ac4d6bf (this is a bug)

Actually tested with the base branch, not this PR. Logs here: https://grafana.parity-mgmt.parity.io/goto/2evUx3FVk?orgId=1

@bkchr
Copy link
Member

bkchr commented Dec 9, 2022

@sandreim you fucked up the master merge! 🙈

@sandreim
Copy link
Contributor

sandreim commented Dec 9, 2022

@sandreim you fucked up the master merge! 🙈

Damn, I'll revert this, and force push ...

@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: test-linux-stable
Logs: https://gitlab.parity.io/parity/mirrors/cumulus/-/jobs/2141586

@davxy davxy force-pushed the davxy/remove-leaves-portable-with-recovery branch from 90b4b8e to fa1e0e8 Compare December 9, 2022 16:53
@davxy davxy merged commit 408580e into davxy/remove-leaves-portable Dec 9, 2022
@davxy davxy deleted the davxy/remove-leaves-portable-with-recovery branch December 9, 2022 16:54
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D2-notlive 💤 PR contains changes in a runtime directory that is not deployed to a chain that requires an audit.
Projects
Status: done
Development

Successfully merging this pull request may close these issues.

5 participants