Lookahead collator: do not try to build upon an unknown block #4036

s0me0ne-unkn0wn · 2024-04-09T08:02:51Z

No description provided.

skunert · 2024-04-09T08:13:12Z

cumulus/client/consensus/aura/src/collators/lookahead.rs

@@ -315,13 +315,14 @@ where
 			let mut parent_header = initial_parent.header;
 			let overseer_handle = &mut params.overseer_handle;

-			// We mainly call this to inform users at genesis if there is a mismatch with the
-			// on-chain data.
-			collator.collator_service().check_block_status(parent_hash, &parent_header);


Could we not directly call continue here? Why move it to the loop below?
Parent hash and parent header below will only change if we built a new block ourselves, so that should always return true in subsequent calls anyway (or I am missing something).

parent_hash and parent_header get updated at the bottom of the inner loop. On the other hand, they are updated to values we get after a successful collation, which are unlikely to be unknown or bad 🤔 Need to think a bit more, thanks for pointing out!

Exactly, that is what I meant. We build a block successfully and set the parent to it. But since we built it ourselves, it will for sure be in-chain. If the first parent itself is messed up, we will abort before entering the 0..2 loop.
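
The argument above can be sketched with a toy model. This is not the cumulus code: `LocalChain`, `build_on` and `run_slot` are hypothetical names, and block hashes are plain integers. It only illustrates why checking the initial parent once per slot is enough when every later parent is a block we built and imported ourselves.

```rust
use std::collections::HashSet;

/// Toy stand-in for the collator's local database (hypothetical type).
struct LocalChain {
    known: HashSet<u64>,
    next_hash: u64,
}

impl LocalChain {
    fn new(known: &[u64]) -> Self {
        let next_hash = known.iter().copied().max().unwrap_or(0) + 1;
        Self { known: known.iter().copied().collect(), next_hash }
    }

    /// Simplified stand-in for `check_block_status`: true if the block is known locally.
    fn check_block_status(&self, hash: u64) -> bool {
        self.known.contains(&hash)
    }

    /// "Builds" a child of `parent`, imports it locally, and returns its hash.
    fn build_on(&mut self, parent: u64) -> u64 {
        debug_assert!(self.known.contains(&parent));
        let hash = self.next_hash;
        self.next_hash += 1;
        self.known.insert(hash);
        hash
    }
}

/// One slot: check the initial parent once, then build up to two blocks.
/// Returns the hashes of the blocks built in this slot.
fn run_slot(chain: &mut LocalChain, initial_parent: u64) -> Vec<u64> {
    // Skip the whole slot if the initial parent is unknown locally.
    if !chain.check_block_status(initial_parent) {
        return Vec::new();
    }
    let mut parent = initial_parent;
    let mut built = Vec::new();
    for _n_built in 0..2 {
        // `parent` is either the checked initial parent or a block we just
        // built and imported, so a second status check here would always pass.
        let new_block = chain.build_on(parent);
        built.push(new_block);
        parent = new_block;
    }
    built
}
```

With known blocks `[1, 2, 3]`, `run_slot(&mut chain, 3)` builds two blocks on top of block 3, while `run_slot(&mut chain, 99)` builds nothing because block 99 is unknown.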

sandreim · 2024-04-09T08:56:41Z

cumulus/client/consensus/aura/src/collators/lookahead.rs

 			// This needs to change to support elastic scaling, but for continuously
 			// scheduled chains this ensures that the backlog will grow steadily.
 			for n_built in 0..2 {
+				// Do not try to build upon an unknown, pruned or bad block
+				if !collator.collator_service().check_block_status(parent_hash, &parent_header) {
+					break;


Does this happen often? I'd still add a trace here just in case.

Not often at all, and check_block_status() has very detailed debug-level messages inside.

sandreim · 2024-04-09T11:03:41Z

I think it depends on the exact reason for the block not being present. If for some reason we didn't see it yet but it's present on the relay chain, we should just fetch it or expect that it is announced by another collator. @skunert does this make any sense?

skunert · 2024-04-09T12:02:17Z

I did not think enough about this the first time I checked here. I am not sure how this scenario should even occur. The parent we are checking in the code was found via find_potential_parents, which starts at the included block and checks all child branches to find a parent. So parents that we find should generally be in the local db. @s0me0ne-unkn0wn did you see this problem in the wild or was it added as a precaution?

So if we see a new pending para block in the relay chain and it is not announced anywhere soon, we will automatically start fetching it with the pov-recovery mechanism. This already happens.

Edit: Aah, we assume that included and pending are locally available, which might not be the case. So the change makes sense.
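
To make the scenario concrete, here is a minimal sketch under stated assumptions. It is not the real find_potential_parents: `potential_parents` and `buildable_parents` are hypothetical helpers, hashes are plain integers, and the child map is a toy. It walks the child branches starting at the included block and then filters the candidates by local availability, which shows how the included block itself can be a candidate without being in the local db.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Toy model: `children` maps a block hash to the hashes of its known children.
/// Walks breadth-first from `included`, down to `max_depth` levels below it.
fn potential_parents(
    included: u64,
    children: &HashMap<u64, Vec<u64>>,
    max_depth: usize,
) -> Vec<u64> {
    let mut out = Vec::new();
    let mut queue = VecDeque::from([(included, 0usize)]);
    while let Some((hash, depth)) = queue.pop_front() {
        out.push(hash);
        if depth == max_depth {
            continue;
        }
        for &child in children.get(&hash).into_iter().flatten() {
            queue.push_back((child, depth + 1));
        }
    }
    out
}

/// Keeps only the candidates that are actually present in the local database.
fn buildable_parents(candidates: &[u64], local_db: &HashSet<u64>) -> Vec<u64> {
    candidates.iter().copied().filter(|h| local_db.contains(h)).collect()
}
```

In this toy model, the included block is always returned as a candidate because it is the starting point of the walk, so the availability filter is the only thing that prevents building on it when it has not been imported yet.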

s0me0ne-unkn0wn · 2024-04-09T13:44:35Z

@sandreim, if you could point me to the interfaces that would help me achieve that, it could be a great addition. Still, I feel like it is better to have it as a follow-up. Testing the soundness of such a solution could be a real hell IIUC 😟

@skunert I came across that during the live testing on Kusama (paritytech/devops#3261). It looked like this:

skunert · 2024-04-09T14:09:43Z

Okay, makes sense. We see in the logs that this block was imported right after we errored. So yeah, in that situation it makes sense to skip.

bkchr · 2024-04-09T18:45:30Z

Generally weird that the block didn't make it over the parachain network to the collator in time. However, this stuff is not predictable. Also, given that there are now that many different collator implementations running together, it is more complicated to reason about what happened.

s0me0ne-unkn0wn · 2024-04-11T13:32:23Z

Merged into #3630

Do not try to build upon an unknown block

b4a6e13

s0me0ne-unkn0wn added the labels R0-silent (Changes should not be mentioned in any release notes) and T9-cumulus (This PR/Issue is related to cumulus.) Apr 9, 2024

s0me0ne-unkn0wn requested review from eskimor, skunert, bkchr and alexggh April 9, 2024 08:02

skunert reviewed Apr 9, 2024

skunert approved these changes Apr 9, 2024

alexggh approved these changes Apr 9, 2024

sandreim approved these changes Apr 9, 2024

Move check into outer loop

73469a8

bkchr approved these changes Apr 9, 2024

bkchr added this pull request to the merge queue Apr 9, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 9, 2024

s0me0ne-unkn0wn added 2 commits April 10, 2024 00:38

Merge branch 'master' into s0me0ne/no-build-upon-unknown-block

49bbf49

Merge branch 'master' into s0me0ne/no-build-upon-unknown-block

19c1113

s0me0ne-unkn0wn added this pull request to the merge queue Apr 10, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 10, 2024

s0me0ne-unkn0wn added this pull request to the merge queue Apr 10, 2024

github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Apr 10, 2024

s0me0ne-unkn0wn added this pull request to the merge queue Apr 10, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 10, 2024

Merge branch 'master' into s0me0ne/no-build-upon-unknown-block

36b9ea4

s0me0ne-unkn0wn enabled auto-merge April 11, 2024 07:42

s0me0ne-unkn0wn mentioned this pull request Apr 11, 2024

Enable mainnet system parachains to use async backing-enabled collator #3630

Merged

s0me0ne-unkn0wn disabled auto-merge April 11, 2024 13:18

s0me0ne-unkn0wn closed this Apr 11, 2024

s0me0ne-unkn0wn deleted the s0me0ne/no-build-upon-unknown-block branch April 11, 2024 13:32
