Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix checkpoint block potentially not getting backfilled into DB #5863

Merged
merged 1 commit into from
Feb 9, 2024

Conversation

etan-status
Copy link
Contributor

When using checkpoint sync, only checkpoint state is available, block is not downloaded and backfilled later.

dag.backfill tracks latest filled slot, and latest parent_root for which no block has been synced yet.

In checkpoint sync, this assumption is broken, because there, the start dag.backfill.slot is set based on checkpoint state slot, and the block is also not available.

However, sync manager in backward mode also requests dag.backfill.slot and block_clearance then backfills the checkpoint block once it is synced. But, there is no guarantee that a peer ever sends us that block. They could send us all parent blocks and solely omit the checkpoint block itself. In that situation, we would accept the parent blocks and advance dag.backfill, and subsequently never request the checkpoint block again, resulting in gap inside blocks DB that is never filled.

To mitigate that, the assumption is restored that dag.backfill.slot is the latest filled slot, and dag.backfill.parent_root is the next block that needs to be synced. By setting slot to tail.slot + 1 and parent_root to tail.root, we put a fake summary into dag.backfill so that block_clearance only proceeds once checkpoint block exists.

When using checkpoint sync, only checkpoint state is available, block is
not downloaded and backfilled later.

`dag.backfill` tracks latest filled `slot`, and latest `parent_root` for
which no block has been synced yet.

In checkpoint sync, this assumption is broken, because there, the start
`dag.backfill.slot` is set based on checkpoint state slot, and the block
is also not available.

However, sync manager in backward mode also requests `dag.backfill.slot`
and `block_clearance` then backfills the checkpoint block once it is
synced. But, there is no guarantee that a peer ever sends us that block.
They could send us all parent blocks and solely omit the checkpoint
block itself. In that situation, we would accept the parent blocks and
advance `dag.backfill`, and subsequently never request the checkpoint
block again, resulting in gap inside blocks DB that is never filled.

To mitigate that, the assumption is restored that `dag.backfill.slot`
is the latest filled `slot`, and `dag.backfill.parent_root` is the next
block that needs to be synced. By setting `slot` to `tail.slot + 1` and
`parent_root` to `tail.root`, we put a fake summary into `dag.backfill`
so that `block_clearance` only proceeds once checkpoint block exists.
Copy link

github-actions bot commented Feb 7, 2024

Unit Test Results

         9 files  ±0    1 104 suites  ±0   27m 51s ⏱️ +6s
  4 230 tests +1    3 883 ✔️ +1  347 💤 ±0  0 ±0 
16 885 runs  +3  16 487 ✔️ +3  398 💤 ±0  0 ±0 

Results for commit a32af0c. ± Comparison against base commit 464ff68.

@arnetheduck arnetheduck merged commit 65e6f89 into unstable Feb 9, 2024
12 checks passed
@arnetheduck arnetheduck deleted the dev/etan/ts-backfill branch February 9, 2024 10:20
tersec added a commit that referenced this pull request Feb 9, 2024
tersec added a commit that referenced this pull request Feb 9, 2024
etan-status added a commit that referenced this pull request Feb 9, 2024
@etan-status
Copy link
Contributor Author

etan-status added a commit that referenced this pull request Feb 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants