Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - v1.1.6 Fork Choice changes #2822

Closed

Conversation

realbigsean
Copy link
Member

@realbigsean realbigsean commented Nov 18, 2021

Issue Addressed

Resolves: #2741
Includes: #2853 so that we can get ssz static tests passing here on v1.1.6. If we want to merge that first, we can make this diff slightly smaller

Proposed Changes

  • Changes the justified_epoch and finalized_epoch in the ProtoArrayNode each to an Option<Checkpoint>. The Option is necessary only for the migration, so not ideal. But does allow us to add a default logic to None on these fields during the database migration.
  • Adds a database migration from a legacy fork choice struct to the new one, search for all necessary block roots in fork choice by iterating through blocks in the db.
  • updates related to always atomically update justified and finalized ethereum/consensus-specs#2727
    • We will have to update the persisted forkchoice to make sure the justified checkpoint stored is correct according to the updated fork choice logic. This boils down to setting the forkchoice store's justified checkpoint to the justified checkpoint of the block that advanced the finalized checkpoint to the current one.
    • AFAICT there's no migration steps necessary for the update to allow applying attestations from prior blocks, but would appreciate confirmation on that
  • I updated the consensus spec tests to v1.1.6 here, but they will fail until we also implement the proposer score boost updates. I confirmed that the previously failing scenario new_finalized_slot_is_justified_checkpoint_ancestor will now pass after the boost updates, but haven't confirmed all tests will pass because I just quickly stubbed out the proposer boost test scenario formatting.
  • This PR now also includes proposer boosting Proposer LMD Score Boosting ethereum/consensus-specs#2730

Additional Info

I realized checking justified and finalized roots in fork choice makes it more likely that we trigger this bug: ethereum/consensus-specs#2727

It's possible the combination of justified checkpoint and finalized checkpoint in the forkchoice store is different from in any block in fork choice. So when trying to startup our store's justified checkpoint seems invalid to the rest of fork choice (but it should be valid). When this happens we get an InvalidBestNode error and fail to start up. So I'm including that bugfix in this branch.

Todo:

@realbigsean realbigsean added the work-in-progress PR is a work-in-progress label Nov 18, 2021
@realbigsean realbigsean force-pushed the justified-checkpoint-root branch from 1707807 to 227462d Compare November 29, 2021 18:33
@realbigsean realbigsean changed the base branch from unstable to kintsugi November 29, 2021 18:37
@realbigsean realbigsean added ready-for-review The code is ready for review and removed work-in-progress PR is a work-in-progress labels Nov 29, 2021
@realbigsean
Copy link
Member Author

Still failing an EF test. This implementation differs from the spec a bit in how this issue is handled: ethereum/consensus-specs#2757

So marking this as blocked for now

@realbigsean realbigsean added blocked and removed ready-for-review The code is ready for review labels Dec 1, 2021
@realbigsean realbigsean changed the title Justified checkpoint root v1.1.6 Spec changes Dec 1, 2021
@paulhauner paulhauner deleted the branch sigp:unstable December 2, 2021 05:51
@paulhauner paulhauner closed this Dec 2, 2021
@paulhauner
Copy link
Member

Github closed this automatically, not me. Sorry!

@paulhauner paulhauner reopened this Dec 2, 2021
@paulhauner paulhauner changed the base branch from kintsugi to unstable December 2, 2021 05:58
@michaelsproul michaelsproul added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-merge This PR is ready to merge. labels Dec 13, 2021
@realbigsean
Copy link
Member Author

realbigsean commented Dec 13, 2021

I really liked the used of superstruct and derivative to make things cleaner.

Props to @michaelsproul for this!

realbigsean and others added 6 commits December 13, 2021 10:54
@realbigsean
Copy link
Member Author

Ok all comments addressed! Diff here: https://github.com/sigp/lighthouse/compare/e266fe1539760faaaefbdcb8ddcd4f392207ec77..9d4692f6c

@realbigsean realbigsean added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Dec 13, 2021
Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge time!

bors r+

@michaelsproul michaelsproul added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Dec 13, 2021
bors bot pushed a commit that referenced this pull request Dec 13, 2021
## Issue Addressed

Resolves: #2741
Includes: #2853 so that we can get ssz static tests passing here on v1.1.6. If we want to merge that first, we can make this diff slightly smaller

## Proposed Changes

- Changes the `justified_epoch` and `finalized_epoch` in the `ProtoArrayNode` each to an `Option<Checkpoint>`. The `Option` is necessary only for the migration, so not ideal. But does allow us to add a default logic to `None` on these fields during the database migration.
- Adds a database migration from a legacy fork choice struct to the new one, search for all necessary block roots in fork choice by iterating through blocks in the db.
- updates related to ethereum/consensus-specs#2727
  -  We will have to update the persisted forkchoice to make sure the justified checkpoint stored is correct according to the updated fork choice logic. This boils down to setting the forkchoice store's justified checkpoint to the justified checkpoint of the block that advanced the finalized checkpoint to the current one. 
  - AFAICT there's no migration steps necessary for the update to allow applying attestations from prior blocks, but would appreciate confirmation on that
- I updated the consensus spec tests to v1.1.6 here, but they will fail until we also implement the proposer score boost updates. I confirmed that the previously failing scenario `new_finalized_slot_is_justified_checkpoint_ancestor` will now pass after the boost updates, but haven't confirmed _all_ tests will pass because I just quickly stubbed out the proposer boost test scenario formatting.
- This PR now also includes proposer boosting ethereum/consensus-specs#2730

## Additional Info
I realized checking justified and finalized roots in fork choice makes it more likely that we trigger this bug: ethereum/consensus-specs#2727

It's possible the combination of justified checkpoint and finalized checkpoint in the forkchoice store is different from in any block in fork choice. So when trying to startup our store's justified checkpoint seems invalid to the rest of fork choice (but it should be valid). When this happens we get an `InvalidBestNode` error and fail to start up. So I'm including that bugfix in this branch.

Todo:

- [x] Fix fork choice tests
- [x] Self review
- [x] Add fix for ethereum/consensus-specs#2727
- [x] Rebase onto Kintusgi 
- [x] Fix `num_active_validators` calculation as @michaelsproul pointed out
- [x] Clean up db migrations

Co-authored-by: realbigsean <[email protected]>
Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@bors
Copy link

bors bot commented Dec 13, 2021

@bors bors bot changed the title v1.1.6 Fork Choice changes [Merged by Bors] - v1.1.6 Fork Choice changes Dec 13, 2021
@bors bors bot closed this Dec 13, 2021
bors bot pushed a commit that referenced this pull request Dec 13, 2022
## Proposed Changes

With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks.

This PR adds three flags to the BN to control this behaviour:

* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.

For safety Lighthouse will only attempt a re-org under very specific conditions:

1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.


## Additional Info

For the initial idea and background, see: ethereum/consensus-specs#2353 (comment)

There is also a specification for this feature here: ethereum/consensus-specs#3034

Co-authored-by: Michael Sproul <[email protected]>
Co-authored-by: pawan <[email protected]>
bors bot pushed a commit that referenced this pull request Dec 13, 2022
## Proposed Changes

With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks.

This PR adds three flags to the BN to control this behaviour:

* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.

For safety Lighthouse will only attempt a re-org under very specific conditions:

1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.


## Additional Info

For the initial idea and background, see: ethereum/consensus-specs#2353 (comment)

There is also a specification for this feature here: ethereum/consensus-specs#3034

Co-authored-by: Michael Sproul <[email protected]>
Co-authored-by: pawan <[email protected]>
bors bot pushed a commit that referenced this pull request Dec 13, 2022
## Proposed Changes

With proposer boosting implemented (#2822) we have an opportunity to re-org out late blocks.

This PR adds three flags to the BN to control this behaviour:

* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.

For safety Lighthouse will only attempt a re-org under very specific conditions:

1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.


## Additional Info

For the initial idea and background, see: ethereum/consensus-specs#2353 (comment)

There is also a specification for this feature here: ethereum/consensus-specs#3034

Co-authored-by: Michael Sproul <[email protected]>
Co-authored-by: pawan <[email protected]>
macladson pushed a commit to macladson/lighthouse that referenced this pull request Jan 5, 2023
## Proposed Changes

With proposer boosting implemented (sigp#2822) we have an opportunity to re-org out late blocks.

This PR adds three flags to the BN to control this behaviour:

* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.

For safety Lighthouse will only attempt a re-org under very specific conditions:

1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.


## Additional Info

For the initial idea and background, see: ethereum/consensus-specs#2353 (comment)

There is also a specification for this feature here: ethereum/consensus-specs#3034

Co-authored-by: Michael Sproul <[email protected]>
Co-authored-by: pawan <[email protected]>
bors bot pushed a commit that referenced this pull request Mar 21, 2023
## Issue Addressed

NA

## Proposed Changes

- Implements ethereum/consensus-specs#3290
- Bumps `ef-tests` to [v1.3.0-rc.4](https://github.com/ethereum/consensus-spec-tests/releases/tag/v1.3.0-rc.4).

The `CountRealizedFull` concept has been removed and the `--count-unrealized-full` and `--count-unrealized` BN flags now do nothing but log a `WARN` when used.

## Database Migration Debt

This PR removes the `best_justified_checkpoint` from fork choice. This field is persisted on-disk and the correct way to go about this would be to make a DB migration to remove the field. However, in this PR I've simply stubbed out the value with a junk value. I've taken this approach because if we're going to do a DB migration I'd love to remove the `Option`s around the justified and finalized checkpoints on `ProtoNode` whilst we're at it. Those options were added in #2822 which was included in Lighthouse v2.1.0. The options were only put there to handle the migration and they've been set to `Some` ever since v2.1.0. There's no reason to keep them as options anymore.

I started adding the DB migration to this branch but I started to feel like I was bloating this rather critical PR with nice-to-haves. I've kept the partially-complete migration [over in my repo](https://github.com/paulhauner/lighthouse/tree/fc-pr-18-migration) so we can pick it up after this PR is merged.
@realbigsean realbigsean deleted the justified-checkpoint-root branch November 21, 2023 16:15
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
With proposer boosting implemented (sigp#2822) we have an opportunity to re-org out late blocks.

This PR adds three flags to the BN to control this behaviour:

* `--disable-proposer-reorgs`: turn aggressive re-orging off (it's on by default).
* `--proposer-reorg-threshold N`: attempt to orphan blocks with less than N% of the committee vote. If this parameter isn't set then N defaults to 20% when the feature is enabled.
* `--proposer-reorg-epochs-since-finalization N`: only attempt to re-org late blocks when the number of epochs since finalization is less than or equal to N. The default is 2 epochs, meaning re-orgs will only be attempted when the chain is finalizing optimally.

For safety Lighthouse will only attempt a re-org under very specific conditions:

1. The block being proposed is 1 slot after the canonical head, and the canonical head is 1 slot after its parent. i.e. at slot `n + 1` rather than building on the block from slot `n` we build on the block from slot `n - 1`.
2. The current canonical head received less than N% of the committee vote. N should be set depending on the proposer boost fraction itself, the fraction of the network that is believed to be applying it, and the size of the largest entity that could be hoarding votes.
3. The current canonical head arrived after the attestation deadline from our perspective. This condition was only added to support suppression of forkchoiceUpdated messages, but makes intuitive sense.
4. The block is being proposed in the first 2 seconds of the slot. This gives it time to propagate and receive the proposer boost.

For the initial idea and background, see: ethereum/consensus-specs#2353 (comment)

There is also a specification for this feature here: ethereum/consensus-specs#3034

Co-authored-by: Michael Sproul <[email protected]>
Co-authored-by: pawan <[email protected]>
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
NA

- Implements ethereum/consensus-specs#3290
- Bumps `ef-tests` to [v1.3.0-rc.4](https://github.com/ethereum/consensus-spec-tests/releases/tag/v1.3.0-rc.4).

The `CountRealizedFull` concept has been removed and the `--count-unrealized-full` and `--count-unrealized` BN flags now do nothing but log a `WARN` when used.

This PR removes the `best_justified_checkpoint` from fork choice. This field is persisted on-disk and the correct way to go about this would be to make a DB migration to remove the field. However, in this PR I've simply stubbed out the value with a junk value. I've taken this approach because if we're going to do a DB migration I'd love to remove the `Option`s around the justified and finalized checkpoints on `ProtoNode` whilst we're at it. Those options were added in sigp#2822 which was included in Lighthouse v2.1.0. The options were only put there to handle the migration and they've been set to `Some` ever since v2.1.0. There's no reason to keep them as options anymore.

I started adding the DB migration to this branch but I started to feel like I was bloating this rather critical PR with nice-to-haves. I've kept the partially-complete migration [over in my repo](https://github.com/paulhauner/lighthouse/tree/fc-pr-18-migration) so we can pick it up after this PR is merged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-for-merge This PR is ready to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants