[Merged by Bors] - Consensus context with proposer index caching #3604

michaelsproul · 2022-09-23T08:36:33Z

Issue Addressed

Closes #2371

Proposed Changes

Backport some changes from tree-states that remove duplicated calculations of the proposer_index.

With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required.
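As a rough illustration of the "calculate once, plumb through" idea, here is a minimal sketch of a per-block context that caches the proposer index. It is not the code from this PR: `MockState`, `compute_proposer_index`, the field names and the error type are hypothetical stand-ins chosen to keep the example self-contained.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the beacon state; only what the sketch needs.
struct MockState {
    slot: u64,
    validator_count: u64,
    /// Counts how many times the (expensive) proposer computation runs.
    computations: Cell<u32>,
}

impl MockState {
    /// Placeholder for the real shuffling-based computation (assumption).
    fn compute_proposer_index(&self, slot: u64) -> Result<u64, String> {
        if self.validator_count == 0 {
            return Err("no validators".to_string());
        }
        self.computations.set(self.computations.get() + 1);
        Ok(slot % self.validator_count)
    }
}

/// Per-block context that caches values derived from the state.
struct ConsensusContext {
    slot: u64,
    proposer_index: Option<u64>,
}

impl ConsensusContext {
    fn new(slot: u64) -> Self {
        Self { slot, proposer_index: None }
    }

    /// Returns the cached proposer index, computing it on first use only.
    fn get_proposer_index(&mut self, state: &MockState) -> Result<u64, String> {
        // Guard against reusing a context built for a different slot.
        if self.slot != state.slot {
            return Err(format!(
                "slot mismatch: context {} vs state {}",
                self.slot, state.slot
            ));
        }
        if let Some(index) = self.proposer_index {
            return Ok(index);
        }
        let index = state.compute_proposer_index(self.slot)?;
        self.proposer_index = Some(index);
        Ok(index)
    }
}

fn main() {
    let state = MockState { slot: 10, validator_count: 4, computations: Cell::new(0) };
    let mut ctxt = ConsensusContext::new(state.slot);
    let first = ctxt.get_proposer_index(&state).unwrap();
    let second = ctxt.get_proposer_index(&state).unwrap();
    assert_eq!(first, second);
    println!("proposer {first}, computed {} time(s)", state.computations.get());
}
```

Every caller that receives `&mut ConsensusContext` shares the one cached value, so the underlying computation runs at most once per block in this sketch.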

Additional Info

In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards.

There are some other changes to remove indexing in tests that were also useful for tree-states (the tree-states types don't implement Index).
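To make the indexing point concrete, the sketch below uses a hypothetical `TreeList` type that exposes `get` but has no `Index` implementation, so test code has to use `get` instead of `list[i]`. The type and its fields are assumptions for illustration, not the actual `tree-states` types.

```rust
/// Hypothetical list type that offers `get` but does not implement
/// `std::ops::Index` (an assumption standing in for the `tree-states` types).
struct TreeList<T> {
    items: Vec<T>,
}

impl<T> TreeList<T> {
    /// Bounds-checked access; returns `None` when `i` is out of range.
    fn get(&self, i: usize) -> Option<&T> {
        self.items.get(i)
    }
}

fn main() {
    let balances = TreeList { items: vec![32u64, 31, 30] };
    // `balances[1]` would not compile here: `TreeList` has no `Index` impl.
    let second = balances.get(1).copied();
    assert_eq!(second, Some(31));
    assert_eq!(balances.get(3), None);
}
```

Code written against `get` compiles for both kinds of type, which is presumably why the same test changes were useful on both branches.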

michaelsproul · 2022-09-23T08:38:03Z

This is complementary to #3263, but would likely conflict slightly syntactically.

I'll also post some benchmark numbers soon (TM)

beacon_node/beacon_chain/src/beacon_chain.rs

michaelsproul · 2022-09-24T22:50:39Z

Benchmarks indicate a 7.5-20% reduction in run time for block processing for a recent block (slot 4769389).

The reduction was greatest on my Apple M1 Pro, from a median time of 38.2ms to 30.9ms. On an AMD Ryzen 5950X the reduction was from 40.8ms to 34.7ms.

Full details in this spreadsheet: https://docs.google.com/spreadsheets/d/1axT3Tda3wzWXdUzPEt5rAmxWE7PdIkSta7sW5PHfdq0/edit
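For reference, the relative reductions implied by the two medians quoted above can be checked with a few lines of arithmetic. This is only a sketch; the helper name `percent_reduction` is made up for illustration. It gives roughly 19% for the M1 Pro and roughly 15% for the 5950X.

```rust
/// Relative reduction in run time, as a percentage of the baseline.
fn percent_reduction(before_ms: f64, after_ms: f64) -> f64 {
    (before_ms - after_ms) / before_ms * 100.0
}

fn main() {
    // Median block processing times quoted in the comment above.
    let m1 = percent_reduction(38.2, 30.9);
    let ryzen = percent_reduction(40.8, 34.7);
    println!("Apple M1 Pro: {m1:.1}%");
    println!("AMD Ryzen 5950X: {ryzen:.1}%");
}
```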

paulhauner

Very nice, I like the ConsensusContext! I notice you didn't go with ctx 😉 I have a couple of comments, but they're not strictly necessary!

I had a lil' grep and noticed two places where we're still computing the proposer index, perhaps you're aware of them:

consensus/state_processing/src/common/slash_validator.rs: let proposer_index = state.get_beacon_proposer_index(state.slot(), spec)?;
consensus/state_processing/src/per_block_processing/process_operations.rs: let proposer_index = state.get_beacon_proposer_index(state.slot(), spec)? as u64;
- This is phase0, so there are marginal returns from this. Perhaps useful during sync, for what that's worth in a WSS world. I could take or leave this.

Additionally, I believe this call is totally redundant and could be removed (it could also happen in another PR, potential scope creep):

lighthouse/beacon_node/beacon_chain/src/beacon_chain.rs, line 3501 in 01e84b7:

    let proposer_index = state.get_beacon_proposer_index(state.slot(), &self.spec)? as u64;

## Proposed Changes

Add a new Cargo compilation profile called `maxperf` which enables more aggressive compiler optimisations at the expense of compilation time.

Some rough initial benchmarks show that this can provide up to a 25% reduction to run time for CPU bound tasks like block processing: https://docs.google.com/spreadsheets/d/15jHuZe7lLHhZq9Nw8kc6EL0Qh_N_YAYqkW2NQ_Afmtk/edit

The numbers in that spreadsheet compare the `consensus-context` branch from #3604 to the same branch compiled with the `maxperf` profile using:

```
PROFILE=maxperf make install-lcli
```

## Additional Info

The downsides of the maxperf profile are:

- It increases compile times substantially, which will particularly impact low-spec hardware. Compiling `lcli` is about 3x slower. Compiling Lighthouse is about 5x slower on my 5950X: 17m 38s rather than 3m 28s.

As a result I think we should not enable this everywhere by default.

- **Option 1**: enable by default for our released binaries. This gives the majority of users the fastest version of `lighthouse` possible, at the expense of slowing down our release CI. Source builds will continue to use the default `release` profile unless users opt-in to `maxperf`.
- **Option 2**: enable by default for source builds. This gives users building from source an edge, but makes them pay for it with compilation time.

I think I would prefer Option 1. I'll try doing some benchmarking to see how long a maxperf build of Lighthouse would take on GitHub actions.

Credit to Nicholas Nethercote for documenting these options in the Rust Performance Book: https://nnethercote.github.io/perf-book/build-configuration.html.

michaelsproul · 2022-09-29T00:09:26Z

Thanks for the review Paul!

I've addressed all your comments in these two commits:

Removing the re-calc in block production: e1e1e89
Removing the re-calc in block processing: bb7e88e

Figured we may as well try to be fast for block replay and in case of mass slashings! Nice catch

michaelsproul · 2022-10-04T19:31:55Z

Pushed one (1) more sneaky commit with an optimisation for block replay: 26b819f. I was going to put it in a follow-up but figured I'd streamline things by including it here.

paulhauner

Perfect, sorry for the review delay!

michaelsproul · 2022-10-15T22:25:33Z

Thank you for the review!

bors r+

## Issue Addressed

Closes #2371

## Proposed Changes

Backport some changes from `tree-states` that remove duplicated calculations of the `proposer_index`.

With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required.

## Additional Info

In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards.

There are some other changes to remove indexing in tests that were also useful for `tree-states` (the `tree-states` types don't implement `Index`).

bors · 2022-10-16T00:51:19Z

Pull request successfully merged into unstable.

Build succeeded:

michaelsproul added 2 commits September 23, 2022 18:11

ConsensusContext ported to unstable

d0a8431

More proposer index plumbing

ff6221c

michaelsproul added ready-for-review The code is ready for review optimization Something to make Lighthouse run more efficiently. labels Sep 23, 2022

Block production checks, tweak lcli

87fdd71

Fix some beacon chain tests

f7f55f9

michaelsproul mentioned this pull request Sep 26, 2022

[Merged by Bors] - Add maxperf build profile #3608

Closed

michaelsproul added the v3.2.0 Minor release following v3.1.2 label Sep 26, 2022

paulhauner approved these changes Sep 28, 2022

paulhauner added waiting-on-author The reviewer has suggested changes and awaits their implementation. and removed ready-for-review The code is ready for review labels Sep 28, 2022

Remove redundant proposer index calc in block production

e1e1e89

Remove remaining proposer index recalculations

bb7e88e

michaelsproul added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits their implementation. labels Sep 29, 2022

michaelsproul requested a review from paulhauner September 29, 2022 00:09

paulhauner self-assigned this Oct 4, 2022

Omit proposer recalculation from block replay

26b819f

paulhauner approved these changes Oct 15, 2022

paulhauner removed the ready-for-review The code is ready for review label Oct 15, 2022

paulhauner added ready-for-review The code is ready for review ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Oct 15, 2022

bors bot changed the title ~~Consensus context with proposer index caching~~ [Merged by Bors] - Consensus context with proposer index caching Oct 16, 2022

bors bot closed this Oct 16, 2022

michaelsproul deleted the consensus-context branch October 16, 2022 01:05
