Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Consensus context with proposer index caching #3604

Closed
wants to merge 7 commits into from

Conversation

michaelsproul
Copy link
Member

Issue Addressed

Closes #2371

Proposed Changes

Backport some changes from tree-states that remove duplicated calculations of the proposer_index.

With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required.

Additional Info

In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards.

There are some other changes to remove indexing in tests that were also useful for tree-states (the tree-states types don't implement Index).

@michaelsproul michaelsproul added ready-for-review The code is ready for review optimization Something to make Lighthouse run more efficiently. labels Sep 23, 2022
@michaelsproul
Copy link
Member Author

This is complementary to #3263, but would likely conflict slightly syntactically.

I'll also post some benchmark numbers soon (TM)

@michaelsproul
Copy link
Member Author

michaelsproul commented Sep 24, 2022

Benchmarks indicate a 7.5-20% reduction in run time for block processing for a recent block (slot 4769389).

The reduction was greatest on my Apple M1 Pro, from a median time of 38.2ms to 30.9ms. On an AMD Ryzen 5950X the reduction was from 40.8ms to 34.7ms.

Full details in this spreadsheet: https://docs.google.com/spreadsheets/d/1axT3Tda3wzWXdUzPEt5rAmxWE7PdIkSta7sW5PHfdq0/edit

@michaelsproul michaelsproul added the v3.2.0 Minor release following v3.1.2 label Sep 26, 2022
Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, I like the ConsensusContext! I notice you didn't go with ctx 😉 I have a couple of comments, but they're not strictly necessary!

I had a lil' grep and noticed two places where we're still computing the proposer index, perhaps you're aware of them:

  • consensus/state_processing/src/common/slash_validator.rs: let proposer_index = state.get_beacon_proposer_index(state.slot(), spec)?;
  • consensus/state_processing/src/per_block_processing/process_operations.rs: let proposer_index = state.get_beacon_proposer_index(state.slot(), spec)? as u64;
    • This is phase0, so there are marginal returns from this. Perhaps useful during sync, for what that's worth in a WSS world. I could take or leave this.

Additionally, I believe this call is totally redundant and could be removed (it could also happen in another PR, potential scope creep):

let proposer_index = state.get_beacon_proposer_index(state.slot(), &self.spec)? as u64;

@paulhauner paulhauner added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Sep 28, 2022
bors bot pushed a commit that referenced this pull request Sep 28, 2022
## Proposed Changes

Add a new Cargo compilation profile called `maxperf` which enables more aggressive compiler optimisations at the expense of compilation time.

Some rough initial benchmarks show that this can provide up to a 25% reduction to run time for CPU bound tasks like block processing: https://docs.google.com/spreadsheets/d/15jHuZe7lLHhZq9Nw8kc6EL0Qh_N_YAYqkW2NQ_Afmtk/edit

The numbers in that spreadsheet compare the `consensus-context` branch from #3604 to the same branch compiled with the `maxperf` profile using:

```
PROFILE=maxperf make install-lcli
```

## Additional Info

The downsides of the maxperf profile are:

- It increases compile times substantially, which will particularly impact low-spec hardware. Compiling `lcli` is about 3x slower. Compiling Lighthouse is about 5x slower on my 5950X: 17m 38s rather than 3m 28s.

As a result I think we should not enable this everywhere by default.

- **Option 1**: enable by default for our released binaries. This gives the majority of users the fastest version of `lighthouse` possible, at the expense of slowing down our release CI. Source builds will continue to use the default `release` profile unless users opt-in to `maxperf`.
- **Option 2**: enable by default for source builds. This gives users building from source an edge, but makes them pay for it with compilation time. 

I think I would prefer Option 1. I'll try doing some benchmarking to see how long a maxperf build of Lighthouse would take on GitHub actions.

Credit to Nicholas Nethercote for documenting these options in the Rust Performance Book: https://nnethercote.github.io/perf-book/build-configuration.html.
@michaelsproul
Copy link
Member Author

Thanks for the review Paul!

I've addressed all your comments in these two commits:

  • Removing the re-calc in block production: e1e1e89
  • Removing the re-calc in block processing: bb7e88e

Figured we may as well try to be fast for block replay and in case of mass slashings! Nice catch

@michaelsproul michaelsproul added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Sep 29, 2022
bors bot pushed a commit that referenced this pull request Sep 29, 2022
## Proposed Changes

Add a new Cargo compilation profile called `maxperf` which enables more aggressive compiler optimisations at the expense of compilation time.

Some rough initial benchmarks show that this can provide up to a 25% reduction to run time for CPU bound tasks like block processing: https://docs.google.com/spreadsheets/d/15jHuZe7lLHhZq9Nw8kc6EL0Qh_N_YAYqkW2NQ_Afmtk/edit

The numbers in that spreadsheet compare the `consensus-context` branch from #3604 to the same branch compiled with the `maxperf` profile using:

```
PROFILE=maxperf make install-lcli
```

## Additional Info

The downsides of the maxperf profile are:

- It increases compile times substantially, which will particularly impact low-spec hardware. Compiling `lcli` is about 3x slower. Compiling Lighthouse is about 5x slower on my 5950X: 17m 38s rather than 3m 28s.

As a result I think we should not enable this everywhere by default.

- **Option 1**: enable by default for our released binaries. This gives the majority of users the fastest version of `lighthouse` possible, at the expense of slowing down our release CI. Source builds will continue to use the default `release` profile unless users opt-in to `maxperf`.
- **Option 2**: enable by default for source builds. This gives users building from source an edge, but makes them pay for it with compilation time. 

I think I would prefer Option 1. I'll try doing some benchmarking to see how long a maxperf build of Lighthouse would take on GitHub actions.

Credit to Nicholas Nethercote for documenting these options in the Rust Performance Book: https://nnethercote.github.io/perf-book/build-configuration.html.
bors bot pushed a commit that referenced this pull request Sep 29, 2022
## Proposed Changes

Add a new Cargo compilation profile called `maxperf` which enables more aggressive compiler optimisations at the expense of compilation time.

Some rough initial benchmarks show that this can provide up to a 25% reduction to run time for CPU bound tasks like block processing: https://docs.google.com/spreadsheets/d/15jHuZe7lLHhZq9Nw8kc6EL0Qh_N_YAYqkW2NQ_Afmtk/edit

The numbers in that spreadsheet compare the `consensus-context` branch from #3604 to the same branch compiled with the `maxperf` profile using:

```
PROFILE=maxperf make install-lcli
```

## Additional Info

The downsides of the maxperf profile are:

- It increases compile times substantially, which will particularly impact low-spec hardware. Compiling `lcli` is about 3x slower. Compiling Lighthouse is about 5x slower on my 5950X: 17m 38s rather than 3m 28s.

As a result I think we should not enable this everywhere by default.

- **Option 1**: enable by default for our released binaries. This gives the majority of users the fastest version of `lighthouse` possible, at the expense of slowing down our release CI. Source builds will continue to use the default `release` profile unless users opt-in to `maxperf`.
- **Option 2**: enable by default for source builds. This gives users building from source an edge, but makes them pay for it with compilation time. 

I think I would prefer Option 1. I'll try doing some benchmarking to see how long a maxperf build of Lighthouse would take on GitHub actions.

Credit to Nicholas Nethercote for documenting these options in the Rust Performance Book: https://nnethercote.github.io/perf-book/build-configuration.html.
bors bot pushed a commit that referenced this pull request Sep 29, 2022
## Proposed Changes

Add a new Cargo compilation profile called `maxperf` which enables more aggressive compiler optimisations at the expense of compilation time.

Some rough initial benchmarks show that this can provide up to a 25% reduction to run time for CPU bound tasks like block processing: https://docs.google.com/spreadsheets/d/15jHuZe7lLHhZq9Nw8kc6EL0Qh_N_YAYqkW2NQ_Afmtk/edit

The numbers in that spreadsheet compare the `consensus-context` branch from #3604 to the same branch compiled with the `maxperf` profile using:

```
PROFILE=maxperf make install-lcli
```

## Additional Info

The downsides of the maxperf profile are:

- It increases compile times substantially, which will particularly impact low-spec hardware. Compiling `lcli` is about 3x slower. Compiling Lighthouse is about 5x slower on my 5950X: 17m 38s rather than 3m 28s.

As a result I think we should not enable this everywhere by default.

- **Option 1**: enable by default for our released binaries. This gives the majority of users the fastest version of `lighthouse` possible, at the expense of slowing down our release CI. Source builds will continue to use the default `release` profile unless users opt-in to `maxperf`.
- **Option 2**: enable by default for source builds. This gives users building from source an edge, but makes them pay for it with compilation time. 

I think I would prefer Option 1. I'll try doing some benchmarking to see how long a maxperf build of Lighthouse would take on GitHub actions.

Credit to Nicholas Nethercote for documenting these options in the Rust Performance Book: https://nnethercote.github.io/perf-book/build-configuration.html.
bors bot pushed a commit that referenced this pull request Sep 29, 2022
## Proposed Changes

Add a new Cargo compilation profile called `maxperf` which enables more aggressive compiler optimisations at the expense of compilation time.

Some rough initial benchmarks show that this can provide up to a 25% reduction to run time for CPU bound tasks like block processing: https://docs.google.com/spreadsheets/d/15jHuZe7lLHhZq9Nw8kc6EL0Qh_N_YAYqkW2NQ_Afmtk/edit

The numbers in that spreadsheet compare the `consensus-context` branch from #3604 to the same branch compiled with the `maxperf` profile using:

```
PROFILE=maxperf make install-lcli
```

## Additional Info

The downsides of the maxperf profile are:

- It increases compile times substantially, which will particularly impact low-spec hardware. Compiling `lcli` is about 3x slower. Compiling Lighthouse is about 5x slower on my 5950X: 17m 38s rather than 3m 28s.

As a result I think we should not enable this everywhere by default.

- **Option 1**: enable by default for our released binaries. This gives the majority of users the fastest version of `lighthouse` possible, at the expense of slowing down our release CI. Source builds will continue to use the default `release` profile unless users opt-in to `maxperf`.
- **Option 2**: enable by default for source builds. This gives users building from source an edge, but makes them pay for it with compilation time. 

I think I would prefer Option 1. I'll try doing some benchmarking to see how long a maxperf build of Lighthouse would take on GitHub actions.

Credit to Nicholas Nethercote for documenting these options in the Rust Performance Book: https://nnethercote.github.io/perf-book/build-configuration.html.
@paulhauner paulhauner self-assigned this Oct 4, 2022
@michaelsproul
Copy link
Member Author

Pushed one (1) more sneaky commit with an optimisation for block replay: 26b819f. I was going to put it in a follow-up but figured I'd streamline things by including it here.

Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, sorry for the review delay!

@paulhauner paulhauner removed the ready-for-review The code is ready for review label Oct 15, 2022
@paulhauner paulhauner added ready-for-review The code is ready for review ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Oct 15, 2022
@michaelsproul
Copy link
Member Author

Thank you for the review!

bors r+

bors bot pushed a commit that referenced this pull request Oct 15, 2022
## Issue Addressed

Closes #2371

## Proposed Changes

Backport some changes from `tree-states` that remove duplicated calculations of the `proposer_index`.

With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required.

## Additional Info

In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards.

There are some other changes to remove indexing in tests that were also useful for `tree-states` (the `tree-states` types don't implement `Index`).
@bors bors bot changed the title Consensus context with proposer index caching [Merged by Bors] - Consensus context with proposer index caching Oct 16, 2022
@bors bors bot closed this Oct 16, 2022
@michaelsproul michaelsproul deleted the consensus-context branch October 16, 2022 01:05
macladson pushed a commit to macladson/lighthouse that referenced this pull request Jan 5, 2023
## Issue Addressed

Closes sigp#2371

## Proposed Changes

Backport some changes from `tree-states` that remove duplicated calculations of the `proposer_index`.

With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required.

## Additional Info

In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards.

There are some other changes to remove indexing in tests that were also useful for `tree-states` (the `tree-states` types don't implement `Index`).
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
## Proposed Changes

Add a new Cargo compilation profile called `maxperf` which enables more aggressive compiler optimisations at the expense of compilation time.

Some rough initial benchmarks show that this can provide up to a 25% reduction to run time for CPU bound tasks like block processing: https://docs.google.com/spreadsheets/d/15jHuZe7lLHhZq9Nw8kc6EL0Qh_N_YAYqkW2NQ_Afmtk/edit

The numbers in that spreadsheet compare the `consensus-context` branch from sigp#3604 to the same branch compiled with the `maxperf` profile using:

```
PROFILE=maxperf make install-lcli
```

## Additional Info

The downsides of the maxperf profile are:

- It increases compile times substantially, which will particularly impact low-spec hardware. Compiling `lcli` is about 3x slower. Compiling Lighthouse is about 5x slower on my 5950X: 17m 38s rather than 3m 28s.

As a result I think we should not enable this everywhere by default.

- **Option 1**: enable by default for our released binaries. This gives the majority of users the fastest version of `lighthouse` possible, at the expense of slowing down our release CI. Source builds will continue to use the default `release` profile unless users opt-in to `maxperf`.
- **Option 2**: enable by default for source builds. This gives users building from source an edge, but makes them pay for it with compilation time. 

I think I would prefer Option 1. I'll try doing some benchmarking to see how long a maxperf build of Lighthouse would take on GitHub actions.

Credit to Nicholas Nethercote for documenting these options in the Rust Performance Book: https://nnethercote.github.io/perf-book/build-configuration.html.
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
Closes sigp#2371

Backport some changes from `tree-states` that remove duplicated calculations of the `proposer_index`.

With this change the proposer index should be calculated only once for each block, and then plumbed through to every place it is required.

In future I hope to add more data to the consensus context that is cached on a per-epoch basis, like the effective balances of validators and the base rewards.

There are some other changes to remove indexing in tests that were also useful for `tree-states` (the `tree-states` types don't implement `Index`).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimization Something to make Lighthouse run more efficiently. ready-for-merge This PR is ready to merge. v3.2.0 Minor release following v3.1.2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants