Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Make slashing protection import more resilient #2598

Closed

Conversation

michaelsproul
Copy link
Member

Issue Addressed

Closes #2419

Proposed Changes

Address a long-standing issue with the import of slashing protection data where the import would fail due to the data appearing slashable w.r.t the existing database. Importing is now idempotent, and will have no issues importing data that has been handed back and forth between different validator clients, or different implementations.

The implementation works by updating the high and low watermarks if they need updating, and not attempting to check if the input is slashable w.r.t itself or the database. This is a strengthening of the minification that we started to do by default since #2380, and what Teku has been doing since the beginning.

Additional Info

The only feature we lose by doing this is the ability to do non-minified imports of clock drifted messages (cf. Prysm on Medalla). In theory, with the previous implementation we could import all the messages in case of clock drift and be aware of the "gap" between the real present time and the messages signed in the far future. However for attestations this is close to useless, as the source epoch will advance as soon as justification occurs, which will require us to make slashable attestations with respect to our bogus attestation(s). E.g. if I sign an attestation 100=>200 when the current epoch is 101, then I won't be able to vote in any epochs prior to 101 becoming justified because 101=>102, 101=>103, etc are all surrounded by 100=>200. Seeing as signing attestations gets blocked almost immediately in this case regardless of our import behaviour, there's no point trying to handle it. For blocks the situation is more hopeful due to the lack of surrounds, but losing block proposals from validators who by definition can't attest doesn't seem like an issue (the other block proposers can pick up the slack).

@michaelsproul michaelsproul added ready-for-review The code is ready for review v2.0.0 Altair on mainnet release (v2.0.0) labels Sep 15, 2021
Copy link
Contributor

@winksaville winksaville left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment and a possibly incorrect observation:

  • I don't see new tests that would provide confidence that the new code is exercised/tested.

Also, I'm only providing "Comments" as I'm not qualified to Approve or Request changes :)

account_manager/src/validator/slashing_protection.rs Outdated Show resolved Hide resolved
@michaelsproul michaelsproul added work-in-progress PR is a work-in-progress and removed ready-for-review The code is ready for review labels Sep 16, 2021
@michaelsproul
Copy link
Member Author

I don't see new tests that would provide confidence that the new code is exercised/tested.

The existing test generator covers a lot of ground, particularly with allow_import_failure changed to false (meaning we have to successfully import every test case), but it never hurts to add more, so I'm working on that now.

@winksaville
Copy link
Contributor

I don't see new tests that would provide confidence that the new code is exercised/tested.

The existing test generator covers a lot of ground, particularly with allow_import_failure changed to false (meaning we have to successfully import every test case), but it never hurts to add more, so I'm working on that now.

I would like to see my scenario tested. I'll try to look at test code and see if it's covered. If not I'll try to create a test. Wish me luck, I'll need it :)

@michaelsproul
Copy link
Member Author

I would like to see my scenario tested.

I've added a test multiple_interchanges_overlapping_validators_merge_stale which matches the issue you ran into quite closely. There are three validators 0, 1, 2 and validators 0 and 1 have up-to-date information on the node doing the import (the first interchange file contains attestations with source=12, target=13 for both). Validator 2 however has stale info (attestation with source=4, target=5). We then import a file that has up to date information for validator 2, but stale information for validators 0 and 1. We then attempt to sign some messages from all the validators (with_blocks, with_attestations) to check that their knowledge of the world has been updated correctly. The allow_import_failure=false change that I mentioned before also means that the test would fail if the 2nd file did not import.

MultiTestCase::new(
"multiple_interchanges_overlapping_validators_merge_stale",
vec![
TestCase::new(interchange(vec![
(0, vec![100], vec![(12, 13)]),
(1, vec![101], vec![(12, 13)]),
(2, vec![4], vec![(4, 5)]),
])),
TestCase::new(interchange(vec![
(0, vec![2], vec![(4, 5)]),
(1, vec![3], vec![(3, 4)]),
(2, vec![102], vec![(12, 13)]),
]))
.contains_slashable_data()
.with_blocks(vec![
(0, 100, false),
(1, 101, false),
(2, 102, false),
(0, 103, true),
(1, 104, true),
(2, 105, true),
])
.with_attestations(vec![
(0, 12, 13, false),
(0, 11, 14, false),
(1, 12, 13, false),
(1, 11, 14, false),
(2, 12, 13, false),
(2, 11, 14, false),
(0, 12, 14, true),
(1, 13, 14, true),
(2, 13, 14, true),
]),
],
),

These tests get published from Lighthouse to a repo used by all clients as test vectors for EIP-3076: https://github.com/eth2-clients/slashing-protection-interchange-tests. I'll do a new release of the tests once this branch is merged.

@michaelsproul michaelsproul added ready-for-review The code is ready for review and removed work-in-progress PR is a work-in-progress labels Sep 17, 2021
@michaelsproul michaelsproul removed the v2.0.0 Altair on mainnet release (v2.0.0) label Sep 27, 2021
Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one! Reads really well.

I had a very minor comment and two points:

  • It seems slightly odd to me that even though we declare minification redundant and deprecated, we'll still perform it if the flag is supplied. Perhaps this is just a defensive measure?
  • There's some duplication of minification code, but it doesn't seem outrageous to any degree. I had a look at how to unify it, and it might end up making things ultimately more complex. I just thought I'd mention it in case you had any ideas.

I'll give this an approval, since I think it's personally fine to merge as-is. Feel free to enact/ignore my comments as you see fit :)

// Summary of minimum and maximum messages pre-import.
let prev_summary = self.validator_summary(pubkey, txn)?;

// If the interchange contains a new maximum slot block, import it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very minor, but we don't necessarily import this as per the comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the relevance check? I think the "new maximum slot block" phrasing implies we only import the block if its slot is greater than all known blocks, which is checked below where it says

Block is relevant if there are no previous blocks, or new block has slot greater than
previous maximum.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to suggestions regarding wording, however

@paulhauner paulhauner added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Oct 13, 2021
@michaelsproul
Copy link
Member Author

Thank you for the review! 🙏

There's some duplication of minification code, but it doesn't seem outrageous to any degree. I had a look at how to unify it, and it might end up making things ultimately more complex. I just thought I'd mention it in case you had any ideas.

I kept it intentionally separate so that the two implementations could be tested against each other, which was quite useful.

It seems slightly odd to me that even though we declare minification redundant and deprecated, we'll still perform it if the flag is supplied.

Minifying and importing should be harmless, it just wastes time in the worst case. I figured we may as well keep the functionality for flexibility. And you're right, defensively, --minify=true may be useful in some obscure case, and it's nice to have it up our sleeve seeing as all the infra already exists.

@michaelsproul michaelsproul added ready-for-merge This PR is ready to merge. and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Oct 13, 2021
@michaelsproul
Copy link
Member Author

bors r+

bors bot pushed a commit that referenced this pull request Oct 13, 2021
## Issue Addressed

Closes #2419

## Proposed Changes

Address a long-standing issue with the import of slashing protection data where the import would fail due to the data appearing slashable w.r.t the existing database. Importing is now idempotent, and will have no issues importing data that has been handed back and forth between different validator clients, or different implementations.

The implementation works by updating the high and low watermarks if they need updating, and not attempting to check if the input is slashable w.r.t itself or the database. This is a strengthening of the minification that we started to do by default since #2380, and what Teku has been doing since the beginning.

## Additional Info

The only feature we lose by doing this is the ability to do non-minified imports of clock drifted messages (cf. Prysm on Medalla). In theory, with the previous implementation we could import all the messages in case of clock drift and be aware of the "gap" between the real present time and the messages signed in the far future. _However_ for attestations this is close to useless, as the source epoch will advance as soon as justification occurs, which will require us to make slashable attestations with respect to our bogus attestation(s). E.g. if I sign an attestation 100=>200 when the current epoch is 101, then I won't be able to vote in any epochs prior to 101 becoming justified because 101=>102, 101=>103, etc are all surrounded by 100=>200. Seeing as signing attestations gets blocked almost immediately in this case regardless of our import behaviour, there's no point trying to handle it. For blocks the situation is more hopeful due to the lack of surrounds, but losing block proposals from validators who by definition can't attest doesn't seem like an issue (the other block proposers can pick up the slack).
@bors
Copy link

bors bot commented Oct 13, 2021

Build failed:

@michaelsproul
Copy link
Member Author

Looks like a spurious ganache failure

bors retry

bors bot pushed a commit that referenced this pull request Oct 13, 2021
## Issue Addressed

Closes #2419

## Proposed Changes

Address a long-standing issue with the import of slashing protection data where the import would fail due to the data appearing slashable w.r.t the existing database. Importing is now idempotent, and will have no issues importing data that has been handed back and forth between different validator clients, or different implementations.

The implementation works by updating the high and low watermarks if they need updating, and not attempting to check if the input is slashable w.r.t itself or the database. This is a strengthening of the minification that we started to do by default since #2380, and what Teku has been doing since the beginning.

## Additional Info

The only feature we lose by doing this is the ability to do non-minified imports of clock drifted messages (cf. Prysm on Medalla). In theory, with the previous implementation we could import all the messages in case of clock drift and be aware of the "gap" between the real present time and the messages signed in the far future. _However_ for attestations this is close to useless, as the source epoch will advance as soon as justification occurs, which will require us to make slashable attestations with respect to our bogus attestation(s). E.g. if I sign an attestation 100=>200 when the current epoch is 101, then I won't be able to vote in any epochs prior to 101 becoming justified because 101=>102, 101=>103, etc are all surrounded by 100=>200. Seeing as signing attestations gets blocked almost immediately in this case regardless of our import behaviour, there's no point trying to handle it. For blocks the situation is more hopeful due to the lack of surrounds, but losing block proposals from validators who by definition can't attest doesn't seem like an issue (the other block proposers can pick up the slack).
@bors
Copy link

bors bot commented Oct 13, 2021

Pull request successfully merged into unstable.

Build succeeded:

@bors bors bot changed the title Make slashing protection import more resilient [Merged by Bors] - Make slashing protection import more resilient Oct 13, 2021
@bors bors bot closed this Oct 13, 2021
@michaelsproul michaelsproul deleted the slashing-protection-import branch October 13, 2021 03:24
bors bot pushed a commit that referenced this pull request Oct 19, 2021
## Issue Addressed

Part of #2557

## Proposed Changes

Refactor the slashing protection export so that it can export data for a subset of validators.

This is the last remaining building block required for supporting the standard validator API (which I'll start to build atop this branch)

## Additional Info

Built on and requires #2598
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-for-merge This PR is ready to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants