Fix edge case in finalizer policy changes #292

arhag · 2024-06-14T20:37:28Z

Under the current rules for changing finalizer policies, there is an edge case which can lead to a finality violation despite none of the finalizers breaking the rules.

This edge case is partially enabled by a a block producer getting a weak_achieved QC and using it in the block claim, while another block producer that produces a competing fork makes a claim on the same block using a strong QC. (Note it is possible for both a weak_achieved and strong QC to exist for the same block without any finalizers misbehaving because it is possible for just one strong vote that was missing from the weak_achieved QC to add to it and make it into a strong QC.)

Enabling this edge case also requires a particular finalizer policy change to occur, one involving two finalizer policies to both be pending, around the same time as this fork. The current finalizer policy can be denoted A. The next two pending finalizer policies are denoted B and C. In the branch in which the strong QC claim was made, B is immediately promoted to the active finalizer policy, and then B votes for enough blocks (first weak, then strong) to advance finality to the first block of that fork past the fork point. At the same time, in the branch in which the weak QC claim was made, B is never promoted to the active finalizer policy. Instead a couple of blocks in which A is the finalizer policy are produced and voted on, which then allows the block after that to promote C to the active finalizer policy. Then C can vote on enough new blocks after that to advance finality to a block in that branch which has C as an active finalizer policy.

This creates a finality violation despite no finalizer breaking the rules (assuming, for example, the there is no intersection between the set of finalizers from any two of A, B, and C).

finality_violation_edge_case.pdf

Add a test case for the edge case above that would demonstrate the finality violation if it was not for the changes below.

To prevent a finality violation from occurring in this edge case (assuming the necessary number of finalizers do not misbehave), the rules for promoting a proposed finalizer policy to pending need to be adjusted.

When LIB changes, we still need to evaluate possible promotions from pending to active (removing any pending policies that are known at that time to never become active) and from proposed to pending (removing any proposed policies that are known at that time to never become pending), in that order. However, now we would not promote a proposed policy to pending if it would allow more than one pending policies to exist.

This means, on LIB change:

If there is a pending policy, determine whether it should be promoted to active. If the associated block number is less than or equal to the new LIB number, the pending policy should be promoted to active. This also means the pending slot is now open for a possible promotion of a proposed policy to pending. This is essentially the same behavior as before, except that we now have a guarantee that there will be at most one pending policy at any given time. That means there wouldn't be additional pending policies to garbage collection at the time of promotion. It also means that the state does not need to use a vector to store the pending policy.
If there is any proposed policy with an associated block number that is less than or equal to the new LIB number:
i. Find the proposed policy with the greatest associated block number that is still less than or equal to the new LIB number (call this the target proposed policy).
ii. Remove any proposed policies with an associated block number less than that of the target proposed policy.
iii. If there is no pending policy, promote that target proposed policy to pending. Otherwise, leave the target proposed policy (and any other proposed policies with greater associated block numbers) alone in the proposed policy queue.

With these changes, it is still possible for a proposed policy to never become active. However, a policy that becomes pending is guaranteed to eventually become active. Additionally, we can guaranteed that the policy will have at least 3 blocks in which it is the active finalizer policy. Also, there can be at most one pending finalizer policy at any given time.

This means that the pending policies should be taken out of the block_header_state::finalizer_policies flat_multimap and instead captured as a separate std::optional<std::pair<block_num_type, finalizer_policy_ptr>> pending_finalizer_policy field.

Additionally, because finalizer_policies would only track proposed finalizer policies, it can be redefined to:

std::vector<std::pair<block_num_type, finalizer_policy_ptr>> proposed_finalizer_policies;

And finalizer_policy_tracker can be removed.

Even with the changes above, the test case demonstrating the edge case should still be included. However, it would now demonstrate that on either branch of the fork the active finalizer policy changes from A to B, which, assuming enough of the finalizers in B are honest, would not lead to a finality violation.

The text was updated successfully, but these errors were encountered:

arhag added this to the Savanna: Production-Ready milestone Jun 14, 2024

enf-ci-bot added the triage label Jun 14, 2024

enf-ci-bot added this to Team Backlog Jun 14, 2024

enf-ci-bot moved this to Todo in Team Backlog Jun 14, 2024

arhag added bug The product is not working as was intended. consensus-protocol Change to the consensus protocol. Impacts light client validation. 👍 lgtm and removed triage labels Jun 14, 2024

arhag assigned linh2931 Jun 14, 2024

arhag mentioned this issue Jun 17, 2024

Only advance finalizer policy generation and provide diff in block header on changes to pending finalizer policy #306

Closed

linh2931 mentioned this issue Jun 18, 2024

New implementation of proposed and pending finalizer policies evaluation to fix edge cases in finalizer policy changes #307

Merged

linh2931 moved this from Todo to Awaiting Review in Team Backlog Jun 18, 2024

linh2931 closed this as completed in #307 Jun 25, 2024

github-project-automation bot moved this from Awaiting Review to Done in Team Backlog Jun 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix edge case in finalizer policy changes #292

Fix edge case in finalizer policy changes #292

arhag commented Jun 14, 2024 •

edited

Loading

Fix edge case in finalizer policy changes #292

Fix edge case in finalizer policy changes #292

Comments

arhag commented Jun 14, 2024 • edited Loading

arhag commented Jun 14, 2024 •

edited

Loading