Fix edge case in finalizer policy changes #292
Labels
bug
The product is not working as was intended.
consensus-protocol
Change to the consensus protocol. Impacts light client validation.
👍 lgtm
Milestone
Under the current rules for changing finalizer policies, there is an edge case which can lead to a finality violation despite none of the finalizers breaking the rules.
This edge case is partially enabled by a a block producer getting a
weak_achieved
QC and using it in the block claim, while another block producer that produces a competing fork makes a claim on the same block using astrong
QC. (Note it is possible for both aweak_achieved
andstrong
QC to exist for the same block without any finalizers misbehaving because it is possible for just one strong vote that was missing from theweak_achieved
QC to add to it and make it into astrong
QC.)Enabling this edge case also requires a particular finalizer policy change to occur, one involving two finalizer policies to both be pending, around the same time as this fork. The current finalizer policy can be denoted A. The next two pending finalizer policies are denoted B and C. In the branch in which the strong QC claim was made, B is immediately promoted to the active finalizer policy, and then B votes for enough blocks (first weak, then strong) to advance finality to the first block of that fork past the fork point. At the same time, in the branch in which the weak QC claim was made, B is never promoted to the active finalizer policy. Instead a couple of blocks in which A is the finalizer policy are produced and voted on, which then allows the block after that to promote C to the active finalizer policy. Then C can vote on enough new blocks after that to advance finality to a block in that branch which has C as an active finalizer policy.
This creates a finality violation despite no finalizer breaking the rules (assuming, for example, the there is no intersection between the set of finalizers from any two of A, B, and C).
finality_violation_edge_case.pdf
Add a test case for the edge case above that would demonstrate the finality violation if it was not for the changes below.
To prevent a finality violation from occurring in this edge case (assuming the necessary number of finalizers do not misbehave), the rules for promoting a proposed finalizer policy to pending need to be adjusted.
When LIB changes, we still need to evaluate possible promotions from pending to active (removing any pending policies that are known at that time to never become active) and from proposed to pending (removing any proposed policies that are known at that time to never become pending), in that order. However, now we would not promote a proposed policy to pending if it would allow more than one pending policies to exist.
This means, on LIB change:
i. Find the proposed policy with the greatest associated block number that is still less than or equal to the new LIB number (call this the target proposed policy).
ii. Remove any proposed policies with an associated block number less than that of the target proposed policy.
iii. If there is no pending policy, promote that target proposed policy to pending. Otherwise, leave the target proposed policy (and any other proposed policies with greater associated block numbers) alone in the proposed policy queue.
With these changes, it is still possible for a proposed policy to never become active. However, a policy that becomes pending is guaranteed to eventually become active. Additionally, we can guaranteed that the policy will have at least 3 blocks in which it is the active finalizer policy. Also, there can be at most one pending finalizer policy at any given time.
This means that the pending policies should be taken out of the
block_header_state::finalizer_policies
flat_multimap
and instead captured as a separatestd::optional<std::pair<block_num_type, finalizer_policy_ptr>> pending_finalizer_policy
field.Additionally, because
finalizer_policies
would only track proposed finalizer policies, it can be redefined to:And
finalizer_policy_tracker
can be removed.Even with the changes above, the test case demonstrating the edge case should still be included. However, it would now demonstrate that on either branch of the fork the active finalizer policy changes from A to B, which, assuming enough of the finalizers in B are honest, would not lead to a finality violation.
The text was updated successfully, but these errors were encountered: