Prevent VC from sending expired subscriptions #5296
Closed
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue Addressed
The Lighthouse BN frequently logs:
Proposed Changes
I believe the issue is due to how the new
already_sent
mechanism works. It was introduced in:When we initialise duties we don't make any attempt to avoid creating
SubscriptionSlots
entries for past slots. As a result, after a restart or a duties re-org we will try to send subscriptions for expired slots. Additionally, the VC will only setalready_sent
totrue
once the subscription attempt succeeds, meaning that it will keep trying (and get stuck) when the subscriptions for past slots are rejected by the BN.There are multiple ways of fixing this. The fix I've implemented (for now) is two-fold:
should_send_subscription_at
. This means that we should never try sending a subscription if the slot is> duty.slot - 3
, i.e. atduty.slot - 2
or later slots. This aligns with the conditions under which the BN will reject subscriptions.already_sent
totrue
after attempting to subscribe. This changes the behaviour in all failure cases, including the failure case for sending a past subscription that the BN rejects. As a result, we rely on the next scheduled subscription to trigger a retry in cases where all BNs errored for some other reason (e.g. connectivity issue). I think this is OK, as subscriptions still have a lot of redundancy built in, and the extra spamming when the BNs are erroring is probably not helpful. If we miss the last scheduled subscription slot for a duty then we won't retry (due to the first change), and this is OK because the BN would likely reject a retry attempt anyway.Additional Info
Alternatives I considered but haven't tried implementing:
current_slot
when initialisingSubscriptionSlots
and exclude duties that are in the past. IMO this is pretty much equivalent to putting the condition inshould_send_subscription_at
, and the memory savings from not storing a few slots & bools in exceptional cases (after startup, reorgs) are minimal.