Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2.1: fix: use atomic to check if leader bank changed (backport of #4596) #4612

Merged
merged 1 commit into from
Jan 25, 2025

Conversation

mergify[bot]
Copy link

@mergify mergify bot commented Jan 24, 2025

Problem

Banking stage consume workers don't check if a leader bank got interrupted while consuming work. This can cause them to get stuck consuming work for a bank that was abandoned rather than switching to the newest leader bank.

Summary of Changes

Add a lightweight atomic variable which holds the current leader bank id and check it before each consumed piece of work to see if the bank was interrupted.

Fixes #


This is an automatic backport of pull request #4596 done by [Mergify](https://mergify.com).

* fix: use atomic to check if leader bank changed

* fix default value and tests

* ordering update

* eedback

* add comment

(cherry picked from commit 4cc49ac)
@mergify mergify bot requested a review from a team as a code owner January 24, 2025 09:43
@bw-solana
Copy link

adding SME reviewers

Copy link

@bw-solana bw-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, this seems fine to me.

but just to play the other side, what is the urgency to backport this? Looks like the fix only landed in master a couple days ago, have we had much test/bake time?

my understanding is we only interrupt the leader bank when some new block is replayed and new majority fork discovered (e.g. slow leader before us builds a new block), but this is somewhat rare, so I don't anticipate huge block capacity unlocks from this improvement

@willhickey willhickey merged commit 134be7c into v2.1 Jan 25, 2025
28 checks passed
@willhickey willhickey deleted the mergify/bp/v2.1/pr-4596 branch January 25, 2025 04:11
@jstarry
Copy link

jstarry commented Jan 25, 2025

@bw-solana no crazy urgency but definitely should be fixed and backported. We should make sure this bakes a bit on a staked validator before encouraging mainnet operators to adopt it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants