Broadcast produced transition immediately #13551
4cb69af to 615d11b
!ci-build-me
@@ -2056,9 +2056,6 @@ let create ?wallets (config : Config.t) =
 valid_cb
 | `Internal ->
 (*Send callback to publish the new block. Don't log rebroadcast message if it is internally generated; There is a broadcast log*)
-don't_wait_for
-(Mina_networking.broadcast_state net
-(Mina_block.Validated.forget transition) ) ;
 Option.iter
 ~f:
 (Fn.flip
Isn't the broadcast blocked by this validation callback? Whatever was causing the delay in getting here, wouldn't that still affect block broadcasts?
Nope, because there is no validation callback for internal blocks. This code does nothing. I rewrote it substantially in the Bitswap branch, but here I attempted to keep the code change minimal.
Problem: after a block is produced, it is scheduled for broadcast only in the next cycle (or even in the cycle after next). If some long async job appears in between (and this is what happens in Berkeley now), the broadcast will be delayed, potentially causing the block to appear later than needed on other nodes (especially on clusters using a block window below 1 minute). Solution: broadcast the produced transition immediately after it's added to the frontier.
615d11b to 2b89556
!ci-build-me
This PR addresses two conditions related to block processing/production.
The first problem is that after a block is produced, it is scheduled for broadcast only in the next cycle (or even in the cycle after next). If some long async job appears in between (and this is what happens in Berkeley now), the broadcast will be delayed, potentially causing the block to appear later than needed on other nodes (especially on clusters using a block window below 1 minute).
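The delay can be illustrated with a toy model. This is only a sketch under stated assumptions: a FIFO queue of thunks stands in for the async scheduler, a counter stands in for wall-clock time, and every name below (`schedule`, `long_job`, `produce_deferred`, `produce_immediate`, `broadcast_delay`) is hypothetical and not taken from the Mina codebase.

```ocaml
(* Toy model only: a FIFO queue of thunks stands in for the async scheduler.
   None of these names come from the Mina codebase. *)
let jobs : (unit -> unit) Queue.t = Queue.create ()

(* Simulated clock, in milliseconds. *)
let clock = ref 0

(* Events recorded together with the simulated time at which they happened. *)
let log : (string * int) list ref = ref []

let schedule job = Queue.add job jobs

let record event = log := (event, !clock) :: !log

(* Run queued jobs in FIFO order until the queue is empty. *)
let run_all () =
  while not (Queue.is_empty jobs) do
    (Queue.pop jobs) ()
  done

(* A long job that happens to be queued alongside block production. *)
let long_job () = clock := !clock + 5000

(* Deferred variant: the broadcast is queued behind whatever is already waiting. *)
let produce_deferred () =
  record "added_to_frontier" ;
  schedule (fun () -> record "broadcast")

(* Immediate variant: the broadcast happens in the same job that adds the block. *)
let produce_immediate () =
  record "added_to_frontier" ;
  record "broadcast"

(* Simulated time between adding the block and broadcasting it. *)
let broadcast_delay produce =
  Queue.clear jobs ;
  clock := 0 ;
  log := [] ;
  schedule produce ;
  schedule long_job ;
  run_all () ;
  List.assoc "broadcast" !log - List.assoc "added_to_frontier" !log
```

In this model the deferred variant waits for the whole long job before broadcasting, while the immediate variant broadcasts with no delay regardless of what else is queued.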
The second problem appears when a node goes to bootstrap while a block is being processed in add_and_finalize. It happens very infrequently, but shuts the node down when it does.
Solution: broadcast the produced transition immediately after it's added to the frontier, and check that the pipe is not closed before writing to it.
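The "check before writing" part of the fix can be sketched roughly as follows. The pipe type here is a hypothetical stand-in written for illustration, not the pipe API used by Mina or Async; the assumption it encodes is that writing to a closed pipe raises, which is the kind of failure that would take a node down.

```ocaml
(* Toy pipe, not the Async or Mina pipe API. Writing after close raises,
   which models the crash described above. *)
type 'a pipe = { mutable closed : bool; items : 'a Queue.t }

exception Write_to_closed_pipe

let create () = { closed = false; items = Queue.create () }

let close p = p.closed <- true

let is_closed p = p.closed

(* Unguarded write: raises if the reader side has already gone away. *)
let write_exn p x =
  if p.closed then raise Write_to_closed_pipe else Queue.add x p.items

(* Guarded write: check the pipe first and report the outcome instead of raising. *)
let write_if_open p x =
  if is_closed p then `Closed
  else (
    write_exn p x ;
    `Ok )
```

A caller could then drop or log the block on `` `Closed `` (for example when the node has switched to bootstrap) rather than letting the exception propagate.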
Explain how you tested your changes:
[...] (k=12), monitoring long async cycles and ensuring cluster's health
Checklist:
Closes #13508 with the help of #13550