Design a basic protocol for CN (or simulator) to initiate a stream to a BN that ensures the BN does not have any gaps in the block stream.
Initial Strawman
Block Node Connect
On connect, CN sends a block header, which contains the block number.
If it is the next expected block, no problem, start streaming.
If less than or equal to the last known verified block, respond with "DuplicateBlock".
Response includes last known block, so CN can perhaps do its own catch up or reconnect.
This REQUIRES CN to check and resend block header, or end the stream and restart.
This also applies when CN sends a block number below the last known block
that this block node, for some reason, does not actually hold.
In this case the block node must retrieve the missing block(s) from another
block node to fill the gaps, but shall always respond to CN with the very
latest known and verified block. The streaming API is only for current data,
not for filling gaps.
If greater than next block, we missed block(s)
Respond with "Behind"
This includes last known block number.
CN will send from block after that block, or send "EndOfStream" and retry with exponential backoff.
CN will include earliest known block with end of stream, so we have an idea of the range to catch up.
This is advisory, and will almost certainly change before we finish "catching up".
If CN retries before we "catch up", we record the offered block number, and continue trying to "catch up" to that. Response is still "Behind" with last known block number.
This allows CN to jump in to "catch us up" directly if we're behind, but close enough.
We probably need failure detection for the case where the required target block doesn't get "closer" with each connection from CN.
If CN ends the stream, we need to catch up from another BN.
Query BN "status" API, get last available block
If greater than or equal to block number CN sent
Ask for range, last-known-block+1 to last-available-block.
Hopefully catch up before next CN connection.
If less than block number CN sent
Either ask for stream last-known-block+1 to "infinite" and quit when caught up OR ask another BN, in case all needed blocks available elsewhere.
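The catch-up choice above (after CN ends the stream) can be sketched as a small planning function. This is only an illustration under stated assumptions: `CatchUpRequest` and `plan_catch_up` are hypothetical names, not part of any existing Block Node API, and the value returned by the other BN's "status" API is modeled as a plain integer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatchUpRequest:
    """Hypothetical request to another BN for missing blocks."""
    first_block: int           # always last-known-block + 1
    last_block: Optional[int]  # None models the open-ended ("infinite") stream


def plan_catch_up(last_known: int, cn_offered: int,
                  peer_last_available: int) -> CatchUpRequest:
    """Pick a bounded range or an open-ended stream, as in the strawman.

    last_known          -- last block this BN holds and has verified
    cn_offered          -- block number CN sent in its most recent header
    peer_last_available -- last available block reported by the other BN
    """
    first = last_known + 1
    if peer_last_available >= cn_offered:
        # The peer already has everything up to what CN offered:
        # ask for the fixed range last-known-block+1..last-available-block.
        return CatchUpRequest(first, peer_last_available)
    # The peer is itself behind CN: ask for an open-ended stream and let the
    # caller quit once caught up (or try another BN instead).
    return CatchUpRequest(first, None)
```

For example, with `last_known=100` and CN offering block 110, a peer reporting 120 yields the bounded range 101 to 120, while a peer reporting 105 yields an open-ended stream starting at 101.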
Each CN connect will send a block header, repeat above process until we get a matched block number or CN can finish catching us up.
Note, we can (re)enter connect any time we get a next block from CN that isn't what we expect. This simplifies logic for working out when to retry or reset a stream.
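A minimal sketch of the connect decision described in this section, to make the three cases concrete. The class and field names are hypothetical, and "Accept" is a placeholder name for the "start streaming" outcome, which the strawman does not name. The sketch also assumes that a header equal to the last known block counts as a duplicate.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    ACCEPT = "Accept"                  # placeholder name, not from the strawman
    DUPLICATE_BLOCK = "DuplicateBlock"
    BEHIND = "Behind"


@dataclass(frozen=True)
class ConnectResponse:
    status: Status
    last_known_block: int  # always the latest known and verified block


class BlockNodeConnectState:
    """Hypothetical per-BN state for the connect handshake."""

    def __init__(self, last_known_block: int) -> None:
        self.last_known_block = last_known_block
        # Block number most recently offered by CN while we were behind.
        self.catch_up_target: Optional[int] = None

    def on_block_header(self, block_number: int) -> ConnectResponse:
        next_block = self.last_known_block + 1
        if block_number == next_block:
            # Exactly the block we expect: start streaming.
            self.catch_up_target = None
            return ConnectResponse(Status.ACCEPT, self.last_known_block)
        if block_number < next_block:
            # Old block, even one we do not actually hold: gaps are filled
            # from another BN, never from the CN stream.
            return ConnectResponse(Status.DUPLICATE_BLOCK, self.last_known_block)
        # CN is ahead of us: record the offered block and keep catching up.
        self.catch_up_target = block_number
        return ConnectResponse(Status.BEHIND, self.last_known_block)
```

Because `on_block_header` depends only on the current state and the offered number, the same method can be called again whenever an unexpected block arrives mid-stream, which matches the note about (re)entering connect.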
Error Handling
If CN detects an error at any time
Next BlockItem will be an EndStream item with an appropriate error code.
Block Node will drop any in-progress unproven block from that CN, and, if
no remaining active incoming streams, notify all subscribers with an EndStream item specifying "source error".
Block Node will continue streaming from other incoming stream sources, if
any, or await a restarted stream if no other incoming stream sources.
If a BN detects an error at any time
BN will send an EndStream response to all incoming streams, with appropriate
status code.
CN, on receiving the end stream, will retry publishing the stream; and will
use exponential backoff if the BN failure continues.
If CN has multiple "downstream" BN options, a CN may connect to an alternate
BN for reliability and mark the failed BN as a backup.
BN will send EndStream to all subscribers with appropriate status code.
BN will either recover or await manual recovery.
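The CN-side retry after receiving an EndStream could look roughly like the sketch below. It is not a definitive implementation: `backoff_delays` and `publish_with_retry` are hypothetical helpers, the base delay, factor, and cap are arbitrary example values, and `publish` stands in for whatever call actually reopens the stream to a BN.

```python
import time
from typing import Callable, Iterable, Iterator


def backoff_delays(base_seconds: float = 1.0, factor: float = 2.0,
                   max_seconds: float = 60.0, attempts: int = 5) -> Iterator[float]:
    """Yield exponentially growing delays, capped at max_seconds."""
    delay = base_seconds
    for _ in range(attempts):
        yield min(delay, max_seconds)
        delay *= factor


def publish_with_retry(publish: Callable[[], bool],
                       delays: Iterable[float],
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try to publish once, then retry after each delay until success.

    Returns True on the first successful publish, or False once the delays
    are exhausted. At that point a CN with several downstream BNs might
    switch to an alternate BN and keep the failed one as a backup.
    """
    if publish():
        return True
    for delay in delays:
        sleep(delay)
        if publish():
            return True
    return False
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting; a caller would normally leave it at the default `time.sleep`.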
jsync-swirlds changed the title from "Design hand-shake between CN and BN to determine what Block to start streaming from." to "Design handshake between CN and BN to determine what Block to start streaming from." on Aug 27, 2024
This is an unlikely corner case but what's the behavior if the connect "next block" happens to be a missing block in the middle of the contiguous saved blocks? I imagine the BN needs to find missing gaps in the blocks it has stored. If the CN tries to send a block that happens to be less than the latest but is also missing, then it would respond with "DuplicateBlock" and the BN would send the latest block to start from. In a separate thread, it would backfill the missing block?
BN should never have an interstitial gap fillable from a consensus node; if it does, it should request that block from another Block Node. Consensus Node should never send an old block when the Block Node has newer blocks, and the Block Node should refuse such streams with an appropriate error (likely "DuplicateBlock").
I added this specific case to the protocol above in the same duplicate block section.