[1.0] Normally process blocks from the forkdb on startup #572
Conversation
…a node that was terminated with many blocks in the forkdb
… attempt to request the next range of blocks instead of ignoring the request.
…s from the fork database. This allows a node stuck in a condition where it has too many blocks in the forkdb to process new blocks to attempt to apply those blocks on startup.
// terminate-at-block, the current expectation is that the node can be restarted to examine the state at
// which it was shutdown. For now, we will only process these blocks if there are peers configured. This
// is a bit of a hack for Spring 1.0.0 until we can add a proper pause-at-block (issue #570) which could
// be used to explicitly request a node to not process beyond a specified block.
This compromise came after a long conversation with @arhag on potential alternatives.
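The comment above describes the compromise: pending fork-database blocks are applied on startup only when peers are configured. A minimal sketch of that rule, with hypothetical names (`has_configured_peers`, `forkdb_blocks_pending` are illustrative, not the actual nodeos API):

```python
# Hedged model of the startup decision described in the code comment above.
# The function and parameter names are hypothetical, not Spring/nodeos APIs.

def should_apply_forkdb_blocks_on_startup(has_configured_peers: bool,
                                          forkdb_blocks_pending: int) -> bool:
    """Apply pending fork-database blocks on startup only when peers are
    configured; a peerless restart (e.g. after terminate-at-block) stays
    at the block height it shut down at."""
    return has_configured_peers and forkdb_blocks_pending > 0

# A node restarted with no p2p-peer-address keeps its shutdown height:
assert should_apply_forkdb_blocks_on_startup(False, 500) is False
# A node with peers configured drains the fork database on startup:
assert should_apply_forkdb_blocks_on_startup(True, 500) is True
```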
@@ -2224,6 +2224,8 @@ namespace eosio {
         set_state( lib_catchup );
         sync_last_requested_num = 0;
         sync_next_expected_num = chain_info.lib_num + 1;
      } else if (sync_next_expected_num >= sync_last_requested_num) {
         // break
This change causes the net_plugin to request more blocks if possible instead of remaining in a mode where it thinks it is already syncing.
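The branch in the diff above can be modeled as a small state decision: if the node is already in `lib_catchup` and the next expected block has reached or passed the last requested one, request the next range rather than doing nothing. An illustrative sketch (not actual net_plugin code; the action names are made up):

```python
# Hedged model of the sync-manager decision in the diff above.
# Strings returned here are illustrative labels, not nodeos identifiers.

def next_sync_action(state: str,
                     sync_next_expected_num: int,
                     sync_last_requested_num: int) -> str:
    if state != "lib_catchup":
        return "start_catchup"       # set_state(lib_catchup), reset counters
    if sync_next_expected_num >= sync_last_requested_num:
        return "request_next_range"  # new behavior added by this PR
    return "wait"                    # a requested range is still in flight

# Previously requested range fully received: ask for more blocks.
assert next_sync_action("lib_catchup", 101, 100) == "request_next_range"
# Still waiting on blocks from the last request: do nothing yet.
assert next_sync_action("lib_catchup", 50, 100) == "wait"
```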
if success:
    for nodeId, nodeArgs in {**regularNodeosArgs, **replayNodeosArgs}.items():
        assert cluster.getNode(nodeId).relaunch(), f"Unable to relaunch {nodeId}"
        assert cluster.getNode(nodeId).waitForLibToAdvance(), f"LIB did not advance for {nodeId}"
This extra section on restarting nodes failed before the fixes in this PR.
1 : "--read-mode irreversible --terminate-at-block 100",
2 : "--read-mode head --terminate-at-block 125",
3 : "--read-mode speculative --terminate-at-block 150",
4 : "--read-mode irreversible --terminate-at-block 180"
These block values don't really matter; I could revert to the old values if desired. They were changed during some initial testing to try to reproduce the issue in this test. The changes below are what is actually needed.
this PR modifies nodeos to normally process blocks out of the fork database on startup unless the node has no configured peers
Only in irreversible mode, right?
Also non-irreversible mode:
In Spring 1.0.0, unlike Leap, we process blocks into the fork database immediately. This can cause the fork database to grow very large when syncing, and the node can shut down due to hitting the new `max-reversible-blocks` limit (before #545). In Leap, it was assumed that when restarting a node, if that node did not receive any blocks from the network, it would be at the same height as when it was shut down. The existing `tests/nodeos_read_terminate_at_block_test.py` integration test verifies this behavior.

However, if the node shut down because of `max-reversible-blocks`, you would like the node on restart to process blocks out of the fork database, potentially allowing LIB to progress and the reversible blocks to be consumed and shrink in size. This is at odds with the expected behavior of a node starting up after a `terminate-at-block`. A user might find it very odd to terminate a node at a block, but find on restart that the node is actually beyond that block.

To fix the issue reported in #565, we would like to process blocks out of the fork database on restart. However, we want to maintain the existing expectation that a node started after a shutdown via `terminate-at-block` is at the same block height it shut down at.

Therefore, this PR modifies nodeos to normally process blocks out of the fork database on startup unless the node has no configured peers. The idea is that if a user has terminated a node with `terminate-at-block` and wants the node to remain at that block, they will restart the node without any `p2p-peer-address` configured. This is a bit of a hack until #570 can be added. Since #570 is a new feature, it will not come until a future release and will not be part of Spring 1.0.0.

Resolves #565
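The restart expectations described above can be summarized in one toy function: a peerless restart keeps the shutdown height, while a restart with peers applies pending forkdb blocks. A hedged sketch under those assumptions (all names are hypothetical, not nodeos APIs):

```python
# Toy model of the two restart outcomes in the PR description above.
# Parameter names are illustrative assumptions, not actual nodeos fields.

def head_after_restart(shutdown_head: int,
                       forkdb_pending: int,
                       peers_configured: bool) -> int:
    """Without p2p-peer-address, honor the terminate-at-block expectation
    and stay at the shutdown height; with peers, drain the fork database."""
    return shutdown_head + (forkdb_pending if peers_configured else 0)

# terminate-at-block restart with no peers: height is unchanged.
assert head_after_restart(100, 80, peers_configured=False) == 100
# Restart with peers: pending forkdb blocks are applied and head advances.
assert head_after_restart(100, 80, peers_configured=True) == 180
```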