Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.2 -> 4.0] SHiP flush logs on write #934

Closed
wants to merge 5 commits into from

Conversation

heifner
Copy link
Member

@heifner heifner commented Mar 31, 2023

Flush the state_history_plugin logs to disk at the end of each write to make it more likely that a kill -9 or crash leaves valid logs.

Merges #928 into release/4.0.

Includes a fix for replay where forkdb.root() is null during the write in SHiP for on_accepted_block. 4.0 SHiP calls chain.last_irreversible_block_id() in on_accepted_block which expects forkdb.root() to be valid.

Resolves #596

@heifner heifner changed the base branch from main to release/4.0 March 31, 2023 11:45
@heifner heifner requested review from linh2931 and spoonincode March 31, 2023 11:46
@heifner heifner added the OCI Work exclusive to OCI team label Mar 31, 2023
fork_db.reset( *head );
return;
if (!blog_head)
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.2

if( !blog.head() && !fork_db.root() ) {
does not have this. Should this be done first in 3.2 too? If it is not applied to 3.2, it is better to be done in a separate PR for tracking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is likely a good change for 3.2 as well, but I hit the problem in 4.0 because 4.0 in on_accepted_block calls chain.last_irreversible_block_id(). 3.2 does not call that method and didn't hit this issue.

Happy, to pull this in via a separate PR if desired, but it has to go in otherwise ship aborts when loading from a snapshot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the PR title is [3.2 -> 4.0], it might be better to only contain 3.2 changes. A separate PR is an extra work. Either way is fine with me.

@heifner
Copy link
Member Author

heifner commented Mar 31, 2023

Replaced by #935 without the forkdb root change.

@heifner heifner closed this Mar 31, 2023
@heifner heifner deleted the GH-596-ship-flush-4.0 branch May 18, 2023 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCI Work exclusive to OCI team
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3.2: when ship node exits with error it usually doesn't start up again properly with snapshot
2 participants