Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[QD Reconfig] 3. add reconfig observer #7024

Merged
merged 5 commits into from
Jan 6, 2023
Merged

Conversation

longbowlu
Copy link
Collaborator

@longbowlu longbowlu commented Dec 25, 2022

  1. This PR adds ReconfigObserver Trait. RO detects reconfigs and updates quorum driver the new committee.
  2. OnsiteReconfigObserver is the RO that lives in Fullnode/TransactionOrchestrator that subscribes to the checkpoint executor reconfig channel. Note the integration of OnsiteReconfigObserver and TransactionOrchestrator happens in the follow-up PR

@vercel
Copy link

vercel bot commented Dec 25, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated
explorer 🔄 Building (Inspect) Jan 6, 2023 at 0:57AM (UTC)
2 Ignored Deployments
Name Status Preview Comments Updated
explorer-storybook ⬜️ Ignored (Inspect) Jan 6, 2023 at 0:57AM (UTC)
wallet-adapter ⬜️ Ignored (Inspect) Jan 6, 2023 at 0:57AM (UTC)

@longbowlu longbowlu marked this pull request as ready for review December 27, 2022 18:43
Copy link
Contributor

@mystenmark mystenmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will review after rebasing and existing comments are addressed! (please re-request review from me when you're ready)

warn!("Ignored non-newer from reconfig channel: {}", committee);
}
}
// Neither closed channel nor lagged shall happen
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just handle the lagged state? it should be recoverable. its not clear to me why it can't possibly happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly for closed channel. Is there a reason for this assumption? Probably best to handle it in case we change the way executor works. As an example, we were toying with having executor restart at epoch end to make strong consistency of epoch store easier. This would create problems for you if you assume that this channel cannot close from underneath you

wait_for_nodes_transition_to_epoch(authorities.iter().chain(fullnodes.iter()), 1).await;

// Give it some time for the update to happen
tokio::time::sleep(tokio::time::Duration::from_secs(3)).await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't wait_for_nodes_transition_to_epoch make the sleep unnecessary?

Copy link
Collaborator Author

@longbowlu longbowlu Jan 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait_for_nodes_transition_to_epoch blocks until it gets a reconfig message from the channel. But it may take a bit more time for the reconfig observer to get the reconfig message itself and create/swap the auth-agg

Copy link
Contributor

@mystenmark mystenmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a few comments, overall looks good

Copy link
Contributor

@williampsmith williampsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM aside from the last comment on handling the closed channel. Although it sounds like we may be moving the sender to SuiNode, in which case the channel should never close.

@longbowlu longbowlu force-pushed the add-reconfig-observer branch from 3641a60 to 3c6f11a Compare January 5, 2023 03:05
@longbowlu longbowlu requested a review from mystenmark January 5, 2023 04:25
@longbowlu longbowlu changed the title add reconfig observer [QD Reconfig] 3. add reconfig observer Jan 5, 2023
Base automatically changed from remove-auth-active to main January 6, 2023 00:09
@longbowlu longbowlu force-pushed the add-reconfig-observer branch from f21c8b9 to 6cd59f0 Compare January 6, 2023 00:12
@longbowlu longbowlu enabled auto-merge (squash) January 6, 2023 00:12
@longbowlu longbowlu merged commit fa4b7fc into main Jan 6, 2023
@longbowlu longbowlu deleted the add-reconfig-observer branch January 6, 2023 01:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants