Implement graceful shutdown #2456

Merged
merged 25 commits into from
Dec 27, 2024

@CHr15F0x CHr15F0x (Member) commented Dec 23, 2024

Fixes: #2417

  • tokio tasks and tokio blocking tasks (spawn_blocking) are now spawned and tracked via a task tracker, using the util::task::spawn[_blocking] API,
  • cancellation is triggered via a cancellation token,
  • the token is also passed to spawn_blocking, and the caller must check it to bail out of a longer-running task,
  • std::thread threads should now be spawned with util::task::spawn_std, which also gives the caller access to the token; the caller should use it to break out of any long-running loop,
  • after cancellation is triggered, tasks spawned with util::task::spawn[_blocking] are awaited, while threads spawned with util::task::spawn_std are not joined unless the caller does so locally (we don't use many, and I don't think their detachment is a big issue),
  • INT and TERM signal listeners are installed only after the db is migrated and the tries are pruned, which lets the user cancel that process instead of forcing them to wait,
  • from then on the process does not exit, even when an error is encountered; errors are handled in the final select, which ensures orderly cancellation even in case of startup errors.
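
A minimal, std-only sketch of the spawn_std idea from the list above: the spawned thread receives a handle to a shared cancellation signal and polls it to leave its loop. This is not the code in this PR. `CancelFlag`, `spawn_std_sketch` and `run_until_cancelled` are hypothetical names, and a plain `AtomicBool` stands in for the cancellation token.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a cancellation token: a shared boolean flag.
#[derive(Clone, Default)]
pub struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Hypothetical analogue of `spawn_std`: spawns an OS thread and hands the
/// closure its own clone of the flag so it can notice cancellation.
pub fn spawn_std_sketch<F, T>(flag: &CancelFlag, f: F) -> JoinHandle<T>
where
    F: FnOnce(CancelFlag) -> T + Send + 'static,
    T: Send + 'static,
{
    let flag = flag.clone();
    thread::spawn(move || f(flag))
}

/// Runs a worker loop until it is cancelled and returns the iteration count.
pub fn run_until_cancelled() -> u64 {
    let flag = CancelFlag::default();
    let handle = spawn_std_sketch(&flag, |flag| {
        let mut iterations = 0u64;
        loop {
            iterations += 1; // one unit of "work"
            if flag.is_cancelled() {
                break; // bail out of the long-running loop
            }
            thread::sleep(Duration::from_millis(1));
        }
        iterations
    });

    thread::sleep(Duration::from_millis(20));
    flag.cancel();
    // Joined locally here; per the description, spawn_std threads are
    // otherwise left detached on shutdown.
    handle.join().expect("worker thread panicked")
}
```

The same polling pattern would apply inside a spawn_blocking closure, since a blocking task cannot be interrupted from the outside either.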

Apart from implementing the above, this PR adds a refactor plus fixes for some bugs noticed along the way, in commit order:

  • add a util crate for general Rust utilities,
  • use rayon where spawn_blocking is sub-optimal,
  • move make_stream and AnyhowExt to util,
  • fix: rollback_to_anchor did not commit() the db transaction,
  • remove some dead code,
  • fix: the p2p db connection pool was too small,
  • fix: critical task errors were not propagated to the process return status,
  • make sure that all db connection pools, especially the read-only (RO) ones, are dropped before the last read-write (RW) connection pool is dropped (if an RO connection pool is dropped last, the wal and shm files are never cleaned up).
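
For the last point, one way to make such a drop order explicit in Rust is to rely on the rule that struct fields are dropped in declaration order. The sketch below only demonstrates that ordering; `Pool`, `Storage` and `drop_order` are hypothetical names, no database is involved, and how the PR itself enforces the order is not shown here.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a connection pool; records its name when dropped.
struct Pool {
    name: &'static str,
    log: Arc<Mutex<Vec<&'static str>>>,
}

impl Drop for Pool {
    fn drop(&mut self) {
        self.log.lock().unwrap().push(self.name);
    }
}

/// Struct fields are dropped in declaration order, so declaring the
/// read-only pool first guarantees it goes away before the read-write one.
struct Storage {
    read_only: Pool,
    read_write: Pool,
}

/// Builds a `Storage`, drops it, and returns the observed drop order.
pub fn drop_order() -> Vec<&'static str> {
    let log = Arc::new(Mutex::new(Vec::new()));
    let storage = Storage {
        read_only: Pool { name: "read-only", log: Arc::clone(&log) },
        read_write: Pool { name: "read-write", log: Arc::clone(&log) },
    };
    drop(storage);
    let order = log.lock().unwrap().clone();
    order
}
```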

@CHr15F0x CHr15F0x force-pushed the chris/graceful-shutdown-2 branch 2 times, most recently from 92e1548 to f380908 Compare December 23, 2024 16:10
@CHr15F0x CHr15F0x marked this pull request as ready for review December 23, 2024 16:46
@CHr15F0x CHr15F0x requested a review from a team as a code owner December 23, 2024 16:46
@CHr15F0x CHr15F0x changed the title from "graceful shutdown" to "Implement graceful shutdown" Dec 23, 2024
@t00ts t00ts (Contributor) left a comment

LGTM

crates/pathfinder/src/sync/checkpoint.rs (resolved)
crates/util/Cargo.toml (outdated, resolved)
task_tracker.spawn(async move {
    tokio::select! {
        _ = cancellation_token.cancelled() => {
            F::Output::cancelled()

Here.

Otherwise the signature would have to be

pub fn spawn<F>(future: F) -> tokio::task::JoinHandle<Option<F::Output>>

to return None if gracefully cancelled.
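
A rough, synchronous sketch of that trade-off: if the output type can produce its own "cancelled" value, the wrapper can keep returning the plain output type instead of an Option. The trait name `Cancelled`, its impls and `run_unless_cancelled` are assumptions for illustration only; just the `F::Output::cancelled()` call shape comes from the snippet above.

```rust
/// Hypothetical trait: an output type that can build its own "cancelled" value.
pub trait Cancelled {
    fn cancelled() -> Self;
}

/// For tasks that return nothing, cancellation is simply `()`.
impl Cancelled for () {
    fn cancelled() -> Self {}
}

/// For fallible tasks, cancellation is reported as an error
/// (a `String` error is used here only to keep the sketch std-only).
impl<T> Cancelled for Result<T, String> {
    fn cancelled() -> Self {
        Err("cancelled".to_string())
    }
}

/// Returns `T` directly in both branches, so no `Option<T>` wrapper is needed.
pub fn run_unless_cancelled<T, F>(is_cancelled: bool, work: F) -> T
where
    T: Cancelled,
    F: FnOnce() -> T,
{
    if is_cancelled {
        T::cancelled()
    } else {
        work()
    }
}
```

With this shape, callers that already handle `Result` keep their existing error paths, while the `Option` variant would force every caller to handle an extra `None` case.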

@CHr15F0x CHr15F0x force-pushed the chris/graceful-shutdown-2 branch from 95c98e1 to 423d84a Compare December 27, 2024 10:10
@CHr15F0x CHr15F0x merged commit a9e38b2 into main Dec 27, 2024
8 checks passed
@CHr15F0x CHr15F0x deleted the chris/graceful-shutdown-2 branch December 27, 2024 12:04