chore(node-core): log errors in debug when ETA fails to calculate #6971

Merged
merged 2 commits into from
Mar 5, 2024

Conversation

shekhirin
Collaborator

@shekhirin shekhirin commented Mar 5, 2024

I couldn't immediately identify the source of this debug-mode panic:

2024-03-01T16:50:35.849816Z  INFO Executing stage pipeline_stages=2/12 stage=Bodies checkpoint=1046000 target=1046230
thread 'tokio-runtime-worker' panicked at crates/node-core/src/events/node.rs:491:18:
attempt to subtract with overflow
stack backtrace:
2024-03-01T16:50:36.219074Z  INFO Recovering senders tx_range=0..5000384
   0: rust_begin_unwind
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:144:5
   3: reth_node_core::events::node::Eta::update
             at ./crates/node-core/src/events/node.rs:491:18
   4: reth_node_core::events::node::NodeState<DB>::handle_pipeline_event
             at ./crates/node-core/src/events/node.rs:120:21
   5: <reth_node_core::events::node::EventHandler<E,DB> as core::future::future::Future>::poll
             at ./crates/node-core/src/events/node.rs:445:21
   6: reth_node_core::events::node::handle_events::{{closure}}
             at ./crates/node-core/src/events/node.rs:358:13
   7: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::future::future::Future>::poll

So instead, when the subtraction would overflow, we log the error at debug level and reset the ETA rather than crashing or reporting an invalid ETA.
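A minimal sketch of the approach described above, using only the standard library. `Progress` and `estimate_eta` are hypothetical names for illustration, not the actual reth types. `checked_sub` returns `None` when the subtraction would underflow, so the caller can reset the ETA instead of panicking (debug builds) or wrapping around to a huge value (release builds with default overflow settings).

```rust
use std::time::Duration;

/// Hypothetical stand-in for a stage's entity progress; not reth's actual type.
#[derive(Debug, Clone, Copy)]
struct Progress {
    processed: u64,
    total: u64,
}

/// Returns `None` whenever the ETA cannot be computed, instead of panicking.
fn estimate_eta(last: Progress, current: Progress, elapsed: Duration) -> Option<Duration> {
    // `checked_sub` yields `None` on underflow; `?` propagates it to the caller.
    let processed_since_last = current.processed.checked_sub(last.processed)?;
    let remaining = current.total.checked_sub(current.processed)?;

    let per_second = processed_since_last as f64 / elapsed.as_secs_f64();
    // `try_from_secs_f64` rejects NaN, infinite, and negative inputs,
    // so a zero processing rate also ends up as `None`.
    Duration::try_from_secs_f64(remaining as f64 / per_second).ok()
}
```

With `last.processed = 100`, `current.processed = 200`, `current.total = 1000` and 10 seconds elapsed, the rate is 10 entities per second and the estimate is 80 seconds. If `current.processed` drops below `last.processed`, the function returns `None`.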

@shekhirin shekhirin added the A-cli Related to the reth CLI label Mar 5, 2024
@shekhirin shekhirin requested a review from gakonst as a code owner March 5, 2024 16:26
@shekhirin shekhirin marked this pull request as draft March 5, 2024 16:37
@shekhirin shekhirin marked this pull request as ready for review March 5, 2024 17:02
Collaborator

@mattsse mattsse left a comment

this does not fix the root cause of the issue, right?
but this prevents the overflow

@shekhirin shekhirin added this pull request to the merge queue Mar 5, 2024
Merged via the queue into main with commit 33d01d3 Mar 5, 2024
29 checks passed
@shekhirin shekhirin deleted the alexey/fallible-eta branch March 5, 2024 19:30
Comment on lines +486 to +500
let Some(processed_since_last) =
current.processed.checked_sub(self.last_checkpoint.processed)
else {
self.eta = None;
debug!(target: "reth::cli", ?current, ?self.last_checkpoint, "Failed to calculate the ETA: processed entities is less than the last checkpoint");
return
};
let elapsed = last_checkpoint_time.elapsed();
let per_second = processed_since_last as f64 / elapsed.as_secs_f64();

self.eta = Duration::try_from_secs_f64(
((current.total - current.processed) as f64) / per_second,
)
.ok();
let Some(remaining) = current.total.checked_sub(current.processed) else {
self.eta = None;
debug!(target: "reth::cli", ?current, "Failed to calculate the ETA: total entities is less than processed entities");
return
};
Member

let's add stage_id arg to Eta::update and log it as well to identify the source
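One way this suggestion could look, as a rough sketch. The `stage` parameter, the `Progress` type, and `eta_or_diagnostic` are assumptions made for illustration; the real `Eta::update` signature and the `debug!` call are not reproduced here. The diagnostic is returned as a `String` so the example stays free of external crates.

```rust
use std::time::Duration;

/// Hypothetical stand-in for a stage's entity progress; not reth's actual type.
#[derive(Debug, Clone, Copy)]
struct Progress {
    processed: u64,
    total: u64,
}

/// Computes the ETA, or returns a diagnostic that names the stage
/// so the source of inconsistent checkpoints can be identified.
fn eta_or_diagnostic(
    stage: &str, // assumed stage identifier, e.g. "Bodies"
    last: Progress,
    current: Progress,
    elapsed: Duration,
) -> Result<Option<Duration>, String> {
    let processed_since_last = current
        .processed
        .checked_sub(last.processed)
        .ok_or_else(|| {
            format!("stage={stage} current={current:?} last={last:?}: processed went backwards")
        })?;
    let remaining = current
        .total
        .checked_sub(current.processed)
        .ok_or_else(|| format!("stage={stage} current={current:?}: total is less than processed"))?;

    let per_second = processed_since_last as f64 / elapsed.as_secs_f64();
    // A non-finite or negative result is not an error here, just an unknown ETA.
    Ok(Duration::try_from_secs_f64(remaining as f64 / per_second).ok())
}
```

A caller would reset its stored ETA on `Err` and emit the message at debug level, which keeps the stage name next to the offending checkpoint values.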

Labels
A-cli Related to the reth CLI
3 participants