Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eta calculator fixes #2735

Merged
merged 4 commits into from
Sep 19, 2023
Merged

Eta calculator fixes #2735

merged 4 commits into from
Sep 19, 2023

Conversation

daniel-savu
Copy link
Contributor

@daniel-savu daniel-savu commented Sep 15, 2023

Bug 1

Closes #2723

The relayer panic is caused by an overflow, bc of dividing by ~6.540888459481895e-211. On my local, the effective rate of indexing starts at 0.61.

{"timestamp":"2023-09-15T09:57:10.746276Z","level":"INFO","fields":{"message":"~~~ blocks_processed: 2508, tip_progression: 2042, elapsed: 757.10340475, old_rate: Some(0.6155037701275111), effective_rate: 0.6155037701275111"},"target":"hyperlane_base::contract_sync::eta_calculator","span":{"domain":"solanadevnet","label":"gas_payments","name":"ContractSync"},"spans":[{"domain":"solanadevnet","label":"gas_payments","name":"ContractSync"}]}

But then both the blocks_processed and the tip_progression are consistently zero, which makes the new_rate be zero (

let new_rate = (blocks_processed - tip_progression) / elapsed;
), and over time it takes over the entire moving average window to make it almost zero, leading to an overflow. 15 mins after that initial log, the effective rate already became 0.00038.

The reason for blocks_processed and tip_progression consistently being zero after the first log is that eta_calculator.calculate(from, tip) is always called with the same from and tip although it expects to get the latest values.

The fix

the tip wasn't being set after the sequence_and_tip query here:

let (count, tip) = self.indexer.sequence_and_tip().await?;

And then the to and from are calculated based on it:

So even though the sync_state internal variables were kept up-to-date, the min(...) would cause the issue.

The first PR commit fixes this.

Bug 2

There was another bug in the eta calculator, caused by it only using next_block to approximate synced state, which doesn't apply to sequence indexing. The way the eta calculator is called had to be changed to use abstract measures of sync progression (which could be blocks or sequences).

The second PR commit fixes this, afaict.

@codecov
Copy link

codecov bot commented Sep 15, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (7ab7cf5) 64.16% compared to head (7046cbf) 64.16%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2735   +/-   ##
=======================================
  Coverage   64.16%   64.16%           
=======================================
  Files          92       92           
  Lines        1398     1398           
  Branches      186      186           
=======================================
  Hits          897      897           
  Misses        494      494           
  Partials        7        7           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@mattiekat mattiekat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@daniel-savu daniel-savu enabled auto-merge (squash) September 19, 2023 14:45
@daniel-savu daniel-savu merged commit d65c4e7 into main Sep 19, 2023
@daniel-savu daniel-savu deleted the dan/eta-overflow-fix branch September 19, 2023 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
2 participants