Fix frequent Zebra hangs during syncing #3322

Closed
12 of 13 tasks
teor2345 opened this issue Jan 6, 2022 · 3 comments
Labels
  • A-diagnostics (Area: Diagnosing issues or monitoring performance)
  • C-security (Category: Security issues)
  • I-hang (A Zebra component stops responding to requests)
  • S-needs-investigation (Status: Needs further investigation)

Comments

@teor2345
Contributor

teor2345 commented Jan 6, 2022

Motivation

Zebra currently hangs during roughly 5% to 90% of syncs, depending on the machine and network. It doesn't seem to hang after the initial sync.

This Epic contains bugs that might be causing these hangs.

But it's also possible that:

  • bugs we're currently working on will fix the hangs, or
  • other undiscovered bugs are causing the hangs.

So we don't want to add these bugs to #3096 yet.

Tasks

First, fix the bugs we are currently working on, then check whether Zebra still hangs frequently:

If Zebra keeps on hanging frequently, implement these diagnostics:

Then use the diagnostics to:

  • prioritise the bugs in this epic, or
  • open new bugs for any issues reported by tokio-console.

Likely Causes

This code has caused hangs before:

teor2345 added these labels on Jan 6, 2022:

  • S-needs-triage (Status: A bug report needs triage)
  • S-needs-investigation (Status: Needs further investigation)
  • C-security (Category: Security issues)
  • I-hang (A Zebra component stops responding to requests)
  • A-diagnostics (Area: Diagnosing issues or monitoring performance)
  • Epic (Zenhub Label. Denotes a theme of work under which related issues will be grouped)

@mpguerra
Contributor

Do we need to review this issue and its associated potential fixes, given that we seem to have fixed some hangs for now?

mpguerra removed the Epic label (Zenhub Label. Denotes a theme of work under which related issues will be grouped) on Jan 25, 2022

@conradoplg
Collaborator

I think we can close this after #3234 is completed.

@teor2345
Contributor Author

I'm not seeing any hangs after testing the latest main on 4 different mainnet and testnet instances, so I'm happy to just close this now.

mpguerra removed the S-needs-triage label (Status: A bug report needs triage) on Sep 28, 2022