Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verification pipeline errors can persist through multiple sync restarts #1590

Closed
teor2345 opened this issue Jan 14, 2021 · 1 comment
Closed
Labels
A-rust Area: Updates to Rust code C-bug Category: This is a bug

Comments

@teor2345
Copy link
Contributor

teor2345 commented Jan 14, 2021

Version

zebrad 1.0.0-alpha.0 commit 92d95d4 + PR #1589

Platform

Linux ... 5.4.83 #1-NixOS SMP Fri Dec 11 12:23:33 UTC 2020 x86_64 GNU/Linux

Description

When verification times out (see #1586), it can take multiple sync restarts before all the duplicate block errors clear the verification pipeline.

I tried this:

Running zebrad on testnet, syncing from genesis

I expected to see this happen:

If block verification timed out, there would be a single error, then syncing would resume smoothly

Instead, this happened:

A block verification timed out, leading to a series of errors that persisted for 6 sync restarts (around 7 minutes). During this time, it still synced 2700 blocks.

After that, the syncer recovered, syncing new blocks without errors.

Suggested Solutions

  • Make the syncer ignore more errors as duplicate block errors
  • Clear the checkpoint queue, non-finalized state queue, and finalized state queue on sync restart
  • Add heights to all of these error messages
  • Just ignore the issue, as long as
    • the syncer is still making progress
    • the errors don't persist for more than an hour

Commands

zebrad start on testnet

Logs

Sync progressing ok
Height: ..=280000

``` Jan 14 13:20:43.113 INFO {zebrad="92d95d4b" net="Test"}:sync:extend_tips:checkpoint: zebra_consensus::checkpoint: verified checkpoint range block_count=400 current_range=(Excluded(Height(279600)), Included(Height(280000))) Jan 14 13:20:43.114 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=2000 lookahead_limit=2000 Jan 14 13:20:44.619 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting for pending blocks tips.len=1 in_flight=2371 lookahead_limit=2000 Jan 14 13:20:44.620 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=2000 lookahead_limit=2000 Jan 14 13:20:45.647 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting for pending blocks tips.len=1 in_flight=2371 lookahead_limit=2000 Jan 14 13:20:45.647 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=2000 lookahead_limit=2000 Jan 14 13:20:45.734 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting for pending blocks tips.len=1 in_flight=2371 lookahead_limit=2000 Jan 14 13:20:45.735 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=2000 lookahead_limit=2000 Jan 14 13:20:46.307 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting for pending blocks tips.len=1 in_flight=2418 lookahead_limit=2000 ```

Initial Elapsed error
Height: ?

Jan 14 13:21:48.733  INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound:state: zebra_state::service: responding to peer GetBlocks or GetHeaders final_height=Height(499) response_len=500 chain_tip_height=Height(280000) stop_height=None intersection_height=None
Jan 14 13:23:43.037  WARN {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: error downloading and verifying block e=Elapsed(())                                                   
Jan 14 13:23:43.037  INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s                                                                   

Sync restart
Height: 280000..=282404

``` Jan 14 13:24:02.738 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound:state: zebra_state::service: responding to peer GetBlocks or GetHeaders f inal_height=Height(499) response_len=500 chain_tip_height=Height(280000) stop_height=None intersection_height=None Jan 14 13:24:44.041 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: starting sync, obtaining new tips Jan 14 13:24:44.041 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(280000) min_locator_height=279901 locators=[Height( 280000), Height(279999), Height(279998), Height(279996), Height(279992), Height(279984), Height(279968), Height(279936), Height(279901)] Jan 14 13:24:44.042 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 2 peers to answer requests Jan 14 13:24:44.945 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=372 lookahead_limit=2000 Jan 14 13:24:45.027 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=743 lookahead_limit=2000 Jan 14 13:24:45.107 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=1186 lookahead_limit=2000 Jan 14 13:24:45.188 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=1684 lookahead_limit=2000 ```

Subsequent AlreadyInChain error and sync restart
Height: 282404

Jan 14 13:24:45.282  WARN {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: error downloading and verifying block e=Block(Block { source: AlreadyInChain(block::Hash("00065cc5cce472e431797761264cbc7c701eea8bd45b26d2440ae27ad7a1491d"), 2123) })
Jan 14 13:24:45.282  INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s                                                                   

A series of sync restarts due to AlreadyInChain and block is already committed to the state errors
Height: 282404..=282727

``` Jan 14 13:25:09.741 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound:state: zebra_state::service: responding to peer GetBlocks or GetHeaders f inal_height=Height(499) response_len=500 chain_tip_height=Height(282404) stop_height=None intersection_height=None Jan 14 13:25:46.285 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: starting sync, obtaining new tips Jan 14 13:25:46.285 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(282404) min_locator_height=282305 locators=[Height( 282404), Height(282403), Height(282402), Height(282400), Height(282396), Height(282388), Height(282372), Height(282340), Height(282305)] Jan 14 13:25:52.048 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 2 peers to answer requests Jan 14 13:25:57.194 WARN {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: error downloading and verifying block e=Block(Block { source: AlreadyInChain(block::Hash("000bfa422c8e 077442e3c1bdb26de23db547cb0cf3980707e58741f6261352c9"), 18) }) Jan 14 13:25:57.194 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s Jan 14 13:26:16.743 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound: zebra_state::service: pruned utxo requests old_len=16519 new_len=0 prune _count=16519 tip=Some((Height(282443), block::Hash("00091cde413748299e08a0fdee21d1a770068d8ad8873f4eb2ca0b487df81ff1"))) Jan 14 13:26:16.749 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound:state: zebra_state::service: responding to peer GetBlocks or GetHeaders f inal_height=Height(499) response_len=500 chain_tip_height=Height(282443) stop_height=None intersection_height=None Jan 14 13:26:58.195 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: starting sync, obtaining new tips Jan 14 13:26:58.195 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(282443) min_locator_height=282344 locators=[Height( 282443), Height(282442), Height(282441), Height(282439), Height(282435), Height(282427), Height(282411), Height(282379), Height(282344)] Jan 14 13:27:02.848 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 2 peers to answer requests Jan 14 13:27:22.172 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound:state: zebra_state::service: responding to peer GetBlocks or GetHeaders final_height=Height(499) response_len=500 chain_tip_height=Height(282491) stop_height=None intersection_height=None Jan 14 13:27:37.179 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips: zebra_state::service: pruned utxo requests old_len=17848 new_len=83 prune_count=17765 tip=Some((Height(282491), bloc k::Hash("0013da35d7196632f70cfa9fcc6e22906db20d21430ec08cfd2ba231bfd594b4"))) Jan 14 13:27:39.222 WARN {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: error downloading and verifying block e=Block(Commit("block is already committed to the state")) Jan 14 13:27:39.222 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s Jan 14 13:28:28.971 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound: zebra_state::service: pruned utxo requests old_len=5136 new_len=0 prune_ count=5136 tip=Some((Height(282504), block::Hash("000ab81fa8ef40725f90309ee1bc351258d10492272a1432f8bab391db33da4e"))) Jan 14 13:28:28.977 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound:state: zebra_state::service: responding to peer GetBlocks or GetHeaders f inal_height=Height(499) response_len=500 chain_tip_height=Height(282504) stop_height=None intersection_height=None Jan 14 13:28:40.222 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: starting sync, obtaining new tips Jan 14 13:28:40.222 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(282504) min_locator_height=282405 locators=[Height( 282504), Height(282503), Height(282502), Height(282500), Height(282496), Height(282488), Height(282472), Height(282440), Height(282405)] Jan 14 13:28:46.233 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 2 peers to answer requests Jan 14 13:28:51.818 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=499 lookahead_limit=2000 Jan 14 13:28:55.155 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=994 lookahead_limit=2000 Jan 14 13:28:56.198 WARN {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: error downloading and verifying block e=Block(Commit("block is already committed to the state")) Jan 14 13:28:56.198 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s Jan 14 13:29:16.182 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=54.209.49.207:18233}:msg_as_req{msg=inv}:inbound:download_and_verify{hash=0009f4fbfd3786e9cad9ffe3202e08ea4146d5f938631f9a37 3520a0b3d4d97f}: zebra_state::service: pruned utxo requests old_len=33417 new_len=0 prune_count=33417 tip=Some((Height(282608), block::Hash("000ac4760cbe23ed74409974861cacaa1ae0f461b9a29fea16 9c4c3b45720dcb"))) Jan 14 13:29:35.978 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound:state: zebra_state::service: responding to peer GetBlocks or GetHeaders f inal_height=Height(499) response_len=500 chain_tip_height=Height(282608) stop_height=None intersection_height=None Jan 14 13:29:57.199 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: starting sync, obtaining new tips Jan 14 13:29:57.199 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(282608) min_locator_height=282509 locators=[Height( 282608), Height(282607), Height(282606), Height(282604), Height(282600), Height(282592), Height(282576), Height(282544), Height(282509)] Jan 14 13:30:03.208 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 2 peers to answer requests Jan 14 13:30:08.773 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=499 lookahead_limit=2000 Jan 14 13:30:12.720 WARN {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: error downloading and verifying block e=Block(Commit("block is already committed to the state")) Jan 14 13:30:12.720 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s ```

Sync recovers
Height: 282727..

``` Jan 14 13:30:27.409 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=54.209.49.207:18233}:msg_as_req{msg=inv}:inbound:download_and_verify{hash=0035c76e7602af7b9553d25b3f063157494290b97797df5a4c 49542e2adec631}: zebra_state::service: pruned utxo requests old_len=25777 new_len=0 prune_count=25777 tip=Some((Height(282727), block::Hash("000eebf1db85524927454892fa5f5feb9bd3dc3b07e8ca2723 6f5b475a2e8290"))) Jan 14 13:30:42.980 INFO {zebrad="92d95d4b" net="Test"}:peer{addr=127.0.0.1:38233}:msg_as_req{msg=getblocks}:inbound:state: zebra_state::service: responding to peer GetBlocks or GetHeaders f inal_height=Height(499) response_len=500 chain_tip_height=Height(282727) stop_height=None intersection_height=None Jan 14 13:31:13.722 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: starting sync, obtaining new tips Jan 14 13:31:13.722 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(282727) min_locator_height=282628 locators=[Height( 282727), Height(282726), Height(282725), Height(282723), Height(282719), Height(282711), Height(282695), Height(282663), Height(282628)] Jan 14 13:31:13.722 INFO {zebrad="92d95d4b" net="Test"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 2 peers to answer requests Jan 14 13:31:41.363 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=499 lookahead_limit=2000 Jan 14 13:31:43.657 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=993 lookahead_limit=2000 Jan 14 13:31:43.948 INFO {zebrad="92d95d4b" net="Test"}:sync:extend_tips: zebra_state::service: pruned utxo requests old_len=18896 new_len=123 prune_count=18773 tip=Some((Height(282751), blo ck::Hash("000289add01bafe6d869877c2c1d6a2908464f2e18bb8a6960795bbd2cc43826"))) Jan 14 13:31:45.273 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=1489 lookahead_limit=2000 Jan 14 13:31:46.357 INFO {zebrad="92d95d4b" net="Test"}:sync: zebrad::components::sync: extending tips tips.len=1 in_flight=1976 lookahead_limit=2000 ... ```
@teor2345 teor2345 added C-bug Category: This is a bug S-needs-triage Status: A bug report needs triage A-rust Area: Updates to Rust code labels Jan 14, 2021
@mpguerra mpguerra removed the S-needs-triage Status: A bug report needs triage label Feb 2, 2021
@teor2345
Copy link
Contributor Author

I haven't seen this bug for a while, so let's close it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rust Area: Updates to Rust code C-bug Category: This is a bug
Projects
None yet
Development

No branches or pull requests

2 participants