Tipset CID flakiness #10061

Closed
raulk opened this issue Jan 19, 2023 · 5 comments · Fixed by #10069


raulk commented Jan 19, 2023

Original report

There is some flakiness in persisting tipset CIDs, which makes the Ethereum JSON-RPC fail in various ways. This issue is affecting the Hyperspace testnet.

This was reported in this Slack thread: https://filecoinproject.slack.com/archives/C04JEJB82RY/p1674116707854229

Debugging

In order to debug this, I synced a Hyperspace node from genesis. It finished syncing at around height 9459.

Using the tool contributed in #10060, I generated a report to verify that the eth_getBlockByHash and eth_getBlockByNumber operations return consistent results. That is, the block returned for a given height (eth_getBlockByNumber) should also be retrievable via eth_getBlockByHash using the block hash it reports.

Attached is my report, walking backwards from my head until tipset 6417.
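
For illustration, the consistency check described above can be sketched roughly as follows. This is not the tool from #10060; it is a minimal sketch that assumes a standard Ethereum JSON-RPC endpoint, and the names make_rpc and check_range, as well as the endpoint URL in the usage note, are hypothetical.

```python
import json
import urllib.request


def make_rpc(url):
    """Return a function that performs JSON-RPC 2.0 calls against `url`.

    The URL is whatever endpoint your node exposes (an assumption here).
    """
    def call(method, params):
        payload = json.dumps(
            {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
        ).encode()
        req = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        if body.get("error"):
            raise RuntimeError(str(body["error"]))
        return body["result"]
    return call


def check_range(call, head, lowest):
    """Walk backwards from `head` to `lowest` (inclusive).

    For each height, fetch the block by number, then fetch it again by the
    hash that was returned, and compare the two results.
    Returns a list of (height, ok, detail) tuples.
    """
    results = []
    for height in range(head, lowest - 1, -1):
        try:
            by_number = call("eth_getBlockByNumber", [hex(height), False])
            if by_number is None:
                results.append((height, False, "no block returned by number"))
                continue
            by_hash = call("eth_getBlockByHash", [by_number["hash"], False])
        except Exception as err:  # report the failure and keep walking
            results.append((height, False, str(err)))
            continue
        if by_number == by_hash:
            results.append((height, True, "identical"))
        else:
            results.append((height, False, "blocks differ"))
    return results
```

Something like check_range(make_rpc("http://127.0.0.1:1234/rpc/v1"), 9459, 6417) would then cover the range in my report, assuming the node serves JSON-RPC at that address. Passing the RPC function in as an argument keeps the walk itself testable without a running node.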

Observations

  • It appears that all tipsets synced during catch-up are correctly retrievable.
  • There are a few exceptions, e.g. at height 7147, but that's because we're falling into here or here. This is a separate issue I will be debugging after this one.
  • Unfortunately, none of the new tipsets received during live syncing appear to be retrievable by hash. This likely means that we are either not persisting the TipsetKey (TipsetCID) or we are now calculating the hash incorrectly. This is a regression introduced in the last few days.

raulk commented Jan 19, 2023

The next thing I will do is use set-head to force a rewind and see if resyncing this portion of the chain in "catch-up" mode fixes the problem.


raulk commented Jan 19, 2023

Note that I haven't activated the transaction hash index, but I don't think that should matter?


raulk commented Jan 19, 2023

> The next thing I will do is use set-head to force a rewind and see if resyncing this portion of the chain in "catch-up" mode fixes the problem.

Indeed, tipset CIDs are only added during catch-up syncing.

Current height: 9699
[FAIL] failed to get tipset @9699 via eth_getBlockByHash: error loading tipset <nil>: cannot find tipset with cid bafy2bzaceaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa: ipld: could not find bafy2bzaceaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
[FAIL] failed to get tipset @9698 via eth_getBlockByHash: error loading tipset <nil>: cannot find tipset with cid bafy2bzacea7rnmb7hdgscih5z6fets6jyrbap5pkqdoyzvfalybowg36lgimi: ipld: could not find bafy2bzacea7rnmb7hdgscih5z6fets6jyrbap5pkqdoyzvfalybowg36lgimi
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9697 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9696 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9695 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9694 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9693 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9692 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9691 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9690 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9689 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9688 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9687 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9686 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9685 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9684 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9683 are identical
[OK] blocks received via eth_getBlockByNumber and eth_getBlockByHash for tipset @9682 are identical


raulk commented Jan 19, 2023

Note that all hosted RPC endpoint providers are seeing the same issues with the transaction hash index activated, so the index is definitely not related.


raulk commented Jan 19, 2023

Possibly a regression caused by #9904.
