Geth missing blocks, rewind goes to genesis instead of last good block #26091
Comments
Sorry about that. It's part of Geth's logic: if some blocks are missing in between, we can't find the parent block, so the rewind is interrupted. This is expected behavior. We should investigate why the block bodies went missing in the first place.
thanks @rjl493456442!
I understand this is how it was written, but I don't understand why it needs to be this way if there is a last known "good" block, or if Geth could even search for one itself. If, at the end of the day, the data can no longer be trusted, each block could be verified, which would probably be faster than resyncing. Maybe something for the future.
If I can provide anything here, please let me know. Given that ~1.6TB of data was still on disk after the rewind went back to block 0, does that mean the resync is using some local data and repopulating faster? Or will it all be overwritten as data comes in from peers?
Also possibly related: #22374
Damn, sorry for dropping the ball on this. I hope you managed to fully sync your node in the end. I'll close this issue for now. Please open a new one if the problem persists.
System information
Geth version:
CL client & version: gcr.io/prysmaticlabs/prysm/beacon-chain:v3.1.0
OS & Version:
Commit hash: 468d1844c7a32b51eebce6c5f35c44a66b9acf64
Expected behaviour
Over the past couple of months I have been running/syncing a --syncmode=full Geth node on EKS. Recently I was looking at block data and noticed that I am missing data for blocks 15,012,712 through 15,012,866. I'm not sure how this happened. There have been some unclean shutdowns of Geth in the past, but I don't think these took place around the time those blocks were synced.
Missing blocks:
The log doesn't output anything when I make getBlock calls for missing blocks. It does output some info when trying to get the logs for those blocks:
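A gap like the one described above can be confirmed from outside the node by probing a block range over JSON-RPC. The following is a minimal sketch, not the reporter's actual procedure: it assumes Geth's HTTP-RPC endpoint is reachable at `http://localhost:8545` and that a missing block surfaces as a `null` result from `eth_getBlockByNumber`. The helper names `rpc_get_block` and `find_missing_ranges` are hypothetical.

```python
import json
import urllib.request

RPC_URL = "http://localhost:8545"  # assumption: Geth HTTP-RPC on its default port


def rpc_get_block(number, url=RPC_URL):
    """Call eth_getBlockByNumber; return the block dict, or None if the result is null."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBlockByNumber",
        "params": [hex(number), False],  # False: return transaction hashes only
    }
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("result")


def find_missing_ranges(get_block, first, last):
    """Return inclusive (start, end) ranges in [first, last] where get_block returns None."""
    ranges = []
    start = None  # first block number of the gap currently being tracked
    for n in range(first, last + 1):
        if get_block(n) is None:
            if start is None:
                start = n
        elif start is not None:
            ranges.append((start, n - 1))
            start = None
    if start is not None:  # gap runs through the end of the scanned range
        ranges.append((start, last))
    return ranges
```

For example, `find_missing_ranges(rpc_get_block, 15_012_700, 15_012_900)` would scan the neighbourhood of the reported gap. The lookup function is passed in as an argument so the scan logic can be exercised without a running node.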
In an attempt to get data for those blocks, I looked for possible ways to fill in the gap. That didn't seem possible so I opted to "rewind" back to the point of the first missing block and then resync from there forward. To do this, I did the following:
--maxpeers=0
--nodiscover
At this point, I expected Geth to slowly make its way back to block 15,012,712. I'd then undo the peer/discovery changes and start syncing again.
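For reference, the rewind request itself can be sketched as a JSON-RPC body. This is an illustration under assumptions, not the exact command used in this report: it assumes the `debug` namespace is exposed on the node, and the target block number is taken from the description above purely as an example. The helper name `set_head_request` is hypothetical. The sketch only builds the request and deliberately does not send it, because `debug_setHead` is destructive.

```python
import json


def set_head_request(block_number, request_id=1):
    """Build (but do not send) a JSON-RPC request body for debug_setHead."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "debug_setHead",
        # The block number is passed as a hex-encoded quantity string.
        "params": [hex(block_number)],
    }


if __name__ == "__main__":
    # Illustrative target: the first missing block mentioned in this report.
    print(json.dumps(set_head_request(15_012_712)))
```

Building the body separately from sending it makes it easy to double-check the hex-encoded target before anything irreversible happens.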
Actual behaviour
The setHead command ran for a few hours before this appeared in the log:
go-ethereum/core/blockchain.go, lines 608 to 609 in 2415911
After that, disk space continued to free up slowly and then stopped when the setHead command returned null. In total, disk usage went from roughly 2.1TB to 1.6TB. At that point, I restarted Geth with peers and the resync began from block 0. The sync does appear to be going fast-ish, as it reached block 4 million and change about 12 hours later, but it is showing signs of slowing down.
Steps to reproduce the behaviour
Hard to reproduce. I have EBS snapshots I could offer.
Backtrace
Restarting Geth, executing the setHead command and waiting, and restarting Geth:...