-
Notifications
You must be signed in to change notification settings - Fork 20.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core, light: get rid of the dual mutexes, hard to reason with #18436
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Haven't reviewed the code yet, but a fast-sync was completed with this PR without problems on |
Full sync at |
Full sync at |
Full sync at |
How can I repair it? |
Even with the update, I get the same error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious, will the following scenario cause a dead lock? though as the comment says, it only happens if the database is corrupted or empty
bc.SetHead -> bc.loadLastState -> bc.Reset -> bc.ResetWithGenesisBlock -> bc.SetHead
Possibly fixes #18421 and #18388.
The block chain and light chain currently have two mutexes that protect its internal state:
mu
andchainmu
. Themu
supposed to protect internal state from getting corrupted whilstchaimu
from multiple modifications to the chain. However, modifying the internal state is equivalent to modifying the chain.Most of our code ended up only locking
mu
(e.g. Rollback, SetHead), which although weren't chain insertions, they could really screw up the inserter. The reason this didn't blow up earlier is because most of these method calls are serialized by the downloader, hiding the ticking time bomb. Apparently a side chain import PR managed to destabilize the event orders and somehow triggered a chain modification without lockingmu
.