-
Notifications
You must be signed in to change notification settings - Fork 686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix consensus bug from deleting account in same block that creates it #1913
Conversation
f4cba85
to
86be04f
Compare
Well, making progress: Block 116526 is now importing! ... but it only got to 116530 before failing again 🤦♂️ I'll wait to figure that out before posting this for review. It's pretty concerning that we're getting all these consensus bugs while still passing the ethereum/tests! |
tests/core/vm/test_vm_state.py
Outdated
state.lock_changes() | ||
assert state.get_storage(ADDRESS, 0) == 0 | ||
|
||
# delete account in next "transaction" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: probably meant to say "revive" instead of "delete".
Block |
It's Although 86be04f does help I've made sure that my Trinity installation indeed has the changes from |
I think we may be near a point where we need to pay down tech debt in this area. It seems we've ended up with lots of layers here and I at least am having a lot of difficulty reasoning about how we manage all of the nuances related to updating account and contract state. |
No, you're right. I had imported it using the old code and skipped over it during sync. |
86be04f
to
36b0e5a
Compare
I'm also not happy with how complex the code is, and it being a pain to debug. Some key things that seem to be interacting in a complex way are:
With all those interacting factors above, I have no immediate idea how to make it better. One option is to push the complexity into |
But some good news! The latest push actually does import block 116525, and is at 153492 and continuing... |
Wow, a shocking number of consensus bugs that are not covered by ethereum/tests. Found the next failed import at block #336508. Pretty interesting, only the receipt root is broken:
|
Verified - indeed! JIC, I've checked with Reverting to known "good" Trinity and Re-setting to block In other words, the regression lies somewhere in the same |
I'm checking the intermediate
FTR,
As to what's going on: the most likely reason we're not getting a |
Looks like it's even simpler than that: |
Looks like it had to do with reverting the storage root back to the root at the beginning of the block, and not treating that as a change. By always treating the storage as changed, that fixes pre-Byzantium (but seems to break something in Istanbul's |
If the account is not created, and the new storage root is for empty storage, then we shouldn't create the account just to put a blank storage root in it. |
Up to block The full sync has tested some blocks from:
|
So maybe come back for a refactor later, after we get sync reliable and trinity to mvp. Thoughts @pipermerriam ? |
@cburgdorf Seems worth double-checking that this fix also fixes the goerli test you ran. How did you set that up again and/or are you up for re-running it against this branch? |
I'm on vacation without laptop but just running `trinity --goerli --sync-mode full` should be all you need. I don't know the exact block that
broke but if you get past 300k blocks it definitely fixed that bug. I'm
sure @veox can also assist.
…On Sat, Feb 1, 2020, 08:22 Jason Carver ***@***.***> wrote:
@cburgdorf <https://github.com/cburgdorf> Seems worth double-checking
that this fix also fixes the goerli test you ran. How did you set that up
again and/or are you up for re-running it against this branch?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1913?email_source=notifications&email_token=AAD7HFNXGJUW4OKDYVLD5GDRATFGVA5CNFSM4KNKPGKKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKQPQTQ#issuecomment-580974670>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAD7HFLTP7664SHRJAKUAMTRATFGVANCNFSM4KNKPGKA>
.
|
Yup! I'll check this on main-net (from scratch), and then try to sync Goerli - that's where I started, after all. |
Confirming that the issue is no longer present when using Main-net blocks So has Goerli block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
eth/db/storage.py
Outdated
@@ -120,7 +141,7 @@ def __delitem__(self, key: bytes) -> None: | |||
|
|||
@property | |||
def has_changed_root(self) -> bool: | |||
return self._write_trie and self._write_trie.root_hash != self._starting_root_hash | |||
return self._write_trie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming I'm correct that this is checking for the None
value, I'd advocate for this to be return self._write_trie is not None
.
eth/db/storage.py
Outdated
@@ -129,9 +150,9 @@ def get_changed_root(self) -> Hash32: | |||
raise ValidationError("Asked for changed root when no writes have been made") | |||
|
|||
def _clear_changed_root(self) -> None: | |||
self._starting_root_hash = self._write_trie.root_hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you get back in here could you maybe drop a comment or two in this method explaining in some detail why we do each of these things. Since it's not clear how to address some of the debt in the code, maybe we can address some of with with better comments.
eth/db/storage.py
Outdated
self._trie_nodes_batch: BatchDB = None | ||
|
||
# When deleting an account, push the pending write info onto this stack. | ||
# May happen multiple times for multiple deletes. | ||
self._historical_write_tries: List[Tuple[HexaryTrie, BatchDB, Hash32]] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make a NamedTuple
for the inner value type here. Probably makes some of the other code a bit easier to read/grok.
# Track how many times we have cleared the storage. This is journaled | ||
# so that we can revert to an old clear count. The clear count is used | ||
# as an index to find the base trie from before the revert. | ||
self._clear_count = JournalDB(MemoryDB({CLEAR_COUNT_KEY_NAME: to_bytes(0)})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the JournalDB
here feels like it may be the wrong choice. Can this just be modeled using a list (stack) or somethng?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the comment a bit to try to explain: it has to move in lockstep with the main storage journal, so we can detect when a revert crosses back over an account delete.
The bug was: when a previous transaction creates/modifies an account, and then is "locked", those pending changes are written even if the account is deleted is a subsequent transaction. The fix was: clear out all locked storage changes when you lock in an account deletion. This requires carefully managing reverts that cross over to the pre-delete state. So we track pending writes in a stack that can be reverted to at the appropriate time. --- It's important not to accidentally create empty accounts, so don't set empty storage root in a missing account. --- Start reporting a storage root as updated, even if the storage root is identical. Why? In pre-Byzantium, the storage root is reported at each transaction. If one transaction changes the storage and then another changes it back, then we need to treat the storage in the second transaction as changed, even though it's identical to the start root, which is always kept as the root at the beginning of the block.
30ee138
to
14a0785
Compare
What was wrong?
Fix #1912
How was it fixed?
The bug was: when a previous transaction creates/modifies an account,
and then is "locked", those pending changes are written at the end of the block, even if the
account is deleted is a subsequent transaction.
The fix was: clear out all "locked" storage changes (those made during previous transactions) when you lock in an account deletion.
That fix brought up a new interesting case, where the "dirty" flag wasn't noticing when an account started the block as empty, then some storage was added in one transaction, then in a later transaction it was deleted again. The further solution was to track a history of of storage tries that can be reverted to (a new one added on every account delete), and generously treat the storage as dirty if there were ever any writes in the block, even if the trie currently has the same state root as the root was at the beginning of the block.
To-Do
Cute Animal Picture