Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eth_getProof implementation returns invalid hashes #28441

Closed
Wollac opened this issue Oct 31, 2023 · 7 comments
Closed

eth_getProof implementation returns invalid hashes #28441

Wollac opened this issue Oct 31, 2023 · 7 comments
Labels

Comments

@Wollac
Copy link

Wollac commented Oct 31, 2023

Description

When eth_getProof is called for an account, that does not exist, it returns null hashes for the code and the storage hash. This is inconsistent with the behavior of existing simple accounts and was apparently introduced with #28357. Previously, codeHash consistently returned the hash of the empty string 0xc5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470 and storageHash returned the empty MPT root hash 0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421.

Steps to reproduce the behaviour

$ curl localhost:8545 \ 
      -X POST \
      -H "Content-Type: application/json" \
      -d '{ "method": "eth_getProof", "params": ["0x0010000000000000000000000000000000000000", [], "latest"], "id": 1, "jsonrpc": "2.0" }'

Actual behaviour

The previous command returns

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "address": "0x0010000000000000000000000000000000000000",
    "accountProof": [
      ...
    ],
    "balance": "0x0",
    "codeHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
    "nonce": "0x0",
    "storageHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
    "storageProof": []
  }
}

Expected behaviour

The previous command should return the correct hash for "empty" data:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "address": "0x0010000000000000000000000000000000000000",
    "accountProof": [
      ...
    ],
    "balance": "0x0",
    "codeHash": "0xc5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470",
    "nonce": "0x0",
    "storageHash": "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
    "storageProof": []
  }
}
@rjl493456442
Copy link
Member

rjl493456442 commented Nov 7, 2023

This is inconsistent with the behavior of existing simple accounts

Instead, I think it's pretty good to distinguish "empty account" and "non-existent account".

  • If the account is existent but with empty code and empty storage, then 0xc5d24...d85a470 and 0x56e8...63b421 are returned
  • If the account is not existent, the code hash and storage root should be null with a "account non-exsitent" proof

It makes sense to me.

@holiman
Copy link
Contributor

holiman commented Nov 7, 2023

I think it works as expected

@Wollac
Copy link
Author

Wollac commented Nov 8, 2023

@rjl493456442 I don't think I understand your assessment for the following reasons:

  • This is otherwise a breaking change, that was introduced by @holiman fix internal/ethapi: fix codehash lookup in eth_getProof #28357 in v1.13.4. Previously, 0xc5d24...d85a470 and 0x56e8...63b421 were returned exactly as described in my issue. Therefore, this new behavior breaks applications that have relied on such results for the last 5 years.
  • More importantly, I believe that the new behavior is inconsistent with EIP-1186 describing this API:
    • codeHash: DATA, 32 Bytes - hash of the code of the account. For a simple Account without code it will return "0xc5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"

      The EIP does not mention the new special case of empty accounts having the null hash. Although it, admittedly, doesn't explicitly say the opposite either, it still mentions "hash of the code" and aims to be consistent with eth_getCode. In this case eth_getCode returns 0x. So my consistent interpretation of the EIP would be to return keccak256(0x) and not some special value.

    • And perhaps most importantly, storageHash was explicitly introduced in this API by the EIP to allow for Merkle (non-)inclusion proofs, stating "All storage will deliver a MerkleProof starting with this rootHash". When this API is used as intended to prove the non-inclusion of a storage key, e.g. 0x0, for a non-existent account, then this statement is technically no longer correct: The API returns an empty "proof trie" in storageProof that has a root hash of 0x56e8...63b421. This no longer matches the supplied storageHash of 0x0000...0000, and such a discrepancy usually indicates an invalid proof.

Of course, it would be easy for an application using this API to introduce special handling to be compatible with the previous behavior, and it might even be useful in some cases to maintain the distinction between "empty account" and "non-existent account". However, this seems to me to be a geth-specific response that is not covered by the EIP, and thus could cause compatibility issues with applications or other node/protocol implementations.

@holiman
Copy link
Contributor

holiman commented Nov 8, 2023

You are reasoning as if there's nothing in particular different between 'empty account' and 'non-existent account'. IMO there's a world of difference,

  • the latter does not exist in the trie. Serving an emptyCodeHash would be a straight-up lie, since there is no such emptyCodeHash at the corresponding place in the account trie. Therefore, the null-hash is the only proper answer.
  • As for what we do / should serve in the case of exclusion proofs, I'll have to look into it. I have a slight recollection, from when doing internal/ethapi: fix codehash lookup in eth_getProof #28357, that the exclusion proof logic was a bit strange.

@jsvisa
Copy link
Contributor

jsvisa commented Nov 9, 2023

FYI, test the non-existent account on the cloud provider or other clients, responses are as below:

Provider codeHash storageHash
Alchemy 0x0000000000000000000000000000000000000000000000000000000000000000 0x0000000000000000000000000000000000000000000000000000000000000000
Infura 0xc5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470 0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421
ChainBase 0x0000000000000000000000000000000000000000000000000000000000000000 0x0000000000000000000000000000000000000000000000000000000000000000
erigon 0x0000000000000000000000000000000000000000000000000000000000000000 0x0000000000000000000000000000000000000000000000000000000000000000
reth RPC NOT SUPPORTED YET

@rjl493456442
Copy link
Member

@Wollac

This is otherwise a breaking change, that was introduced by @holiman fix internal/ethapi: fix codehash lookup in eth_getProof #28357 in v1.13.4. Previously, 0xc5d24...d85a470 and 0x56e8...63b421 were returned exactly as described in my issue. Therefore, this new behavior breaks applications that have relied on such results for the last 5 years.

Yeah, we apologize for the breaking change, but I believe the new behavior is more correct.

So my consistent interpretation of the EIP would be to return keccak256(0x0) and not some special value.

I think we should improve the EIP by clarifing this corner case. But I don't think it makes sense to return a value(keccak256(0x0)) for non-existent account(null is expected).

for a non-existent account, then this statement is technically no longer correct: The API returns an empty "proof trie" in storageProof that has a root hash of 0x56e8...63b421. This no longer matches the supplied storageHash of 0x0000...0000, and such a discrepancy usually indicates an invalid proof.

For this storage trie(two layer structure), I think the storage slot verification involves two steps: (1) verify the account is existent and storage trie is not empty (2) verify the existence or inexistence of storage slot.

In the case you mentioned, we can clearly prove this account is not existent, then obviously the step (2) is no longer required.

However, this seems to me to be a geth-specific response that is not covered by the EIP, and thus could cause compatibility issues with applications or other node/protocol implementations.

It's a fair good point. In my opinion, we should try to clarify EIP and align other client implementations with this behavioral definition.

@Wollac
Copy link
Author

Wollac commented Nov 9, 2023

My main point was the discrepancy between the new behavior and the EIP, which IMHO clearly did not make a distinction between empty and non-existent. And that this could cause implementations and applications (like ours) that followed that EIP to now diverge. So, as long as the EIP is updated or the new expected behavior is clearly documented somewhere, I'd consider this issue resolved.

That being said, I still believe that from a mathematical/formal point of view the codeHash returned for any address A should always match keccak256(eth_getCode(A)). Since the "hash of the code" is a formula and not a DB value that defaults to a (mathematically arbitrary) value, this seems to be the most consistent behavior for a general EIP. (And analogously for storageProof.) In this respect, and since you are obviously right that calling eth_getProof for a non-existent account is a corner case, another consistent alternative would be to then not include any balance, codeHash, nonce, storageHash, storageProof, since the entire account is not well defined in this case, and these values are irrelevant for the corresponding non-inclusion proof. (Unfortunately, I don't think such a response would be feasible or compatible with the API design, though.)

Of course, these are just my subjective observations and remarks, and I see that you are coming more from a concrete go-/geth-developer point of view. However, I hope that I could maybe explain my slightly different perspective on what I think the perfect API could be 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants