Join mainnet fail with panic "Wrong Block.Header" #4254

Closed
4 tasks
garyyu opened this issue May 2, 2019 · 21 comments

Comments

@garyyu

garyyu commented May 2, 2019

Summary of Bug

I followed the Join Mainnet doc, but the node always fails with the panic Wrong Block.Header.AppHash:

I[2019-05-02|13:22:35.897] Executed block                               module=state height=1 validTxs=0 invalidTxs=0
I[2019-05-02|13:22:40.438] Committed state                              module=state height=1 txs=0 appHash=442D1FDAF0435EC2BF9884D3A0D222FFCF1EF1C4B8FBA1B9775E1A6501247E78
panic: Failed to process committed block (2:6874A607AABF97B8D0627ADDF4B7501CCE74ECAB82E44A6C4F39A4562D75601B): Wrong Block.Header.AppHash.  Expected 442D1FDAF0435EC2BF9884D3A0D222FFCF1EF1C4B8FBA1B9775E1A6501247E78, got 056A9CA652FC5DD667A19362081216A57A70F87A256FD62B3131673BCDBD969B

Version

$ gaiad version --long
cosmos-sdk: 0.34.3-80-gf4a96fd6
git commit: f4a96fd6b65ff24d0ccfe55536a2c3d6abe3d3fa
go.sum hash: 
build tags: netgo ledger
go version go1.12.4 linux/amd64

Steps to Reproduce

Genesis:

$ curl https://raw.githubusercontent.com/cosmos/launch/master/genesis.json > $HOME/.gaiad/config/genesis.json

~/.gaiad$ shasum -a 256 config/genesis.json

1e349fb39b85f7707ee78d39879f9d5d61f4d30f67980bb0bf07bd35b2f8bf30  config/genesis.json
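
For anyone repeating this check, the comparison can be scripted so a mismatch is impossible to miss. This is only a minimal sketch: verify_sha256 is a hypothetical helper name (not part of gaiad), and the expected digest has to come from a source you trust, such as the network's launch repository.

```shell
# Sketch: compare a file's SHA-256 digest against an expected value.
# verify_sha256 is a hypothetical helper name, not a gaiad command.
verify_sha256() {
  file="$1"
  expected="$2"
  # Use sha256sum where available (most Linux systems), else shasum (macOS).
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file matches"
  else
    echo "MISMATCH: $file has $actual, expected $expected"
    return 1
  fi
}
```

With the digest reported above, the call would be: verify_sha256 "$HOME/.gaiad/config/genesis.json" 1e349fb39b85f7707ee78d39879f9d5d61f4d30f67980bb0bf07bd35b2f8bf30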

Seed config:

# Comma separated list of seed nodes to connect to
seeds = "ba3bacc714817218562f743178228f23678b2873@public-seed-node.cosmoshub.certus.one:26656"

It's one of the seeds provided here


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@alexanderbez
Contributor

alexanderbez commented May 2, 2019

Thanks for opening up an issue @garyyu. Based off of the steps you've listed, everything looks to be in order (correct genesis, genesis hash, file locations, and gaiad version). Perhaps the seed nodes you're connecting to are still running an old version/network?

Can you try using the following seeds:

# Comma separated list of seed nodes to connect to
seeds = "[email protected]:26656,[email protected]:26656,[email protected]:26656,[email protected]:26656"

@garyyu
Author

garyyu commented May 3, 2019

Thanks @alexanderbez. I replaced my seeds with yours, but the number of peers is still 0 even after waiting a few minutes:

$ curl -s localhost:26657/net_info?

{
  "jsonrpc": "2.0",
  "id": "",
  "result": {
    "listening": true,
    "listeners": [
      "Listener(@)"
    ],
    "n_peers": "0",
    "peers": []
  }
}
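
To keep an eye on the peer count without reading the whole response each time, the n_peers field can be pulled out of the JSON. A minimal sketch, assuming the response has the shape shown above; n_peers is a hypothetical helper name and it reads the JSON on stdin:

```shell
# Sketch: print the value of the "n_peers" field from a net_info response.
# n_peers is a hypothetical helper name; it accepts the number with or
# without surrounding quotes and prints only the first match.
n_peers() {
  sed -n 's/.*"n_peers"[[:space:]]*:[[:space:]]*"\{0,1\}\([0-9][0-9]*\)"\{0,1\}.*/\1/p' | head -n 1
}
```

It would be used as: curl -s localhost:26657/net_info | n_peers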

I can confirm all these seeds are reachable:

$ nc -v -t -z -w 3  3.87.179.235 26656
Connection to 3.87.179.235 26656 port [tcp/*] succeeded!

$ nc -v -t -z -w 3  173.212.199.36 26656
Connection to 173.212.199.36 26656 port [tcp/*] succeeded!

$ nc -v -t -z -w 3  91.205.173.168 26656
Connection to 91.205.173.168 26656 port [tcp/*] succeeded!

$ nc -v -t -z -w 3  34.65.6.52 26656
Connection to 34.65.6.52 26656 port [tcp/*] succeeded!
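
The same reachability check can be driven from the seeds value itself rather than typing each address by hand. A minimal sketch, assuming entries have the id@host:port form used in config.toml; seed_endpoints is a hypothetical helper name:

```shell
# Sketch: split a comma-separated seeds value ("id@host:port,id@host:port")
# into one "host port" line per entry.
# seed_endpoints is a hypothetical helper name.
seed_endpoints() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
    [ -n "$entry" ] || continue
    addr=${entry#*@}      # drop the node ID and the "@"
    host=${addr%:*}       # text before the last ":"
    port=${addr##*:}      # text after the last ":"
    printf '%s %s\n' "$host" "$port"
  done
}
```

Each line can then be fed to the nc command above, for example: seed_endpoints "$seeds" | while read -r host port; do nc -v -t -z -w 3 "$host" "$port"; done. Note that a successful TCP connect only shows that the port is open; it says nothing about which network or version the remote node is running.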

The error logs:

E[2019-05-02|23:52:08.963] Connection failed @ recvRoutine (reading byte) module=p2p peer=2626942148fd39830cb7a3acccb235fab0332d86@173.212.199.36:26656 conn=MConn{173.212.199.36:26656} err=EOF
E[2019-05-02|23:52:08.963] Stopping peer for error                      module=p2p peer="Peer{MConn{173.212.199.36:26656} 2626942148fd39830cb7a3acccb235fab0332d86 out}" err=EOF
E[2019-05-02|23:52:08.964] MConnection flush failed                     module=p2p peer=2626942148fd39830cb7a3acccb235fab0332d86@173.212.199.36:26656 err="write tcp 45.118.135.254:52148->173.212.199.36:26656: use of closed network connection"
E[2019-05-02|23:52:39.680] Connection failed @ recvRoutine (reading byte) module=p2p peer=3028c6ee9be21f0d34be3e97a59b093e15ec0658@91.205.173.168:26656 conn=MConn{91.205.173.168:26656} err=EOF
E[2019-05-02|23:52:39.681] Stopping peer for error                      module=p2p peer="Peer{MConn{91.205.173.168:26656} 3028c6ee9be21f0d34be3e97a59b093e15ec0658 out}" err=EOF
E[2019-05-02|23:53:09.681] Connection failed @ recvRoutine (reading byte) module=p2p peer=3028c6ee9be21f0d34be3e97a59b093e15ec0658@91.205.173.168:26656 conn=MConn{91.205.173.168:26656} err=EOF
E[2019-05-02|23:53:09.681] Stopping peer for error                      module=p2p peer="Peer{MConn{91.205.173.168:26656} 3028c6ee9be21f0d34be3e97a59b093e15ec0658 out}" err=EOF
...

And here is my node status:

$ curl -s localhost:26657/status

{
  "jsonrpc": "2.0",
  "id": "",
  "result": {
    "node_info": {
      "protocol_version": {
        "p2p": "7",
        "block": "10",
        "app": "0"
      },
      "id": "9fc5a29db77616893a44a9a6da368cadee9dfcfd",
      "listen_addr": "tcp://0.0.0.0:26656",
      "network": "cosmoshub-2",
      "version": "0.31.5",
      "channels": "4020212223303800",
      "moniker": "gotts",
      "other": {
        "tx_index": "on",
        "rpc_address": "tcp://0.0.0.0:26657"
      }
    },
    "sync_info": {
      "latest_block_hash": "",
      "latest_app_hash": "",
      "latest_block_height": "0",
      "latest_block_time": "1970-01-01T00:00:00Z",
      "catching_up": true
    },
    "validator_info": {
      "address": "070329CF008596A6994136DB2EF92E4218A69629",
      "pub_key": {
        "type": "tendermint/PubKeyEd25519",
        "value": "03UQhnyQrN7xA88spx7AQu94CYFKw6VHdL1L3MvG9lY="
      },
      "voting_power": "0"
    }
  }
}

Any further suggestions?

@alexanderbez
Contributor

alexanderbez commented May 3, 2019

Hmmm, I'm not too sure. Most of the time those errors are benign and your node will eventually gossip with the correct peers. Any luck?

@garyyu
Author

garyyu commented May 3, 2019

Ah, I just saw it fail again (after half an hour) with the same error as before: Wrong Block.Header.AppHash.

...
E[2019-05-03|00:30:41.408] Dialing failed                               module=pex [email protected]:26656 err="dial tcp 54.93.238.50:26656: i/o timeout" attempts=0
I[2019-05-03|00:30:43.349] Executed block                               module=state height=1 validTxs=0 invalidTxs=0
I[2019-05-03|00:30:47.223] Committed state                              module=state height=1 txs=0 appHash=442D1FDAF0435EC2BF9884D3A0D222FFCF1EF1C4B8FBA1B9775E1A6501247E78
panic: Failed to process committed block (2:6874A607AABF97B8D0627ADDF4B7501CCE74ECAB82E44A6C4F39A4562D75601B): Wrong Block.Header.AppHash.  Expected 442D1FDAF0435EC2BF9884D3A0D222FFCF1EF1C4B8FBA1B9775E1A6501247E78, got 056A9CA652FC5DD667A19362081216A57A70F87A256FD62B3131673BCDBD969B

@garyyu
Author

garyyu commented May 3, 2019

Any further suggestions please?
I can't start a full node with the documented set-up procedure.

@alexanderbez
Contributor

@garyyu we've had other instances of similar issues. It's almost certainly to do with your seeds/peers, since you are on the correct version. I'll try to find some good seeds or persistent peers for you.

@1ultimat3

I have the same issue. Currently the only way to get a mainnet full node running is to check out tag v0.34.3.

@garyyu
Author

garyyu commented May 7, 2019

Thanks @mateuszk87, indeed! When I check out tag v0.34.3 and run init again, it works 👍

So, is this a bug on the latest master branch?

@alexanderbez
Contributor

alexanderbez commented May 7, 2019

Hmmm, @mateuszk87 @garyyu, there must be a misunderstanding. Perhaps the docs are wrong, but I think we recently fixed and redeployed them. You should never run any code off of master. The master branch is not stable; it is our canonical development branch. All our releases are tagged and you should always use those (e.g. v0.34.3).

Sorry for any confusion. If our docs still say master, please let us know.
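
A quick way to tell whether a binary was built from a tag: the version in the original report, 0.34.3-80-gf4a96fd6, has the shape of git describe output (tag, number of commits since the tag, abbreviated commit hash), whereas a build from the tag itself reports a plain 0.34.3. A minimal sketch that checks for that suffix; is_tagged_release is a hypothetical helper name, and it assumes the version string follows the git describe convention:

```shell
# Sketch: succeed if a version string looks like a plain tag, fail if it
# carries a git-describe suffix such as "-80-gf4a96fd6" (commits after a tag).
# is_tagged_release is a hypothetical helper name.
is_tagged_release() {
  if printf '%s\n' "$1" | grep -Eq -- '-[0-9]+-g[0-9a-f]+$'; then
    return 1
  fi
  return 0
}
```

For example: is_tagged_release "$(gaiad version)" || echo "built from an untagged commit; check out a release tag and rebuild".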

@posa88

posa88 commented May 31, 2019

Same error on master:

gaiad start
I[2019-05-31|03:56:06.636] Starting ABCI with Tendermint                module=main 
ERROR: Error during handshake: Error on replay: Wrong Block.Header.AppHash.  Expected 91929AA0CA75E18855F6709D64812C0317241209309BEA4B088E4A41D2A876F6, got 840B34D2AEC07BF586EB8B59E2E503FE8D786BE18ACF816A62FF5817ECFCCD48

@posa88

posa88 commented May 31, 2019

Thanks @mateuszk87, indeed! When I checkout tag v0.34.3 and call init again, it works 👍

So, is this a bug on latest master branch?

This solution doesn't work for me.

@kestop

kestop commented May 31, 2019

I hit the same problem here, and changing seeds does not help.

@ssssssu12

ssssssu12 commented Jun 4, 2019

I have the same error.
git status:

On branch release/v0.34.7
nothing to commit, working directory clean

gaiad version:

0.34.7

seeds:

I[2019-06-04|15:23:50.898] Starting ABCI with Tendermint                module=main
ERROR: Error during handshake: Error on replay: Wrong Block.Header.AppHash.  Expected 91929AA0CA75E18855F6709D64812C0317241209309BEA4B088E4A41D2A876F6, got 840B34D2AEC07BF586EB8B59E2E503FE8D786BE18ACF816A62FF5817ECFCCD48

@yuyasugano

The same error happened here. Would it be possible to look into these issues? In my case, v0.34.3 was used and block syncing got stuck at 482280.

cosmos-sdk: 0.34.3
git commit: 1127446f71fa6aeada1bce2718f7f903cc18e548
vendor hash:
build tags: netgo ledger
go version go1.12.5 linux/amd64

$ gaiacli version --long
cosmos-sdk: 0.34.3
git commit: 1127446f71fa6aeada1bce2718f7f903cc18e548
vendor hash:
build tags: netgo ledger
go version go1.12.5 linux/amd64

I[2019-06-10|20:29:27.321] Starting IndexerService                      module=txindex impl=IndexerService
I[2019-06-10|20:29:27.322] ABCI Handshake App Info                      module=consensus height=482280 hash=91929AA0CA75E18855F6709D64812C0317241209309BEA4B088E4A41D2A876F6 software-version= protocol-version=0
I[2019-06-10|20:29:27.332] ABCI Replay Blocks                           module=consensus appHeight=482280 storeHeight=482281 stateHeight=482280
I[2019-06-10|20:29:27.333] Replay last block using real app             module=consensus
ERROR: Error during handshake: Error on replay: Wrong Block.Header.AppHash.  Expected 91929AA0CA75E18855F6709D64812C0317241209309BEA4B088E4A41D2A876F6, got 840B34D2AEC07BF586EB8B59E2E503FE8D786BE18ACF816A62FF5817ECFCCD48
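
The Expected/got pair in this log is identical to the pairs posted on May 31 and Jun 4 above, which suggests those nodes all stopped at the same point in the chain. To compare reports like this without eyeballing 64-character hashes, the pair can be extracted from a log. A minimal sketch; apphash_mismatch is a hypothetical helper name and it reads log lines on stdin:

```shell
# Sketch: print the expected and received app hashes from the first
# "Wrong Block.Header.AppHash" line found on stdin.
# apphash_mismatch is a hypothetical helper name.
apphash_mismatch() {
  sed -n 's/.*Wrong Block\.Header\.AppHash\.[[:space:]]*Expected \([0-9A-Fa-f]*\), got \([0-9A-Fa-f]*\).*/expected=\1 got=\2/p' | head -n 1
}
```

It would be used as: gaiad start 2>&1 | apphash_mismatch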

@alexanderbez
Contributor

@yuyasugano you have to use version v0.34.7 in order to successfully sync from genesis.

@wujunchuan

The same problem

Network: gaia-13003

Version

$ gaiacli version --long
cosmos-sdk: 0.34.7
git commit: f783cb71e7fe976bc01273ad652529650142139b
vendor hash: f60176672270c09455c01e9d880079ba36130df4f5cd89df58b6701f50b13aad
build tags: netgo ledger
go version go1.12.7 darwin/amd64

genesis.json

$ shasum -a 256 ~/.gaiad/config/genesis.json  # should equal 48519942c69dbda18fd9dfba949bca5591ad67772bff669fc8649d796b9cf3c3

48519942c69dbda18fd9dfba949bca5591ad67772bff669fc8649d796b9cf3c3  /Users/meetone/.gaiad/config/genesis.json

Output

gaiad start
I[2019-07-11|11:07:46.469] Starting ABCI with Tendermint                module=main
ERROR: Error during handshake: Error on replay: Wrong Block.Header.AppHash.  Expected 90C966D919DB087A33212139AAD5D2C0015254A0A6E7686942AF84CB395B38C2, got 40BF5CC64E6C22115A467D212AE871C658653041D0536C6A1A6B829D49564CAB

@askucher

askucher commented Feb 4, 2020

Why is this issue closed? Is it solved?

@Creamers158

I had the same problem on the new akash testnet and tried everything I know of. After switching to a new server, the same steps worked without a problem.
Any news regarding this issue?

@alexanderbez
Contributor

alexanderbez commented May 26, 2020

It's almost always one of the following reasons:

  1. Your node is using the wrong app version (and/or)
    1a. This can be easy to confirm, but the app root hash mismatch can occur at any block.
  2. Your node is using the wrong genesis file (and/or)
    2a. This is easy to confirm: you'll get an invalid app root hash at block 1.
  3. Your node is connected to peer(s) for which (1) and/or (2) holds true
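
To apply the block-1 check in (2a), it helps to know which block the node rejected. In the panic form of the error the block height appears in parentheses before the block hash, and that block's header carries the app hash produced by the previous height, so a failure on block 2 means the state already diverged after block 1. A minimal sketch that extracts the height, assuming the panic message format shown in the logs in this thread; failed_block_height is a hypothetical helper name and it reads log lines on stdin:

```shell
# Sketch: print the height from the first
# "Failed to process committed block (<height>:<hash>)" line on stdin.
# failed_block_height is a hypothetical helper name.
failed_block_height() {
  sed -n 's/.*Failed to process committed block (\([0-9][0-9]*\):.*/\1/p' | head -n 1
}
```

The "Error during handshake" form of the message does not include a height, so this prints nothing for it; in that case the height has to be read from the preceding ABCI Handshake or ABCI Replay lines.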

@Creamers158

Creamers158 commented May 26, 2020

  1. The app version is confirmed by hash.
  2. The genesis file is also confirmed by hash.
  3. The default peers are used.

Actually, I can reproduce it by swapping the /data/ folder: one works and one doesn't, with the same setup. Does that ring a bell? Strangely, I tried to reset everything and init again, and even completely removed /data and /config and restarted manually without the daemon. Same problem.
Moving back to the default path and using root solved my issue. I copied that /data/ folder to the desired daemon folder and now it runs like a charm. Just my two cents.

@alexanderbez
Contributor

If the genesis state is verified to be valid and the application versions are consistent and correct, then it's an issue in your application's state machine, most likely some non-determinism.
