Getting "failed to get trusted headers" error #373
Comments
If the line you linked is causing the issue, then it is probably because of #8327. Looking at the error, it seems to me there might be a connection issue (but the returned errors might just be deceiving). What parameters do you have set for historical entries in genesis? Those determine the length of time the relayer will support interacting with past events. The higher the number of historical entries, the longer you can interact with past events. |
@colin-axner it is set to 10000 both in musselnet genesis and stargate-final-4 |
@orkunkl could you try checking out the branch |
Tried it with your branch.
This is the error I am getting now which is different than before. I guess your branch fixed
So I have not run relayer from the beginning of the chain. What can I do to make it run now? |
What is the current height of
Go into
This will create new clients, which will rely on recent blocks. My guess is the clients currently created are older than the historical info in state, which results in this error. |
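Editor's sketch of the guess above: a client is "too old" when its height falls before the oldest height that still has historical info in state. This assumes one entry is stored per block and the oldest entries are dropped first. The function names and the example heights are hypothetical.

```go
package main

import "fmt"

// oldestRetainedHeight returns the lowest height that would still have
// historical info in state, assuming one entry per block and that the
// oldest entries are pruned first.
func oldestRetainedHeight(latestHeight, historicalEntries int64) int64 {
	oldest := latestHeight - historicalEntries + 1
	if oldest < 1 {
		return 1
	}
	return oldest
}

// clientTooOld reports whether a client's height is older than the
// retained window of historical info.
func clientTooOld(clientHeight, latestHeight, historicalEntries int64) bool {
	return clientHeight < oldestRetainedHeight(latestHeight, historicalEntries)
}

func main() {
	// Illustrative numbers only: a chain at height 50000 with 10000
	// historical entries and a client created at height 1200.
	fmt.Println(clientTooOld(1200, 50000, 10000))
}
```

If the check returns true for an existing client, creating a new client against a recent block, as suggested above, is the likely way forward.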
This command got stuck there for a while. What could be the reason?
|
Did it work? I can only see that the clients were created. My guess is relayer had a hard time querying the chain. I believe it queries the chains for the |
@colin-axner I also confront this |
I solved the issue by changing our full node pruning strategy to |
we should update the error message to indicate that the headers have likely been pruned |
Updated wasmd to
global:
  timeout: 10s
  light-cache-size: 20
chains:
- key: testkey
  chain-id: musselnet-2
  rpc-addr: https://rpc-ibc.musselnet.cosmwasm.com:443
  account-prefix: wasm
  gas-adjustment: 1.1
  gas-prices: 0.01umayo
  trusting-period: 336h
- key: testkey
  chain-id: stargate-final
  rpc-addr: http://188.34.177.78:26657
  account-prefix: cosmos
  gas-adjustment: 1.3
  gas-prices: 0.01umuon
  trusting-period: 336h
paths:
  testpath:
    src:
      chain-id: musselnet-2
      client-id:
      port-id: transfer
      order: ORDERED
      version: ics20-1
    dst:
      chain-id: stargate-final
      client-id:
      port-id: transfer
      order: ORDERED
      version: ics20-1
    strategy:
      type: naive
I am getting the same error when linking:
|
This is very odd, I just looked at the sequence of calls and the code seems to be fine. I have two theories
I think 2) is causing the issue, since we see on
Short term fix: try
Long term fix: if the above works, we should adjust |
@colin-axner I tried to update light clients but got the same error again. |
@orkunkl thanks for trying. Did the heights in the error message change? I will try to reproduce your error this week and then continue debugging. As a side note, you'll need to update your channel order to be |
Same error message again |
I had the same error message and solved it by changing
Things I've tried
$ rly l update stargate-final
$ rly l update bifrost-2
$ rly tx link <path> -d -o 3s
|
@kogisin thanks for the report!! This was very useful. It made me realize I forgot to check if a client is expired or frozen when reusing existing clients.
This is a Tendermint light client security requirement. The trusting period must always be less than the unbonding period.
Here is what is happening:
Fixes:
|
@akhilkumarpilli if you have bandwidth, these would be some good issues to take on.
Edit: I'll handle the first fix of checking for expiration. |
Sure @colin-axner, will look into this. |
@orkunkl @kogisin I believe the main issue described above should be fixed with the latest commit on master. You will need to delete any existing client identifiers in your .relayer/config/config.yaml. Let me know if it still doesn't work |
I have pruning set to nothing on both chains here but if I leave the relayer running I always get this error in the morning. Re-initializing the light clients does not help. |
What version are you using? |
I've tried various versions, right now I'm on the latest master commit f201839 |
It's failing to pull the TrustedLightBlock from the db in GetLightSignedHeaderAtHeight(). Not sure why yet. The block header exists on-chain; I can query it with gaiad query block 19407 |
I[2021-02-25|20:56:26.848] - [gaiad-microtick-testnet]@{32776} - try(1/5) query packet commitment: portID (transfer), channelID (channel-1), sequence (14): packet commitment not found |
Based on the error logs, it looks like the proof constructed for the receive message is at height 32778, but the on-chain client is at height 19407. It cannot verify a proof from the future. It seems that either an UpdateMsg is not being sent directly before the receive message, or the wrong UpdateMsg is being sent. Message index 1 tells me the UpdateMsg is there, so I guess the UpdateMsg is using the wrong height? Maybe the chain heights got swapped? That wouldn't make sense, though, because the update path updates the off-chain light client before constructing the UpdateMsg. It could be related to #425 (cc @akhilkumarpilli). I will look deeper into this next week. What commands are you running? Does height 19407 look like the height of the chain which is receiving the transfer? |
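Editor's sketch of the condition described above, using the two heights from the error logs. The function name `needsUpdate` is hypothetical. It captures only the "proof from the future" case: a real client is stricter and, as far as I understand, needs a consensus state stored for the proof height itself.

```go
package main

import "fmt"

// needsUpdate reports whether the on-chain client is behind the height at
// which a proof was constructed, in which case the client must be updated
// before the proof can be verified.
func needsUpdate(clientHeight, proofHeight int64) bool {
	return clientHeight < proofHeight
}

func main() {
	// Heights taken from the error logs discussed above.
	fmt.Println(needsUpdate(19407, 32778))
}
```

With these heights the check is true, which matches the reading that the update sent before the receive message did not bring the client up to the proof height.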
I will attempt to create a local setup that can reproduce this, and better document the steps. I want to leave the chains in question running in case we need to debug on them. I'll report back later today (morning here and just getting some coffee) |
#437 should fix this issue |
We had another failure using this release. The relayer had been running for 6-7 hours when the failure occurred. Stopping it at that point and restarting resulted in the "Light block not found" error.
|
Release v0.8.2 appears to have fixed this for us. Haven't noticed any issues and have been hitting it pretty hard for a couple hours. Nice work! |
Amazing! Happy this has been resolved |
relayer version: 1.0.0-rc
I am trying to connect the musselnet-2 and stargate-final networks via IBC. This is the error I am getting at the moment:
When I traced the bug, I found that this line is causing the error: https://github.com/cosmos/relayer/blob/v1.0.0-rc1/relayer/headers.go#L122
It looks like it might be related to this issue: cosmos/cosmos-sdk#8341