tmkms 0.6.3 hang #352
Comments
Related to informalsystems/tendermint-rs#2 perhaps?
@mdyring that's the most likely explanation. After some discussion on
That would be greatly appreciated. :-) While I love kms, I am worried this missing piece makes it fragile.
I experienced the same situation today.
Note that Rust's Upstream in https://github.com/interchainio/tendermint-rs/ we've decided to move the Secret Connection implementation used by KMS back into this repository, which should make it much easier to start playing with those features.
First steps towards a proper async timeout implementation on #365
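For readers unfamiliar with why a missing timeout can make a process appear hung: a blocking read on a TCP socket waits indefinitely if the peer goes silent without closing the connection. The sketch below illustrates the general idea with a plain blocking read timeout from Rust's standard library. It is not the tmkms code and not the async approach referenced in #365; the function names `read_with_timeout` and `silent_peer_times_out` are hypothetical and exist only for this illustration.

```rust
use std::io::{self, Read};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

/// Hypothetical helper: read from `stream`, giving up after `timeout`
/// instead of blocking forever. Returns Ok(None) if no data arrived in time.
fn read_with_timeout(
    stream: &mut TcpStream,
    buf: &mut [u8],
    timeout: Duration,
) -> io::Result<Option<usize>> {
    // Note: set_read_timeout returns an error for a zero duration.
    stream.set_read_timeout(Some(timeout))?;
    match stream.read(buf) {
        Ok(n) => Ok(Some(n)),
        // Unix platforms report WouldBlock on timeout, Windows reports TimedOut.
        Err(e)
            if matches!(
                e.kind(),
                io::ErrorKind::WouldBlock | io::ErrorKind::TimedOut
            ) =>
        {
            Ok(None)
        }
        Err(e) => Err(e),
    }
}

/// Simulate a peer that accepts the connection and then never sends anything.
/// Returns Ok(true) if the read gave up because of the timeout.
fn silent_peer_times_out() -> io::Result<bool> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut client = TcpStream::connect(listener.local_addr()?)?;
    // Keep the accepted socket alive, but never write to it.
    let (_peer, _addr) = listener.accept()?;
    let mut buf = [0u8; 16];
    let result = read_with_timeout(&mut client, &mut buf, Duration::from_millis(100))?;
    Ok(result.is_none())
}
```

Without the `set_read_timeout` call, the `read` in this simulation would block until the peer wrote data or closed the socket. With it, the caller regains control and can decide to reconnect.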
Looking forward to that async implementation, as we've just experienced this issue again today. https://twitter.com/validator_net/status/1192247910035083264?s=20 What would be the best way to confirm this is the root cause? I am not seeing any networking-related events in our monitoring to explain why a TCP connection between the KMS and validator would suddenly die; if anything, a retransmit should fix it. Would a core dump be useful to get some stack traces of the hung state next time? If so, let me know what would be the best way to accomplish this in Rust.
@mdyring you can try the latest master and see if it helps. Separately I've been meaning to cut a prerelease of what's on master before we start async work as there are a number of unrelated changes that it'd be nice to be sure did not cause regressions before we start async work. (BTW: stable async/await support in Rust shipped a few days ago, so we're ready to go on that front)
I believe this is a dup of #310. Please reopen if you still experience these problems with tmkms v0.7.0.
On 0.6.3 we just experienced a "hung" tmkms process.
It was solved by a systemd restart tmkms, which tmkms responded to immediately. Both irishub and cosmoshub-2 validators were affected at the same time.
https://twitter.com/validator_net/status/1173769574661201921?s=20
Any ideas appreciated.
Log from tmkms side:
Validator side: