Payments wait forever to be conducted #2779
@Dominik1999 the scenario runner logs are not sufficient to figure out what happened. Could you provide the logs from the individual nodes and their reveal timeout? (The timeout of the request depends on the reveal timeout.) Edit: node
@hackaugusto sorry this took me a while ... here is the log file of node 10, which started a payment that never succeeded. At least that is how I interpret the scenario player log.
@Dominik1999 could you get the logs for node
@hackaugusto
I can run the same scenario several times and the error occurs at different transfers each time. @czepluch is facing the same error.
Yes. I am facing the problem in an even simpler scenario where only one transfer is made, without any mediators. Ulo tried to run a similar scenario to mine and it works for him. Note that the first time I ran the scenario it succeeded, but not since then, even without reusing the same token network. Edit: I did kill the SP while running the scenario the second time, so maybe this resulted in some bad state in one of the nodes. Log from the sending node: https://gist.github.com/czepluch/7e337ea81c0385264d9910f9ebb224db
I just discussed this with Dominik in a call. The log message to look for is logged by the
So I took a quick look at @czepluch's two logs. Who is getting stuck, receiver or sender? In those logs the sender retries many times, but the target logs do not show any received transfers. The target only seems to have an active healthcheck with the sender, early in the game:
And then some logs showing it has acknowledged the channel opening.
Yes, this is definitely weird. Wild guess: maybe a Matrix problem?
@LefterisJP tried to make a transfer to the "receiving node" from another node that has a path while the above problem occurs, and that transfer also just hangs. It's a new scenario, so here are the logs for that one: If I stop the scenario when it hangs and start it again with a new token, it does just fine until it reaches the same transfer again, and then it hangs again. Edit: The scenario in my case is this one: https://gist.github.com/czepluch/7f2e6c92f3892d2a37a29a8864e2de69
More testing yesterday showed that once a transfer between a particular set of nodes has entered this hanging state, it doesn't recover even after a node restart and deploying a new token network. The parameters determining the hanging state seem to be the nodes involved and the Eth chain used. That makes me even more suspicious that the problem may be in the Matrix transport, since those parameters also control room assignment for node-to-node communication.
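To illustrate why those parameters could pin a pair of nodes to the same (possibly broken) room, here is a hypothetical sketch of deterministic room assignment. The alias format and helper function are illustrative assumptions, not Raiden's actual transport code; the point is only that the same peers on the same chain always resolve to the same room, so corrupted room state would survive restarts and new token networks.

```python
# Hypothetical sketch: deterministic room assignment from (peers, chain).
# NOT Raiden's actual implementation; the alias format is made up.

def room_alias_for(peer_a: str, peer_b: str, chain_id: int, server: str) -> str:
    """Derive a Matrix room alias that both peers compute identically."""
    low, high = sorted((peer_a.lower(), peer_b.lower()))
    return f"#raiden_{chain_id}_{low}_{high}:{server}"

# Both nodes arrive at the same alias, regardless of who computes it first,
# so a broken room keeps being reused for this pair on this chain.
print(room_alias_for("0xaaa...", "0xbbb...", 3, "transport01.raiden.network"))
print(room_alias_for("0xbbb...", "0xaaa...", 3, "transport01.raiden.network"))
```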
Thank you @ulope for your insight. Is there any way to debug this by using a dedicated Matrix server (our testing servers), recreating the problem in its simplest form, and watching what happens in Matrix?
@LefterisJP You can already observe what's happening. Simply log in to the Matrix server used by the nodes (or register an account; if you do, enter no email address in the registration form, it's not supported on our setup). In this case it was transport01. In the corresponding room you can see that the initiating node is sending messages but no reply exists from the target. Unfortunately, as mentioned before, I'm not working today, but I will look more into this over the weekend.
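If clicking through the web client gets tedious, below is a minimal sketch of pulling the room traffic via the Matrix client-server API (r0). The server URL, the debug account credentials, and the assumption that this account has joined the relevant room are all placeholders.

```python
# Minimal sketch: read room traffic on the transport server via the
# Matrix client-server API (r0). Credentials and URLs are placeholders,
# and the debug account must already be a member of the room to see it.
import requests

SERVER = "https://transport01.raiden.network"

# Log in with a throwaway account (no email needed on this setup).
login = requests.post(
    f"{SERVER}/_matrix/client/r0/login",
    json={"type": "m.login.password", "user": "debug-user", "password": "secret"},
).json()
token = login["access_token"]

# A single sync returns the recent timeline of every joined room.
sync = requests.get(
    f"{SERVER}/_matrix/client/r0/sync",
    params={"access_token": token},
).json()

for room_id, room in sync.get("rooms", {}).get("join", {}).items():
    for event in room["timeline"]["events"]:
        if event.get("type") == "m.room.message":
            # Raiden messages travel as JSON in the message body, so it is
            # easy to see whether only one side is talking.
            print(room_id, event["sender"], event["content"].get("body", ""))
```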
@czepluch What is the "even simpler scenario" you are referring to in this comment? #2779 (comment)
@christianbrb As I explained in the comment, it was just a simple scenario with one node sending a transfer to another one with no mediators.
Here is a simpler scenario that also causes this behavior: https://gist.github.com/ulope/9113c794e431e04f460f65e9695cfede (Please remember to change the raiden executable path to fit your local environment.) During testing this I also ran into #2838 again and discovered a new issue, #2879.
Talking with @konradkonrad and @hackaugusto, I can see two ways for the receiver not to be receiving the sender's messages:
I'll make a PR to fix 2., and we can test.
condition:
- Client A invites
- The invite triggers _handle_invite in Client B's transport
- Client A starts sending messages to Client B
- Messages are lost, as the invite was not processed yet

The race condition will be fixed in another PR. Appeared during raiden-network#3124, related raiden-network#2779, raiden-network#3123.
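To make the failure mode and the buffering fix concrete, here is a toy model of that race. It is an illustrative sketch only: the class, queue, and replay logic below are made up and do not reflect the actual Raiden Matrix transport code.

```python
# Toy model of the race described above (not Raiden's transport code):
# messages that arrive before the invite handler has finished would be
# dropped; buffering and replaying them is one way to close the window.
import queue
import threading
import time


class ToyTransport:
    def __init__(self) -> None:
        self.joined_rooms = set()
        self.backlog = queue.Queue()  # fix: hold early messages here

    def _handle_invite(self, room_id: str) -> None:
        time.sleep(0.1)               # simulate slow invite processing
        self.joined_rooms.add(room_id)
        # Fix: replay anything that arrived while the invite was pending.
        while not self.backlog.empty():
            self._deliver(*self.backlog.get())

    def on_message(self, room_id: str, message: str) -> None:
        if room_id not in self.joined_rooms:
            # Without the backlog this message would simply be lost (the bug).
            self.backlog.put((room_id, message))
            return
        self._deliver(room_id, message)

    def _deliver(self, room_id: str, message: str) -> None:
        print(f"delivered in {room_id}: {message}")


receiver = ToyTransport()
# Client A invites and immediately starts sending (the race):
threading.Thread(target=receiver._handle_invite, args=("!room",)).start()
receiver.on_message("!room", "LockedTransfer ...")  # arrives before the join
time.sleep(0.2)  # give the invite handler time to finish and replay
```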
Potential Bug / Possible Improvement for the API
Scenario #2778 ran for 5 hours without receiving any response code for the started payments; it was manually aborted after those 5 hours.
See https://gist.github.com/Dominik1999/ceee2ec4ede7c4f815664610d078998a
I would expect the API to respond with code 408 Request Timeout.
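For context, this is roughly what the caller sees today: the payment request simply blocks until a client-side timeout fires, with no way to tell a slow payment from a stuck one. The endpoint path and payload below follow the Raiden REST API as I understand it and may need adjusting for your version; the node URL and addresses are placeholders.

```python
# Sketch of the caller's view: start a payment and bound the wait
# client-side, because the node itself never answers when it hangs.
# Endpoint, node URL, and addresses are assumptions/placeholders.
import requests

NODE = "http://127.0.0.1:5001"
TOKEN = "0xToken..."    # placeholder token (network) address
TARGET = "0xTarget..."  # placeholder partner address

try:
    resp = requests.post(
        f"{NODE}/api/v1/payments/{TOKEN}/{TARGET}",
        json={"amount": 1},
        timeout=120,  # without this, the call blocks as long as the node hangs
    )
    print(resp.status_code, resp.json())
except requests.exceptions.Timeout:
    # This is where a server-side 408 (or similar) would help the caller.
    print("payment request timed out client-side")
```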
Reproduce
Run the following simple scenario.