-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crypto/tls: add TestClientHandshakeContextCancellation back after fixing flake #45106
Comments
Thanks for the follow-up Katie, I should be able to find the time for this, but I'll let you know if not. |
I'm having some problems reproducing this failure on my machine, any thoughts on the environment that triggers this?
|
Here's one of the builder failures, which describes the environment some: https://build.golang.org/log/6e8d66419f4cf08633e84e3c03777ccbf3f41e54 Based on the logs @bcmills provided in the issue, it wasn't affecting darwin architectures, so if you're running those tests from a Mac, that might be why you aren't seeing it. Maybe try a Linux machine, if you have one. |
Thanks for the link, I'll have a look. I'm running it on linux/amd64. Maybe I'll just have to run the whole testsuite. I assumed it might be contention on the |
Change https://golang.org/cl/305250 mentions this issue: |
As mentioned on #45299, I can reproduce this on a Raspberry Pi (plan9-arm) pretty reliably. It goes away if a 50 millisecond delay is inserted between the |
Another clue: on linux-amd64 I can induce a race by inserting a delay into the interrupter goroutine in the
On a heavily loaded system, this sort of delay could occur naturally. |
The implication then is that the test may fail if the interrupter goroutine hasn't had a chance to start before the connection failed? That would explain the flake, but how do we fix it? The test is failing because it's not seeing the |
Is |
As the author of the tests, I think the thing I wanted to assert was really that the HandshakeContext didn't block while waiting for the server/client to respond. Really any error should be fine, and the test will simply block if the cancellation isn't working as expected. This should be a way forward for me to fix the tests, if you think that sounds good @millerresearch? |
I'm not experienced with contexts and cancellation, so I'm not qualified to say whether the test is testing what it should. If deadlock is a possibility it's better to have a test which explicitly checks for that, rather than just leaving the test to block. But that's not what is happening here. |
Thanks for your help @millerresearch, I've updated https://go-review.googlesource.com/c/go/+/305250 with what I hope is a fix for both this issue and #45299, I would appreciate if you could try running the tests in your environment too to confirm it fixes your issue. |
Yes, with your fix I'm no longer able to induce a test failure. Well done! |
Related to #45084 and #32406
We had to remove the test in 2f3db22 since it was flaky, so we should fix it and add it back.
@johanbrandhorst I've gone ahead and assigned this to you since you wrote the original code and removed the test, but let us know if you don't have time to fix it and we can assign it to someone on our team.
/cc @FiloSottile @rolandshoemaker
The text was updated successfully, but these errors were encountered: