-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Streaming][bugfix] handle TLS signalisation when TLS is disabled on client side #9494
[Streaming][bugfix] handle TLS signalisation when TLS is disabled on client side #9494
Conversation
…client side This will fix hashicorp#9474
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR! I was hoping to have a look at the bug report soon. This definitely seems like the right place for the fix.
I'm going to look into this fix some more tomorrow and see why the test is failing.
It seems a bit strange me to that the wrapper could return the same connection unmodified. I'm wondering if we don't need to send the RPCTLS byte at all from here, because that wrapping happens elsewhere, or if we should be checking something other than server.UseTLS
or if there is something else I don't understand here.
Previous patch was using conn to write magic number, but since TLS Connection had been started, if corrupted the stream when TLS was active. To make it work, changed `TLSWrapper` to be another indirection using the datacenter as paramter, thus, the TLS knowledge about TLS being enable is known in advance. This involves some refactoring, but we are sure the condtions to decide whether TLS is enabled are identical.
8de406a
to
deeb577
Compare
Thank you for tracking down the bug! @kyhavlov and I were looking at this, and I think it should be possible to fix without changing the signature of As you found, the source of the problem is Line 769 in 19baf4b
Which gets here: Lines 558 to 561 in 19baf4b
I think we can accomplish that by changing We took a quick look at the two other callers of
If we make that change I guess we also need to handle the case where |
@dnephin Thank you for looking. Well, to me the big problem is that I am also quite afraid of activating silently TLS for all Consul users in such case, because for now, having What do you think? |
I agree we shouldn't change the existing behaviour for normal RPC, but I think we can make the default behaviour for gRPC requests do the right thing. If someone doesn't want TLS at all, they omit the |
@dnephin what would you recommend then? Because it would be 2 different code paths, so, I would have to duplicate Still I don't really get the purpose of I am a bit skeptical: having different behaviour for GRPC OverRPC and other RPC is weird for the same parameters. TLS is hard enough/scary for admins, I would not complexity furthermore the behaviour if possible. I see and applause the point of using TLS as much as possible, but I would rather change this behaviour for all RPC, not only GRPC over RPC in the next major release in such case. |
Ya, I think you are probably right. We should consider doing that for 1.10 but we don't need to do it in this PR. Another option for fixing this would then be to change the |
You decide, tell me what to do, I'll do the change quickly because it blocks our release of 1.9.x in production (we try to push changes in prod only when merged upstream). I took the approach because :
|
…client side Tnis is an alternative to hashicorp#9494
[Streaming][bugfix] handle TLS signalisation when TLS is disabled on client side (alternative to #9494)
replaced by #9512 |
[Streaming][bugfix] handle TLS signalisation when TLS is disabled on client side (alternative to #9494)
…client side Tnis is an alternative to hashicorp#9494
…client side Tnis is an alternative to hashicorp#9494
This will fix #9474
wrapper(server.Datacenter, conn)
could return the same connection. In such case, Consul client were incorrectly marking connection as RPCTLS while the connection was not encrypted.This cause messages like:
The bug can be highlighted very easily by running Consul in dev mode with this config:
Then, the bug is triggered with:
After this fix, such configuration works well.