etcdctl watch to a etcd in proxy mode with TLS to cluster fails #3894

Closed
markhowells opened this issue Nov 19, 2015 · 7 comments · Fixed by #4254

Comments

@markhowells

Is this a bug?

I have an etcd proxy running in a docker container like so

/bin/etcd -data-dir=/data -listen-peer-urls=http://0.0.0.0:2380 \
    -listen-client-urls=http://0.0.0.0:2379 -proxy on -discovery=https://discovery.etcd.io/30ae4fe131fce9706708b9a3548724b5 \
    -peer-trusted-ca-file=/etc/ssl/infra/ca.pem -peer-cert-file=/etc/ssl/infra/client.pem -peer-key-file=/etc/ssl/infra/client-key.pem

and the ports 2379 and 2380 are mapped to the docker0 interface on the host.

The proxy comes up in the container

 etcdmain: proxy: listening for client requests on http://0.0.0.0:2379

The main cluster itself is secured using TLS. Now, when I execute a get on a key via the proxy from a container

ETCD_NODE=http://172.17.0.1:2379 etcdctl --no-sync --endpoint ${ETCD_NODE} get /test/key

data is returned as expected. The cluster TLS is terminated at the proxy, and the proxy requests the data over the TLS connection. etcdctl requires --no-sync to suppress direct connections to the cluster.
However, if I execute a watch

etcdctl --no-sync --endpoint ${ETCD_NODE} watch /proxy/sniproxy

then etcdctl returns

Error:  client: etcd cluster is unavailable or misconfigured
error #0: client: endpoint http://172.17.0.1:2379 exceeded header timeout

and my etcd proxy logs

proxy: client 172.17.0.1:33922 closed request prematurely

Can I set up my proxy to terminate TLS and offer HTTP access within the host or is this a bug?

@bkleef

bkleef commented Dec 29, 2015

@markhowells I'm in the same situation. Any news?

@markhowells
Author

@bkleef No progress at all, I'm afraid. It's very frustrating for those of us who need to use TLS across Internet-connected hosts. All my apps need to be TLS enabled, and one I want to use (registrator - https://github.com/gliderlabs/registrator ) isn't...

@philips
Contributor

philips commented Jan 21, 2016

Can someone look into this? Maybe @gyuho @heyitsanthony

@gyuho
Contributor

gyuho commented Jan 21, 2016

@markhowells Sorry for the delay, and thanks for the detailed report. I just reproduced the same behavior as you. I will fix it asap.

For reference, here's how I reproduced:

Procfile:

etcd1: bin/etcd --name='etcd1' --listen-client-urls='http://localhost:1179' --advertise-client-urls='http://localhost:1179' --listen-peer-urls='https://localhost:1180' --initial-advertise-peer-urls='https://localhost:1180' --initial-cluster='etcd1=https://localhost:1180,etcd2=https://localhost:1280,etcd3=https://localhost:1380' --initial-cluster-token='etcd-cluster-token' --initial-cluster-state='new' --peer-cert-file='testcerts/cert.pem' --peer-key-file='testcerts/key.pem' --peer-client-cert-auth='true' --peer-trusted-ca-file='testcerts/ca.perm'
etcd2: bin/etcd --name='etcd2' --listen-client-urls='http://localhost:1279' --advertise-client-urls='http://localhost:1279' --listen-peer-urls='https://localhost:1280' --initial-advertise-peer-urls='https://localhost:1280' --initial-cluster='etcd1=https://localhost:1180,etcd2=https://localhost:1280,etcd3=https://localhost:1380' --initial-cluster-token='etcd-cluster-token' --initial-cluster-state='new' --peer-cert-file='testcerts/cert.pem' --peer-key-file='testcerts/key.pem' --peer-client-cert-auth='true' --peer-trusted-ca-file='testcerts/ca.perm'
etcd3: bin/etcd --name='etcd3' --listen-client-urls='http://localhost:1379' --advertise-client-urls='http://localhost:1379' --listen-peer-urls='https://localhost:1380' --initial-advertise-peer-urls='https://localhost:1380' --initial-cluster='etcd1=https://localhost:1180,etcd2=https://localhost:1280,etcd3=https://localhost:1380' --initial-cluster-token='etcd-cluster-token' --initial-cluster-state='new' --peer-cert-file='testcerts/cert.pem' --peer-key-file='testcerts/key.pem' --peer-client-cert-auth='true' --peer-trusted-ca-file='testcerts/ca.perm'
proxy: bin/etcd --name='proxy' --proxy=on --listen-client-urls http://localhost:2379 --initial-cluster 'etcd1=https://localhost:1180,etcd2=https://localhost:1280,etcd3=https://localhost:1380' --peer-cert-file='testcerts/cert.pem' --peer-key-file='testcerts/key.pem' --peer-client-cert-auth='true' --peer-trusted-ca-file='testcerts/ca.perm'

Command:

./etcdctl --debug --endpoint http://localhost:2379 --no-sync set a b
./etcdctl --debug --endpoint http://localhost:2379 --no-sync get a

./etcdctl --debug --endpoint http://localhost:2379 watch a  # no crash


./etcdctl --debug --endpoint http://localhost:2379 --no-sync watch a

# returns
Error:  client: etcd cluster is unavailable or misconfigured
error #0: client: endpoint http://localhost:2379 exceeded header timeout

# error message in server
2016-01-20 20:29:05.152250 I | proxy: client 127.0.0.1:41196 closed request prematurely
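For readers unfamiliar with the symptom, here is a rough Go sketch of how a long-polling request can run into a client-side deadline. It is not etcd or etcdctl code: `longPoll`, the stand-in `httptest` server, and the millisecond values are all assumptions made for illustration. The server holds the request open until an "event" is ready, the way a watch does, and the client gives up if nothing arrives before its deadline.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// longPoll (hypothetical helper) sends a GET to a stand-in server that holds
// the request open for holdMs milliseconds before responding, mimicking a
// watch with no event yet. The client abandons the request after timeoutMs.
func longPoll(holdMs, timeoutMs int) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(time.Duration(holdMs) * time.Millisecond):
			fmt.Fprint(w, "event") // placeholder body, not an etcd response
		case <-r.Context().Done():
			// The client went away before any event arrived.
		}
	}))
	defer srv.Close()

	// A per-request deadline, similar in spirit to a header timeout.
	ctx, cancel := context.WithTimeout(context.Background(),
		time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		srv.URL+"/v2/keys/a?wait=true", nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err // the deadline expired before any headers came back
	}
	return resp.Body.Close()
}

func main() {
	fmt.Println("deadline shorter than the wait:", longPoll(300, 20))
	fmt.Println("deadline longer than the wait: ", longPoll(10, 2000))
}
```

The first call returns an error because the deadline fires while the server is still waiting; the second returns nil. A watch can legitimately wait for an arbitrary time, so any fixed deadline on it will eventually look like the first case.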

@gyuho gyuho self-assigned this Jan 21, 2016
gyuho added a commit to gyuho/etcd that referenced this issue Jan 21, 2016
The current V2 watch waits by encoding the URL with wait=true.
When a client sets 'no-sync', it sends its requests directly to
the proxy, and the proxy redirects them by cloning the request
object. As a result, the original request is cancelled when
it times out, and the cloned request gets closed prematurely.

This fixes etcd-io#3894 by querying
the original client request so that the context timeout is not
used when 'wait=true'.
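The idea in the commit message can be pictured with a small Go sketch. This is not the code from the actual fix: `isWait` and `requestContext` are hypothetical names, and the one-second timeout is an arbitrary assumption. It only shows the general approach of inspecting the request URL for `wait=true` and skipping the deadline in that case.

```go
package main

import (
	"context"
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// isWait (hypothetical) reports whether rawURL carries a truthy wait
// parameter, e.g. "http://localhost:2379/v2/keys/a?wait=true".
func isWait(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	v := u.Query().Get("wait")
	if v == "" {
		return false, nil
	}
	return strconv.ParseBool(v)
}

// requestContext (hypothetical) returns a context for one request: no
// deadline for a watch, headerTimeout otherwise. The caller must call the
// returned cancel function.
func requestContext(parent context.Context, rawURL string, headerTimeout time.Duration) (context.Context, context.CancelFunc, error) {
	wait, err := isWait(rawURL)
	if err != nil {
		return nil, nil, err
	}
	if wait || headerTimeout <= 0 {
		ctx, cancel := context.WithCancel(parent)
		return ctx, cancel, nil
	}
	ctx, cancel := context.WithTimeout(parent, headerTimeout)
	return ctx, cancel, nil
}

func main() {
	for _, u := range []string{
		"http://localhost:2379/v2/keys/a",
		"http://localhost:2379/v2/keys/a?wait=true",
	} {
		ctx, cancel, err := requestContext(context.Background(), u, time.Second)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		_, hasDeadline := ctx.Deadline()
		fmt.Printf("%s deadline=%v\n", u, hasDeadline)
		cancel()
	}
}
```

Running it shows that the plain GET gets a deadline while the `wait=true` request does not, which is the behavior the commit message describes for watches.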
@xiang90
Contributor

xiang90 commented Jan 22, 2016

@gyuho So this is not related to TLS?

@gyuho
Contributor

gyuho commented Jan 22, 2016

This is not related to TLS. I see the same behavior without TLS.

etcd1: bin/etcd --name='etcd1' --listen-client-urls='http://localhost:1179' --advertise-client-urls='http://localhost:1179' --listen-peer-urls='http://localhost:1180' --initial-advertise-peer-urls='http://localhost:1180' --initial-cluster='etcd1=http://localhost:1180,etcd2=http://localhost:1280,etcd3=http://localhost:1380' --initial-cluster-token='etcd-cluster-token' --initial-cluster-state='new'
etcd2: bin/etcd --name='etcd2' --listen-client-urls='http://localhost:1279' --advertise-client-urls='http://localhost:1279' --listen-peer-urls='http://localhost:1280' --initial-advertise-peer-urls='http://localhost:1280' --initial-cluster='etcd1=http://localhost:1180,etcd2=http://localhost:1280,etcd3=http://localhost:1380' --initial-cluster-token='etcd-cluster-token' --initial-cluster-state='new'
etcd3: bin/etcd --name='etcd3' --listen-client-urls='http://localhost:1379' --advertise-client-urls='http://localhost:1379' --listen-peer-urls='http://localhost:1380' --initial-advertise-peer-urls='http://localhost:1380' --initial-cluster='etcd1=http://localhost:1180,etcd2=http://localhost:1280,etcd3=http://localhost:1380' --initial-cluster-token='etcd-cluster-token' --initial-cluster-state='new'
proxy: bin/etcd --name='proxy' --proxy=on --listen-client-urls http://localhost:2379 --initial-cluster 'etcd1=http://localhost:1180,etcd2=http://localhost:1280,etcd3=http://localhost:1380'

This Procfile shows the same behavior with etcdctl watch --no-sync.

@gyuho
Contributor

gyuho commented Jan 22, 2016

@markhowells Please try again after updating your client library. The fix, #4254, has just been merged into the master branch.

I manually tested the failing case and confirmed that it is fixed.
We are also writing end-to-end tests to cover this case.

Please let me know if you still have issues.

Thanks,
