

etcd-backup to support clusters with enableEtcdTLS configured #5064

Closed
KashifSaadat opened this issue Apr 26, 2018 · 14 comments


@KashifSaadat
Contributor

Using the following etcdClusterSpec (shortened):

  etcdClusters:
  - backups:
      backupStore: s3://<bucket>/kops-backups/etcd/main/
    enableEtcdTLS: true
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: az1-master1
      name: a-1
    - encryptedVolume: true
      instanceGroup: az2-master1
      name: b-1
    - encryptedVolume: true
      instanceGroup: az3-master1
      name: c-1
    image: gcr.io/etcd-development/etcd:v3.3.3
    name: main
    version: 3.3.3

The etcd-backup container produces the following logs:

controller.go:57] unexpected error running backup controller loop: unable to find server version of etcd on [http://127.0.0.1:4001]: client: etcd cluster is unavailable or misconfigured; error #0: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"

Changes would need to be made both to kops and the etcd-manager project to support clusters that have etcd TLS enabled, such as:

  • kops passes the CA cert path as an arg to etcd-backup (or the full list of arguments that etcd is run with)
  • etcd-backup parses the above and updates its calls to etcd based on the arguments provided
@KashifSaadat
Contributor Author

CC @justinsb

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 30, 2018
@tavisma

tavisma commented Jul 30, 2018

Running into the same limitation.

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 29, 2018
@rekcah78
Contributor

After adding etcd-manager to my cluster spec with this command:
kops set cluster 'cluster.spec.etcdClusters[*].manager.image=kopeio/etcd-manager:latest'
kops update cluster failed with this error:
error building tasks: TLS not supported for etcd-manager

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rekcah78
Contributor

/reopen

@k8s-ci-robot
Contributor

@rekcah78: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sstarcher
Contributor

@justinsb can we get this reopened? It's certainly a gap for anyone who's running TLS.

@KashifSaadat
Contributor Author

/reopen
/remove-lifecycle rotten

@k8s-ci-robot
Contributor

@KashifSaadat: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Dec 25, 2018
@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Dec 25, 2018
@endzyme
Contributor

endzyme commented Feb 1, 2019

Quick ping here - I see some movement in the client implementation of backup in etcd-manager (within the last 18 days, I think). Is this something that can be revisited with the recent changes to etcd-backup?

If someone can point me in the right direction I can also take a stab at it. In my investigations I got as far as buildEtcdBackupManagerContainer, and I think some augmentation is needed there to add logic for TLS implementations (I see a TODO in there).

I haven't yet looked at the source in etcd-manager to see what params are needed.

@nightmareze1

Hello Guys,

I have the same problem in my kops cluster when enableEtcdTLS: true
etcdClusters:
- backups:
    backupStore: s3://kubernetes.donetl.com/backup-etcd/etcd/main/
  etcdMembers:
  - instanceGroup: master-us-east-1a
    name: a
    volumeType: gp2
    volumeSize: 50
  - instanceGroup: master-us-east-1b
    name: b
    volumeType: gp2
    volumeSize: 50
  - instanceGroup: master-us-east-1c
    name: c
    volumeType: gp2
    volumeSize: 50
  enableEtcdTLS: true
  name: main
  version: 3.2.24
- backups:
    backupStore: s3://kubernetes.donetl.com/backup-etcd/etcd/events/
  etcdMembers:
  - instanceGroup: master-us-east-1a
    name: a
    volumeType: gp2
    volumeSize: 50
  - instanceGroup: master-us-east-1b
    name: b
    volumeType: gp2
    volumeSize: 50
  - instanceGroup: master-us-east-1c
    name: c
    volumeType: gp2
    volumeSize: 50
  enableEtcdTLS: true
  name: events
  version: 3.2.24

kubectl logs -f etcd-server-ip-10-50-13-140.ec2.internal -n kube-system etcd-backup

etcd-backup agent
W0205 23:27:42.924927 1 controller.go:57] unexpected error running backup controller loop: unable to find server version of etcd on [http://127.0.0.1:4001]: client: etcd cluster is unavailable or misconfigured; error #0: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
W0205 23:28:42.925877 1 controller.go:57] unexpected error running backup controller loop: unable to find server version of etcd on [http://127.0.0.1:4001]: client: etcd cluster is unavailable or misconfigured; error #0: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"

Development

No branches or pull requests

8 participants