
Private DNS + private topology on AWS: problems regarding the certificate #2032

Closed
igorvpcleao opened this issue Mar 2, 2017 · 17 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@igorvpcleao

igorvpcleao commented Mar 2, 2017

Hi there,

I created a new cluster on AWS:

  1. setting it to use a private zone on Route 53
  2. setting it to belong to a private subnet (--topology private)

As I'm using a private zone on Route 53, my laptop cannot resolve the server name (https://api.cluster.k8s) set in my kubecfg.

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: *****
    server: https://api.cluster.k8s
[...]

Looking at Route 53, I noticed that api.cluster.k8s is an alias for the load balancer api-cluster-k8s-177050000.us-east-1.elb.amazonaws.com. Therefore I changed my kubecfg to use this load balancer endpoint instead of the server name. When I make this change, whenever I run kubectl I get an error about the certificate:

Unable to connect to the server: x509: certificate is valid for api.internal.cluster.k8s, api.cluster.k8s, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, not api-cluster-k8s-177050000.us-east-1.elb.amazonaws.com

So I need to use insecure-skip-tls-verify=true to get it working.
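For reference, the workaround described above amounts to a kubecfg cluster entry along these lines (the ELB hostname is the one from the error message above; note that kubectl refuses to combine insecure-skip-tls-verify with certificate-authority-data, so that field has to be removed):

```yaml
# WARNING: disables TLS verification entirely -- workaround only.
apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server: https://api-cluster-k8s-177050000.us-east-1.elb.amazonaws.com
  name: cluster.k8s
```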

Do you guys know how to overcome this issue? Is it possible to tell kops to generate the certificate without constraining domains, for instance?

Thanks in advance!

@thaniyarasu

The default is sufficient; use kops to create a cluster. I did this:

export AWS_REGION=us-west-1
export AWS_PROFILE=my-user
export KOPS_STATE_STORE=s3://kops-state-store.example.com
export NAME=dev.example.com

kops create cluster --zones us-west-1a --topology private --bastion=true --networking weave --kubernetes-version 1.5.3 --network-cidr 192.168.0.0/16 --associate-public-ip=false --dns-zone=dev.example.com dev.example.com

Then wait about 5 minutes for the nodes, load balancer, etc. to be created.
After that your API will be accessible at https://api.dev.example.com/

@igorvpcleao
Author

igorvpcleao commented Mar 6, 2017

Hey @thaniyarasu,
thanks for replying!

I wish I could use a private zone instead of a public one. Everything works fine in the snippet you provided, but the names can be resolved from anywhere on the internet, even though they only make sense inside a VPC: several of its entries are mapped to private IPs (nodes within the cluster). I wish I could use private DNS without running into the certificate problem. That could be done either by supplying my own certificate or by telling kops not to constrain it to any domain, for instance. I have no idea if it's possible to do so.

@rdtr
Contributor

rdtr commented Mar 15, 2017

If you use OpenVPN or similar technology to access your private k8s cluster, you might want to check whether you can configure the VPN client to delegate DNS resolution to the VPN server, rather than to a public resolver like 8.8.8.8.

In my case we use Pritunl as a bastion, which lets us set the name server address, so I set AWS's private name server address (xxx.xxx.0.2), and so far we have seen no problems using the default kubecfg with no public domain involved.
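For plain OpenVPN setups, the equivalent is to push the VPC resolver to clients. A minimal server-side sketch (the 192.168.0.2 address assumes the 192.168.0.0/16 VPC CIDR used earlier in this thread, where AWS places its resolver at base+2; cluster.k8s is the private domain from the original report):

```
# OpenVPN server.conf fragment (sketch): hand clients the VPC's DNS resolver
push "dhcp-option DNS 192.168.0.2"
# Optionally advertise the cluster's search domain as well
# (how clients honor these options depends on the platform).
push "dhcp-option DOMAIN cluster.k8s"
```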

If you prefer an SSH tunnel, sshuttle also supports DNS tunneling over SSH, although I haven't tested it myself with k8s.

@thaniyarasu

@igorvpcleao: the names can be resolved publicly, like https://api.dev.example.com:443 and ssh://bastion.dev.example.com:22.
For me that was fine.

@igorvpcleao
Author

@rdtr I'll check it out! Thanks a lot!
I think it would be great to address this issue by changing something on kops.

@chrislovecnm
Contributor

Is this resolved?

@igorvpcleao
Author

@chrislovecnm no.

@chrislovecnm
Contributor

@rdtr any ideas on this?

@rdtr
Contributor

rdtr commented May 22, 2017

Hmm, the only things I can come up with right now for using private DNS are:

What a user of kops can do:

  1. If you use VPN software like OpenVPN, you should be able to let your VPN client delegate DNS resolution to the VPN server instead of public resolvers like 8.8.8.8 so that your private domain is resolvable from your laptop. <- this is what I'm doing
  2. Or if you use an SSH tunnel through a bastion, there are tools that can tunnel your DNS resolution as well, e.g. https://github.com/apenwarr/sshuttle
    <- if you don't mind adding one step to accessing your cluster, this may be the easiest solution

If kops should resolve this:
3. kops would need to add the ELB's domain name when creating SSL certificates and then make it possible to specify it in the k8s config.
4. Or we could add a cluster parameter so that a user can pass any domains for the SSL certificate. (A problem is that the user doesn't have the ELB's domain name until the cluster is created, though.)

P.S.
Not sure everyone knows this, but when you create an internal AWS ELB, the DNS name of the internal load balancer is publicly resolvable to the private IP addresses of the nodes.
http://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-internal-load-balancers.html
This means AWS already reveals some of the private IPs we use. (I assume the same is true for AWS RDS endpoints.)
So if you control your company's domain names, I don't think it's a bad idea to use a publicly resolvable domain name for your AWS private resources, considering AWS already does it 😅

@yogin

yogin commented Oct 5, 2017

Stumbled on the same issue today while setting up a cluster. There's a small trick I found here at the very bottom, which is basically to create records in your public zone for the ELBs created by kops (bastion and api).

This only works if you have public and private Route 53 zones with the same domain; in that case it works fine. But it probably won't work if you have an existing VPC with an internal domain that doesn't match the public one, which is probably the most likely scenario (I use internal domains that differ from the public ones for all my VPCs...).
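The record creation itself can be scripted. A sketch of the change batch one might pass to `aws route53 change-resource-record-sets` (the hostnames are taken from the original report for illustration; a real public zone needs a registrable domain):

```json
{
  "Comment": "Point the public api name at the kops-created ELB",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "api.cluster.k8s.",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [
          { "Value": "api-cluster-k8s-177050000.us-east-1.elb.amazonaws.com" }
        ]
      }
    }
  ]
}
```

Applied with something like `aws route53 change-resource-record-sets --hosted-zone-id <public-zone-id> --change-batch file://api-record.json`.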

Ideally I would like to be able to tell kops to either:

  • use a separate public DNS zone for the public entry points (api, bastion), even when using a private zone for all other resources
  • specify an additional list of FQDNs to include in the self-signed certificates; then I can create a record outside of kops, in the zone I want, with the name I want.

@ghost

ghost commented Nov 24, 2017

As a hacky workaround, I don't think anything stops you from recreating the master cert manually, replacing it in the S3 state store, and re-creating the masters. Make sure you keep all the altnames of the kops-generated cert; there are many.
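A sketch of that regeneration step with openssl (requires OpenSSL 1.1.1+ for -addext; the SANs are the ones from the error message earlier in the thread; a self-signed cert is shown for illustration only, since the real replacement would need to be signed by the cluster CA from the state store):

```shell
# Generate a key and certificate carrying all the kops SANs plus the ELB name.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout master.key -out master.crt \
  -subj "/CN=kubernetes-master" \
  -addext "subjectAltName=DNS:api.internal.cluster.k8s,DNS:api.cluster.k8s,DNS:kubernetes,DNS:kubernetes.default,DNS:kubernetes.default.svc,DNS:kubernetes.default.svc.cluster.local,DNS:api-cluster-k8s-177050000.us-east-1.elb.amazonaws.com"

# Double-check that every altname made it into the certificate before uploading.
openssl x509 -in master.crt -noout -ext subjectAltName
```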

@ghost

ghost commented Nov 24, 2017

Additionally, PR #2063 added support for defining additional names for the master cert; this is included in version 1.8.0-beta.1 and could be used to solve this problem after creating the cluster.

I'd expect it to be fairly trivial to fix this completely, considering that when using gossip the ELB address is added to the certificate. The same thing just needs to be done for private DNS as well.
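For later kops versions, that cluster spec field looks roughly like the following (later kops documentation calls it spec.api.additionalSANs; verify the exact field name against your kops release). After editing the spec, the cert gets reissued on a kops update plus a rolling update of the masters:

```yaml
# kops cluster spec fragment (sketch): extra names for the apiserver cert
spec:
  api:
    additionalSANs:
    - api-cluster-k8s-177050000.us-east-1.elb.amazonaws.com
```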

k8s-github-robot pushed a commit that referenced this issue Dec 4, 2017
Automatic merge from submit-queue.

When using private DNS add ELB name to the api certificate

This fixes issue #2032 by using the gossip paths with private dns as well:

* When creating the api server certificate, include the ELB hostname
* When generating kubeconfig, use the ELB hostname as the api server name
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 22, 2018
@ghost

ghost commented Feb 22, 2018

/close

@prandelicious

You can use something like ZeroTier to securely connect to your VPC without the complexity of a VPN. For DNS, you can probably use dnsmasq to redirect queries for hosts under your k8s subdomain to a different resolver.
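A minimal dnsmasq sketch of that split-horizon setup (the resolver address assumes the VPC's base+2 resolver mentioned earlier in the thread, and the domain is the one from the original report):

```
# /etc/dnsmasq.d/k8s.conf (sketch)
# Send queries under the cluster's private domain to the VPC resolver;
# everything else falls through to the normal upstream servers.
server=/cluster.k8s/192.168.0.2
```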

@ales-blaze

Could you explain this error a bit? I am setting up a k8s cluster on AWS and everything was going fine until I ran kops validate cluster.

Command:
kops update cluster --yes
Result:
Cluster changes have been applied to the cloud.
Changes may require instances to restart: kops rolling-update cluster.

Then I ran:
Command: kops rolling-update cluster
Result:
Using cluster from kubectl context: dev.ales.net
Unable to reach the kubernetes API.
Use --cloudonly to do a rolling-update without confirming progress with the k8s API
error listing nodes in cluster: Get https://api.dev.ales.net/api/v1/nodes: x509: certificate is valid for ales.net, blog.ales.net, font.ales.net, vertima.ales.net, www.ales.net, not api.dev.ales.net

How can I save myself from this error? Please help ASAP.

@ales-blaze

Could you provide a video link on how to resolve this issue?
