kubelet name mismatch when using NodeAuthorization #7172

Closed
jacksontj opened this issue Jun 20, 2019 · 5 comments
Comments


jacksontj commented Jun 20, 2019

1. What kops version are you running? The command kops version will display this information.

$ kops version
Version 1.12.1 (git-e1c317f9c)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.1", GitCommit:"d4ab47518836c750f9949b9e0d387f20fb92260b", GitTreeState:"clean", BuildDate:"2018-04-12T14:26:04Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.11", GitCommit:"637c7e288581ee40ab4ca210618a89a555b6e7e9", GitTreeState:"clean", BuildDate:"2018-11-26T14:25:46Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?
I enabled NodeAuthorization on an existing cluster (as described in this post). One thing to note is that we have a domain configured in the DHCP options on the VPC, so the node gets its domain name from DHCP. After enabling NodeAuthorization I terminated a single node in the cluster to cycle it onto the new configuration, and afterwards I see errors in the logs such as:

[kubelet_node_status.go:117] Unable to register node "ip-XXX-XXX-XXX-XXX.us-west-1.compute.internal" with API server: nodes "ip-XXX-XXX-XXX-XXX.us-west-1.compute.internal" is forbidden: node "ip-XXX-XXX-XXX-XXX.INTERNAL_DOMAIN" cannot modify node "ip-XXX-XXX-XXX-XXX.us-west-1.compute.internal"

5. What happened after the commands executed?
At this point the node will continually fail to join the cluster.

6. What did you expect to happen?
I expected the node to be able to join the cluster :)

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?
Upon further investigation I found that kops is doing the NodeAuthorization checks using the name it determines, while the kubelet is trying to connect with the name it determines -- unfortunately these don't always match. Looking at the k8s cloud provider docs (https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/#node-name), it clearly states that the kubelet will use the private DNS name of the AWS instance as the name of the Kubernetes Node object. When we look at kops we can see that the name is actually the local-hostname. In most cases these two match, but if you use a private DNS domain (via the DHCP options on the VPC) then the names don't match -- and in this situation it is now impossible to make them work.
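To make the mismatch concrete, here is a minimal sketch (not the kops implementation, and assuming the aws-sdk-go v1 metadata client) of the name source the @aws hostnameOverride resolved to -- the local-hostname entry from instance metadata, which carries the DHCP option set's domain:

```go
// Minimal sketch (not kops's actual code), assuming aws-sdk-go v1.
// Fetches the "local-hostname" entry from EC2 instance metadata, which is what
// the "@aws" hostnameOverride resolved to; with custom DHCP options this carries
// the DHCP domain rather than the *.compute.internal private DNS name.
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws/ec2metadata"
	"github.com/aws/aws-sdk-go/aws/session"
)

func main() {
	sess := session.Must(session.NewSession())
	md := ec2metadata.New(sess)

	// e.g. "ip-10-0-1-23.INTERNAL_DOMAIN" when the VPC sets a custom DHCP
	// domain, vs. the "ip-10-0-1-23.us-west-1.compute.internal" name the
	// kubelet registers with.
	localHostname, err := md.GetMetadata("local-hostname")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("local-hostname:", localHostname)
}
```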

With that, I believe the "correct" solution is to add a mechanism to kops that determines the name the same way the kubelet does. To do this I see two options: (1) change @aws to this behavior, or (2) add another option such as @aws-privatedns which follows this new behavior. Thoughts?

@jacksontj (Contributor Author)

FYI, I have made a local build of kops to validate this fix -- if I change the hostname evaluation to produce the "private DNS name", the kubelet will start.

jacksontj added commits to jacksontj/kops that referenced this issue Jun 24, 2019
@jacksontj (Contributor Author)

I have gone ahead and created two PRs:

(1) #7185 -- this replaces the @aws hostnameOverride to use the private DNS name as described above
(2) #7184 -- this adds a new hostnameOverride option of @aws-private which uses the private DNS name as described above.

We don't need both -- but I imagine #7184 is the one we'll want to go with?

jacksontj added a commit to jacksontj/kops that referenced this issue Jul 1, 2019
@jacksontj (Contributor Author)

I have updated both PRs to get the privateDNSName from the AWS API -- after talking with AWS support, this seems to be the only reliable way to get the private DNS name.
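Roughly, the lookup now works along these lines (a sketch assuming the aws-sdk-go v1 EC2 client, not the exact kops code): resolve the instance ID from metadata, then ask the EC2 API for the instance's PrivateDnsName, which is the name the kubelet's AWS cloud provider registers with:

```go
// Sketch only (assumes aws-sdk-go v1; not the exact kops implementation):
// look up the instance's PrivateDnsName via the EC2 API, which matches the
// node name the kubelet's AWS cloud provider uses regardless of DHCP options.
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/ec2metadata"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	sess := session.Must(session.NewSession())

	// The instance ID still comes from instance metadata.
	instanceID, err := ec2metadata.New(sess).GetMetadata("instance-id")
	if err != nil {
		log.Fatal(err)
	}

	// DescribeInstances returns the private DNS name for the instance.
	out, err := ec2.New(sess).DescribeInstances(&ec2.DescribeInstancesInput{
		InstanceIds: []*string{aws.String(instanceID)},
	})
	if err != nil {
		log.Fatal(err)
	}
	if len(out.Reservations) == 0 || len(out.Reservations[0].Instances) == 0 {
		log.Fatal("instance not found")
	}
	fmt.Println("PrivateDnsName:", aws.StringValue(out.Reservations[0].Instances[0].PrivateDnsName))
}
```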

jacksontj added commits to jacksontj/kops that referenced this issue Jul 9, 2019
@jacksontj (Contributor Author)

Personally I'm in favor of replacing the @aws hostnameOverride; the cases are as follows:

|                      | NodeAuthorizer                                                 | No-NodeAuthorizer                                                                            |
| -------------------- | -------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| With DHCP options    | Doesn't work before this patch; will start working afterwards  | No change to the node name (other than the pre-existing node rename during kubelet startup)   |
| Without DHCP options | No change                                                       | No change                                                                                      |

jacksontj added a commit to jacksontj/kops that referenced this issue Jul 17, 2019
If the cluster's VPC includes DHCP options the local-hostname includes
the DHCP zone instead of the private DNS name from AWS (which is what
k8s uses regardless of flags). This patch simply makes the
hostnameOverride implementation match by using the AWS api to get the
private DNS name

Related to kubernetes#7172
jacksontj added a commit to wish/kops that referenced this issue Jul 17, 2019
justinsb pushed commits to justinsb/kops that referenced this issue Jul 22, 2019
jacksontj added a commit to jacksontj/kops that referenced this issue Jul 22, 2019
@jacksontj (Contributor Author)

#7185 merged and backported to 1.13+ (1.12 backport pending -- #7308).

jacksontj added a commit to jacksontj/kops that referenced this issue Sep 4, 2019
akursell-wish pushed a commit to wish/kops that referenced this issue Nov 11, 2019