GA graduation criteria for NodeLocal DNSCache #1351
- Upgrade to a newer CoreDNS version (1.6.x) in [node-cache](https://github.com/kubernetes/dns/pull/328).
- Ensure that Kubernetes [e2e tests with NodeLocal DNSCache](https://k8s-testgrid.appspot.com/sig-network-gce#gci-gce-kube-dns-nodecache) are passing.
- Scalability tests with NodeLocal DNSCache enabled, verifying the [HA modes](https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/20190424-NodeLocalDNS-beta-proposal.md#design-details) as well as the regular mode.
- Have N clusters (number TBD) running in production with NodeLocal DNSCache enabled.
I'd like to get the documentation updated and improved, so that there is a good-quality Task page, before GA.
I'd like to see docs clear enough that someone who has just qualified as a CKA can follow that Task page without help, and without needing to do research or fill in missing details.
At the moment I don't think https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/ meets that criterion.
As this is an add-on, does it also make sense to document how to remove it? (documenting that would help encourage production use)
Thanks for the feedback! I submitted the change to the docs on how to remove the add-on - kubernetes/website@master...prameshj:patch-4
Can you help improve the Tasks page with suggestions on what to add? Would adding cluster-creation commands and kubectl commands to check that the node-local-dns pods are running and healthy make it more useful?
I can try to help with documenting this add-on (I've already opened kubernetes/website#17944). I'd want to be clear on intent here. A few things:
- What does it mean for this add-on to be GA? I think it means that it's optional and that, if you enable it, the Kubernetes project believes the add-on is finished.
- Is https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml a manifest or a template that can become a manifest? What kind of template?
- How can the reader verify that their changes worked as intended?

From the proposed changes:
`KUBE_ENABLE_NODELOCAL_DNS=true kubetest --up`
What are the prerequisites for that to work? (And are those explained to the reader?)
@prameshj, are you planning to make a pull request based on kubernetes/website@master...prameshj:patch-4?
For this KEP, I think it's enough to add an extra line about providing clear / comprehensible documentation aimed at cluster operators.
> kubernetes/website@master...prameshj:patch-4
I just did. I thought I already had; thanks for the reminder!
kubernetes/website#18190
> I can try to help with documenting this add-on (I've already opened kubernetes/website#17944). I'd want to be clear on intent here. A few things:
> - What does it mean for this add-on to be GA? I think it means that it's optional and that, if you enable it, the Kubernetes project believes the add-on is finished.
> - Is https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml a manifest or a template that can become a manifest? What kind of template?
> - How can the reader verify that their changes worked as intended?
- IMO, the add-on being GA means it is suitable to run in production.
- It is a template, where most of the PILLAR_ variables need to be substituted. I'm not sure what you mean by "kind of template".
- Changes can be verified by (a) checking that the node-local-dns pods are running, and (b) checking the node-local-dns pod metrics to make sure the request count is going up (see the sketch below).
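As a rough sketch of that substitute-and-verify flow (the `__PILLAR__` placeholder names, the `k8s-app=node-local-dns` label, the 9253 metrics port, and the metric name here are my reading of the upstream nodelocaldns.yaml, not something this KEP guarantees):

```bash
# Fill in the template placeholders (names as they appear in the upstream
# nodelocaldns.yaml; adjust if your copy differs) and apply the manifest.
kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}')
domain=cluster.local
localdns=169.254.20.10

sed "s/__PILLAR__LOCAL__DNS__/$localdns/g; \
     s/__PILLAR__DNS__DOMAIN__/$domain/g; \
     s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml | kubectl apply -f -

# (a) The node-local-dns pods should be Running, one per node.
kubectl get pods -n kube-system -l k8s-app=node-local-dns -o wide

# (b) The request count in the pod metrics should keep going up.
kubectl -n kube-system port-forward <a-node-local-dns-pod> 9253:9253 &
curl -s http://127.0.0.1:9253/metrics | grep coredns_dns_request
```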
> From the proposed changes:
> `KUBE_ENABLE_NODELOCAL_DNS=true kubetest --up`
> What are the prerequisites for that to work? (And are those explained to the reader?)
Good point. That command assumes you have a working kubernetes/kubernetes checkout set up to build and run tests (a rough sketch below). I can include it in the other PR.
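Something along these lines, assuming GCE as the test provider (the kubetest install command and the `--provider` value are my assumptions, not part of the proposed change):

```bash
# Rough prerequisites for the command above, sketched for GCE.
go get -u k8s.io/test-infra/kubetest   # install the kubetest harness

git clone https://github.com/kubernetes/kubernetes
cd kubernetes
make quick-release                     # build the artifacts kubetest deploys

# Bring up a cluster with NodeLocal DNSCache enabled.
KUBE_ENABLE_NODELOCAL_DNS=true kubetest --up --provider=gce
```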
Added documentation to the criteria.
/assign
@@ -192,7 +192,13 @@ Having the pods query the nodelocal cache introduces a single point of failure.
I've just had a thought: 169.254.20.10 is link-local. If NodeLocal DNSCache relies, by default, on Pods being able to route to 169.254.20.10, then that is a dependency on undefined / nonstandard behavior.
According to (my reading of) internet standards, 169.254.0.0/16 is only reachable for a client on the same link.
I recommend addressing that, somehow, before GA. If the decision is to go against the standard, let's record that explicitly.
The link-local IP is attached to an interface on the host. You are right, it does not exist in each pod. Since all pod traffic does go through the host network and is subject to the configured iptables rules, this works. I am not aware of any CNI plugins where this would not work. From what I understand, only 127.0.0.0/8 traffic does not leave the pod.
A CNI plugin will need to route traffic to 169.254.20.10, but there's no standardized mechanism to make this happen (because 169.254.20.10 is link-local, and link-local addresses aren't routable).
The packets just need to hit the hostNetwork, since the link-local IP address exists on an interface in the node. This creates a route in the "local" routing table of the form:
`local 169.254.20.10 dev nodelocaldns proto kernel scope host src 169.254.20.10`
This is sufficient to deliver the packet to anything listening locally on that address. Isn't that true?
Also, if the k8s cluster is using kube-proxy in iptables mode, the pods can use either the link-local IP or the kube-dns service IP to access node-local-dns.
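A quick way to check both claims on a live cluster (the `nodelocaldns` interface name follows the default manifest; the pod name and the 10.0.0.10 kube-dns service IP are placeholders, and the pod image is assumed to ship nslookup):

```bash
# On a node running node-local-dns: the dummy interface holds the
# link-local address, and the kernel adds the matching "local" table route.
ip addr show dev nodelocaldns
ip route show table local | grep 169.254.20.10
# -> local 169.254.20.10 dev nodelocaldns proto kernel scope host ...

# From a pod on the same node, both addresses should answer (the second
# one only when kube-proxy runs in iptables mode).
kubectl exec -it <some-pod> -- nslookup kubernetes.default 169.254.20.10
kubectl exec -it <some-pod> -- nslookup kubernetes.default 10.0.0.10
```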
> `local 169.254.20.10 dev nodelocaldns proto kernel scope host src 169.254.20.10`

I'm really not keen on setting that in stone. Allocating from the link-local range like this and expecting routing to work, even within only one host, just doesn't seem like an approach that can scale.
It's different when a cloud provider does it at the network layer, because there's no concept of those packets being routed. There's a (virtual) link, and it turns out that when the instance boots, something else is defending 169.254.169.254 or whatever. That works fine.
Ideally, I can set up a Kubernetes cluster that puts all Pods on the same link-local network, and that's a valid supported configuration (provided I find a CNI that makes it happen). I realize that 169.254.0.0/16 looks like a convenient range to borrow an address from, and I can see why it's often going to be a pragmatic choice.
To be clear: my concern is about making & documenting 169.254.20.10 as a default value. If you have to provide an IPv4 address at deploy time and the documentation suggests picking an address inside 169.254.0.0/16, then that isn't a problem and doesn't set any problematic precedent.
OK, if I understand correctly, we want to tweak the documentation to say:
"Selecting a listen IP for NodeLocal DNSCache from 169.254.0.0/16 is recommended; in our examples we will use 169.254.20.10." Is that correct? I have added this to the website doc PR - kubernetes/website#18716
We can continue the discussion there. From the GA graduation perspective, are the changes here OK to implement?
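For reference, a sketch of how the listen IP is supplied at deploy time rather than baked in as a default (the `-localip` flag and the `__PILLAR__LOCAL__DNS__` placeholder are my reading of the upstream node-cache manifest; 10.0.0.10 is a placeholder service IP):

```bash
# The operator picks the listen IP at deploy time; 169.254.20.10 is just
# the example value used in the docs, not a hard-coded default.
localdns=169.254.20.10
sed "s/__PILLAR__LOCAL__DNS__/$localdns/g" nodelocaldns.yaml | kubectl apply -f -

# Inside the DaemonSet, the value reaches the node-cache binary via its
# -localip flag; the kube-dns service IP can be listed as well:
#   args: [ "-localip", "169.254.20.10,10.0.0.10", ... ]
```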
> "Selecting a listen IP for NodeLocal DNSCache from 169.254.0.0/16 is recommended; in our examples we will use 169.254.20.10."
(Retrospectively) I think you've taken an appropriate and pragmatic tack. Thanks.
/assign @bowei @johnbelamaric @thockin
Good criteria, thanks for updating.
Do we have/need a long-term plan to keep the CoreDNS version updated?
Do we have an easy-to-run benchmark test that shows the value of cache over no-cache?
Currently, the update happens on an as-needed basis. It would be a good idea to update when there is a major bugfix or additional functionality.
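On the benchmark question: a crude cache-vs-direct comparison can be run from any pod that ships nslookup; this is only a sketch (10.0.0.10 stands in for the kube-dns service IP, and a dedicated tool such as dnsperf would give more rigorous numbers):

```bash
# Time repeated lookups against the node-local cache and against kube-dns
# directly; the cached path should show lower, steadier latency.
for server in 169.254.20.10 10.0.0.10; do
  echo "server: $server"
  time for i in $(seq 1 500); do
    nslookup kubernetes.default.svc.cluster.local "$server" >/dev/null
  done
done
```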
I have updated the criteria based on the review comments. PTAL @thockin @johnbelamaric @bowei @sftim
I'm still dubious about taking …
We are not mentioning 169.254.20.10 as a default in this KEP, and in the docs we will mention 169.254.0.0/16 as a potential space to pick the listen IP from. Do you have suggestions for what else to do here? Also, just to be clear, this feature is not going to be enabled by default in GA.
/lgtm
@sftim I don't think that concern should block this KEP change. It can be taken up again in the actual docs PR if you're still concerned about the language there. /lgtm
Thanks!
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: johnbelamaric, prameshj, thockin. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
@prameshj thanks for clarifying - #1351 (comment) makes sense to me.
Modified the existing KEP, as discussed in #1307