
[EKS]: Support CoreDNS and kube-proxy in EKS add-ons #1159

Closed
tabern opened this issue Dec 1, 2020 · 34 comments
Labels
EKS Amazon Elastic Kubernetes Service

Comments

@tabern
Contributor

tabern commented Dec 1, 2020

Add support for managing CoreDNS and kube-proxy with EKS add-ons.

Roadmap feature from #252

@sparky005

Do we know how these add-ons will be deployed? Will they run as a Deployment or a DaemonSet? Will we also be able to configure options in CoreDNS, like the number of replicas (if a Deployment) or CoreDNS-specific options like autopathing?

@tabern
Contributor Author

tabern commented Feb 8, 2021

@sparky005 to start, we are going to be onboarding the add-ons as they are run by EKS today for every cluster. This means CoreDNS as a Deployment and kube-proxy as a DaemonSet. For specific options, you'll have to configure those yourself by editing the API object on the cluster. We'll evaluate specific options like autopathing for these add-ons in the future.
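For reference, a quick way to inspect those objects on a cluster (a minimal sketch, assuming the default object names in kube-system):

kubectl --namespace kube-system get deployment/coredns daemonset/kube-proxy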

@sparky005

Awesome, thanks for the quick answer @tabern!

@dcherman

dcherman commented Feb 8, 2021

@tabern As part of this, and for kube-proxy specifically, can you consider changing the default to metricsBindAddress: 0.0.0.0:10249 rather than 127.0.0.1:10249? For Prometheus to monitor kube-proxy, the metrics endpoint needs to be exposed. This is currently the only customization we make to kube-proxy in our setup, and I'm sure I'm not the only one.
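A minimal sketch of the manual workaround this implies, assuming the EKS default ConfigMap name kube-proxy-config and that the current value appears verbatim in it:

# Expose kube-proxy metrics on all interfaces so Prometheus can scrape every node.
kubectl --namespace kube-system get configmap kube-proxy-config -o yaml \
  | sed 's/metricsBindAddress: 127.0.0.1:10249/metricsBindAddress: 0.0.0.0:10249/' \
  | kubectl apply -f -
# Restart the daemonset so the pods pick up the new config.
kubectl --namespace kube-system rollout restart daemonset kube-proxy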

@gitfool

gitfool commented Feb 8, 2021

Same here, plus the ability to specify the kube-proxy and CoreDNS versions, since they're not automatically updated when upgrading an existing EKS cluster, and it would be nice to avoid manual steps or having to automate them with our own DevOps tooling.

@tabern
Contributor Author

tabern commented May 20, 2021

We're happy to announce that CoreDNS and kube-proxy are now supported as part of EKS add-ons. You can now manage all three core networking add-ons (CoreDNS, kube-proxy, VPC CNI) using the EKS APIs, including provisioning and updates.

More information

Note: add-on support for CoreDNS and kube-proxy is only available on the latest EKS platform version (for each supported Kubernetes version, 1.18 or higher).

If your cluster is not already on the latest platform version, you can update to the next Kubernetes version or create a new cluster to use EKS add-ons for CoreDNS and kube-proxy. Alternatively, all existing EKS clusters will be updated to the latest platform versions over the coming quarter without requiring any action.
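For those who want to try it from the CLI, provisioning the new add-ons looks roughly like this (a sketch; the cluster name is a placeholder, and valid versions can be listed with describe-addon-versions):

aws eks describe-addon-versions --addon-name coredns
aws eks create-addon --cluster-name my-cluster --addon-name coredns
aws eks create-addon --cluster-name my-cluster --addon-name kube-proxy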

@tabern tabern closed this as completed May 20, 2021
@marcincuber

marcincuber commented May 24, 2021

When I tried deploying kube-proxy as a managed add-on through Terraform, I had the following error:

Error: unexpected EKS add-on (eks-test-eu:kube-proxy) state returned during creation: creation not successful (CREATE_FAILED): Errors:
│ Error 1: Code: AccessDenied / Message: clusterrolebindings.rbac.authorization.k8s.io "eks:kube-proxy" is forbidden: user "eks:addon-manager" (groups=["system:authenticated"]) is attempting to grant RBAC permissions not currently held:
│ {APIGroups:["discovery.k8s.io"], Resources:["endpointslices"], Verbs:["get"]}

If you hit that, make sure to update the eks:addon-manager ClusterRole so that it includes the following block of permissions:

- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
  - get

Then run terraform apply again and the add-on will deploy successfully.
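One non-interactive way to apply that change (a sketch; kubectl edit works just as well, and note the caveat in the reply below that EKS may overwrite the role again later):

kubectl patch clusterrole eks:addon-manager --type=json \
  -p='[{"op":"add","path":"/rules/-","value":{"apiGroups":["discovery.k8s.io"],"resources":["endpointslices"],"verbs":["list","watch","get"]}}]'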

It would be great if someone from AWS could clarify who is responsible for the eks:addon-manager ClusterRole and whether my updates to it will be overwritten by some reconcile process built into the EKS control plane.

The same issue came up in AWS Ireland (eu-west-1) and AWS China Beijing (cn-north-1).

@cheeseandcereal
Member

cheeseandcereal commented May 24, 2021

Hi @marcincuber, it is possible that the eks:addon-manager (or any other eks:* role for that matter) will be occasionally overwritten. It's not reconciled on any sort of fixed cadence right now.

Once the add-on is installed successfully, it may not even need these permissions again unless something gets deleted and it has to re-create the ClusterRoleBinding. Another option for this particular case is removing the get permission on endpointslices from the system:node-proxier role, since it is not directly needed by kube-proxy (only list and watch are needed, as seen in the current bootstrap code) and is not granted to the role by default.

In the future we expect the permissions of eks:addon-manager to expand so these sorts of issues eventually won't be a problem.

@marcincuber

@cheeseandcereal Do you know whether there is a publicly available git repository where the eks:addon-manager ClusterRole is configured? I would be more than happy to open a PR to add missing permissions when necessary.

@cheeseandcereal
Member

There isn't any public repository with that info at the moment, unfortunately. The way that role gets updated may change in the future.

@vkudryk

vkudryk commented May 27, 2021

The EKS addon manager persistently overrides a custom Corefile. Is that OK?

@shixuyue

shixuyue commented May 30, 2021

The EKS addon manager persistently overrides a custom Corefile. Is that OK?

I have run into a situation where I have to patch the coredns ConfigMap to add a Consul forwarder, but the EKS addon manager keeps reverting it back to the default. Any suggestions or workarounds? I know Azure accepts a coredns-custom ConfigMap for this; does EKS have something similar?

@vkudryk

vkudryk commented Jun 4, 2021

The EKS addon manager persistently overrides a custom Corefile. Is that OK?

I have run into a situation where I have to patch the coredns ConfigMap to add a Consul forwarder, but the EKS addon manager keeps reverting it back to the default. Any suggestions or workarounds? I know Azure accepts a coredns-custom ConfigMap for this; does EKS have something similar?

We have DNS servers in our on-premises infrastructure and about 10 internal zones that strictly need to be forwarded to them.
I had to revert the installation of CoreDNS as an add-on due to this issue. Waiting for the ability to use/include our own Corefile.

Update: I see an issue related to this problem already exists: #1275

@shixuyue

shixuyue commented Jun 7, 2021

The EKS addon manager persistently overrides a custom Corefile. Is that OK?

I have run into a situation where I have to patch the coredns ConfigMap to add a Consul forwarder, but the EKS addon manager keeps reverting it back to the default. Any suggestions or workarounds? I know Azure accepts a coredns-custom ConfigMap for this; does EKS have something similar?

We have DNS servers in our on-premises infrastructure and about 10 internal zones that strictly need to be forwarded to them.
I had to revert the installation of CoreDNS as an add-on due to this issue. Waiting for the ability to use/include our own Corefile.

Update: I see an issue related to this problem already exists: #1275

I have a hacky workaround: you can edit the eks:addon-manager Role in the kube-system namespace and remove its update and patch permissions on ConfigMaps.
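A rough sketch of that edit (with the caveat discussed further down that EKS occasionally rewrites this Role, so the change may not stick):

kubectl --namespace kube-system edit role eks:addon-manager
# ...then remove "update" and "patch" from the verbs of the configmaps rule.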

@vkudryk

vkudryk commented Jun 14, 2021

The EKS addon manager persistently overrides a custom Corefile. Is that OK?

I have run into a situation where I have to patch the coredns ConfigMap to add a Consul forwarder, but the EKS addon manager keeps reverting it back to the default. Any suggestions or workarounds? I know Azure accepts a coredns-custom ConfigMap for this; does EKS have something similar?

We have DNS servers in our on-premises infrastructure and about 10 internal zones that strictly need to be forwarded to them.
I had to revert the installation of CoreDNS as an add-on due to this issue. Waiting for the ability to use/include our own Corefile.
Update: I see an issue related to this problem already exists: #1275

I have a hacky workaround: you can edit the eks:addon-manager Role in the kube-system namespace and remove its update and patch permissions on ConfigMaps.

This seems like a dirty solution. I'd rather not recommend doing it, as it may have a negative impact on the kube-proxy and vpc-cni add-ons, which are working fine.

@itssimon

I have a hacky workaround: you can edit the eks:addon-manager Role in the kube-system namespace and remove its update and patch permissions on ConfigMaps.

This seems to be the only workaround currently and saved me from disaster.

This seems like a dirty solution. I'd rather not recommend doing it, as it may have a negative impact on the kube-proxy and vpc-cni add-ons, which are working fine.

By removing the update/patch permission only for the coredns ConfigMap, the other add-ons should not be affected.

@vishwas2f4u

Is there a solution for this yet?
I'm trying to add hosts entries to the coredns ConfigMap and it constantly gets overwritten!
Is there a way to get past this without editing the permissions of the eks:addon-manager role?

@mdrobny

mdrobny commented Aug 9, 2021

I am facing the same problem as @vishwas2f4u and @dcherman.
I would like to adjust the kube-proxy-config ConfigMap to change some conntrack settings, but my changes are constantly overwritten by the EKS addon manager.

@nuriel77

Hi, we'd like to add zone anti-affinity to the coredns affinity rules, e.g.:

        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: k8s-app
                  operator: In
                  values:
                  - kube-dns
              topologyKey: kubernetes.io/hostname
            weight: 100
          - podAffinityTerm:     # to add this rule to the list
              labelSelector:
                matchExpressions:
                - key: k8s-app
                  operator: In
                  values:
                  - kube-dns
              topologyKey: topology.kubernetes.io/zone
            weight: 100

As far as I can see, affinity is managed by eks:

          f:spec:
            f:affinity:
              f:nodeAffinity:
                f:requiredDuringSchedulingIgnoredDuringExecution:
                  f:nodeSelectorTerms: {}
              f:podAntiAffinity:
                f:preferredDuringSchedulingIgnoredDuringExecution: {}

It is important for us to make sure we always have a working coredns instance in each AZ; we don't want to end up with all the pods in the same AZ.

@DawidYerginyan

The EKS addon manager persistently overrides a custom Corefile. Is that OK?

I have run into a situation where I have to patch the coredns ConfigMap to add a Consul forwarder, but the EKS addon manager keeps reverting it back to the default. Any suggestions or workarounds? I know Azure accepts a coredns-custom ConfigMap for this; does EKS have something similar?

We have DNS servers in our on-premises infrastructure and about 10 internal zones that strictly need to be forwarded to them.
I had to revert the installation of CoreDNS as an add-on due to this issue. Waiting for the ability to use/include our own Corefile.
Update: I see an issue related to this problem already exists: #1275

I have a hacky workaround: you can edit the eks:addon-manager Role in the kube-system namespace and remove its update and patch permissions on ConfigMaps.

This seems like a dirty solution. I'd rather not recommend doing it, as it may have a negative impact on the kube-proxy and vpc-cni add-ons, which are working fine.

Does this work properly? When I look at the eks:addon-manager Role, it seems like f:rules is also managed server-side 🤔

@itssimon

itssimon commented Sep 7, 2021

Unfortunately you're right, it doesn't work reliably. The eks:addon-manager role is occasionally updated by EKS and any manual changes made to it are overwritten.

Anyone coming across this should upvote issue #1275 to help prioritise a fix for this.

@ctellechea2001

@gitfool @dcherman @itssimon @nuriel77 @mdrobny Do you know of any fix for this? Any change to the coredns ConfigMap is overwritten by the EKS coredns add-on. Thanks!

@yunfeilu-dev

Curious: if any change to the ConfigMap is overridden by the addon manager, then how is this post supposed to work: https://aws.amazon.com/premiumsupport/knowledge-center/eks-conditional-forwarder-coredns/?nc1=h_ls

@stevehipwell

@yunfeilu-dev there is a note to only use this with self-managed CoreDNS, but I suspect that if you get the field management configured correctly you could do this with a managed add-on, as long as you're not blocking fields it needs to change.

@yunfeilu-dev

uhhhhh, thanks for answering my dumb question.

@Cajga

Cajga commented Feb 21, 2022

We use IaC (Terraform) to create and bootstrap our clusters running on Outposts. We would like the ability to add a forwarder to the coredns config without the CoreDNS add-on overwriting it.

@GeneMyslinsky

GeneMyslinsky commented Mar 22, 2022

I wasn't smart enough to get the managed fields working with EKS + coredns, nor was I dedicated enough to get the official Helm chart to work.
Instead, what got me rolling past my issue was:
eksctl delete addon --cluster us --name coredns --preserve
This deletes the add-on from the EKS addon manager but keeps all the manifests it generated. Apply your new ConfigMap/Deployment changes and restart the pods as usual.
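The follow-up steps look roughly like this (a sketch; the file name is a placeholder for your customized ConfigMap):

kubectl --namespace kube-system apply -f coredns-configmap.yaml
kubectl --namespace kube-system rollout restart deployment coredns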

@philomory

I think Amazon maybe should have released a custom Helm chart for CoreDNS and kube-proxy similar to the one they released for VPC CNI; that way, people who just want AWS to manage everything for them could use the add-ons, but people who need to customize e.g. their Corefile could do so by deploying via Helm (without having to use the upstream coredns chart, which doesn't line up with what Amazon deploys and cannot really be coerced to do so). Right now we're already using the VPC CNI chart instead of the add-on so we can reliably customize the configuration, and we'd happily do the same for coredns if there were an available chart that actually worked.

@voidlily

voidlily commented May 4, 2022

I've been having good luck using the official coredns helm chart with the eks-optimized images in ECR.

@stevehipwell

@voidlily could you share the values you're using?

@voidlily

voidlily commented May 5, 2022

A fairly standard Helm config should work, with the caveat that we want the existing configuration in a fresh cluster to keep working the same as when AWS's coredns Deployment/Service was installed.

autoscaler:
  enabled: true

image:
  # https://docs.aws.amazon.com/eks/latest/userguide/managing-coredns.html
  repository: "602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/coredns"
  tag: "v1.8.7-eksbuild.1"

service:
  name: "kube-dns"
  # the default kube-dns service IP in a fresh EKS cluster is hardcoded to x.x.x.10; reuse it to avoid reconfiguration
  clusterIP: "10.10.0.10"

Then, before installing the Helm chart, I run this shell script to remove AWS's installation and annotate the existing kube-dns Service so Helm can manage it:

kubectl --namespace kube-system delete deployment coredns
kubectl --namespace kube-system annotate --overwrite service kube-dns meta.helm.sh/release-name=coredns
kubectl --namespace kube-system annotate --overwrite service kube-dns meta.helm.sh/release-namespace=kube-system
kubectl --namespace kube-system label --overwrite service kube-dns app.kubernetes.io/managed-by=Helm
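With the kube-dns Service handed over to Helm, installing the chart itself is roughly the following (a sketch; the repo URL is the upstream CoreDNS chart repo, and the values file name is a placeholder for the values above):

helm repo add coredns https://coredns.github.io/helm
helm upgrade --install coredns coredns/coredns --namespace kube-system --values coredns-values.yaml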

I have a similar helm import script I use when dealing with vpc-cni's helm chart to avoid connectivity interruptions

NAMESPACE="kube-system"
for kind in daemonSet clusterRole clusterRoleBinding serviceAccount; do
  if kubectl --namespace $NAMESPACE get --ignore-not-found $kind/aws-node | grep aws-node; then
    echo "setting annotations and labels on $kind/aws-node"
    kubectl --namespace $NAMESPACE annotate --overwrite $kind aws-node meta.helm.sh/release-name=aws-vpc-cni
    kubectl --namespace $NAMESPACE annotate --overwrite $kind aws-node meta.helm.sh/release-namespace=kube-system
    kubectl --namespace $NAMESPACE label --overwrite $kind aws-node app.kubernetes.io/managed-by=Helm
  else
    echo "skipping $kind/aws-node as it does not exist in $NAMESPACE"
  fi
done

@philomory

@voidlily You don't run into problems with the fact that the ClusterRole used by your new deployment from the chart has fewer permissions than the one normally used by the EKS builds of CoreDNS? EKS's system:coredns ClusterRole grants get on nodes, but the chart one doesn't give that permission.

Mind, it's unclear to me whether the EKS role grants that permission because they've customized coredns such that it needs it, or if it's just something left over from prior versions, or what.
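For comparison, the EKS-managed role's rules can be dumped directly (a quick sketch):

kubectl get clusterrole system:coredns -o yaml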

@voidlily

voidlily commented May 5, 2022

I can't say I've run into issues with the clusterrole differences personally, no

@philomory

I ended up digging into our EKS audit logs in CloudWatch and I can't find a single occurrence where system:serviceaccount:kube-system:coredns has been used to access any resource type other than endpoints, namespaces, and services (presumably pods would have shown up if I had pod verification turned on).

Then I got even more curious and dug into the coredns source code. Here is the only place I could find where CoreDNS tries to query node information:

// GetNodeByName return the node by name. If nothing is found an error is
// returned. This query causes a roundtrip to the k8s API server, so use
// sparingly. Currently this is only used for Federation.
func (dns *dnsControl) GetNodeByName(ctx context.Context, name string) (*api.Node, error) {
	v1node, err := dns.client.CoreV1().Nodes().Get(ctx, name, meta.GetOptions{})
	return v1node, err
}

So I think it's safe to lose that permission. Kubernetes federation isn't (as far as I know) a GA feature, and if it ever becomes one, presumably the official CoreDNS chart will add the permission back in.
