configmaps "aws-auth" already exists #852

Closed
1 task done
moh-abk opened this issue Apr 28, 2020 · 52 comments

Comments

@moh-abk

moh-abk commented Apr 28, 2020

I have issues

I'm submitting a...

  • bug report

What is the current behavior?

When deploying a cluster that uses only managed node_groups, I believe that, because they're managed, AWS creates the aws-auth ConfigMap automatically and joins the nodes to the cluster.

This means that terraform throws the error configmaps "aws-auth" already exists. So the kubernetes_config_map should update and not throw an error saying the configmap already exists

If this is a bug, how to reproduce? Please include a code sample if relevant.

Deploy the cluster using managed node_groups.

What's the expected behavior?

aws-auth config map should not already exist. kubernetes_config_map should apply/update-in-place aws-auth

Are you able to fix this problem and submit a PR? Link here if you have already.

N/A

Environment details

  • Affected module version: v11.1.0
  • OS: Linux
  • Terraform version: 0.12.24
@dpiddockcmp
Contributor

Which version of the module are you using?

This shouldn't be possible if you have manage_aws_auth = true (the default) since v8.0.0.

What is your full module block that produces the problem?

@moh-abk
Author

moh-abk commented Apr 28, 2020

module version: v11.1.0

Curious how manage_aws_auth = true is telling AWS not to automatically create the aws-auth config map?

You can see in the guide here - https://docs.aws.amazon.com/eks/latest/userguide/launch-workers.html

In the Self-managed nodes tab - AWS says you should kubectl apply -f aws-auth-cm.yaml
In the Amazon EKS managed node groups tab - there's no mention of it so it's automatically done

@moh-abk
Author

moh-abk commented Apr 28, 2020

To reproduce this:

  • Create eks cluster with self-managed nodes and manage_aws_auth = false
    kubectl --kubeconfig=kubeconfig_* -n kube-system get cm aws-auth -o yaml
Error from server (NotFound): configmaps "aws-auth" not found
  • Create eks cluster with no self-managed nodes only managed node_groups and manage_aws_auth = false
    kubectl --kubeconfig=kubeconfig_* -n kube-system get cm aws-auth -o yaml
apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::<REDACTED>:role/<REDACTED>
      username: system:node:{{EC2PrivateDNSName}}
kind: ConfigMap
metadata:
  creationTimestamp: "2020-04-28T11:32:48Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "2115"
  selfLink: /api/v1/namespaces/kube-system/configmaps/aws-auth
  uid: 567137c9-a5d0-4f6e-b3ae-6876e796f42b

@dpiddockcmp
Contributor

Yes, AWS does add to the aws-auth config map when creating managed nodes. However there is dependency management in the module to ensure that the aws-auth configmap is applied by terraform in new clusters before attempting to create the managed node groups.

It happens via the null_data_source in node_groups.tf

@JamesDowning

Using module version v12.0.0 and worker_groups_launch_template also produces this error when trying to create a new cluster.

@pit

pit commented May 14, 2020

Same error using module version v11.1.0 + manage_aws_auth = true

Error: configmaps "aws-auth" already exists

  on .terraform/modules/eks/aws_auth.tf line 62, in resource "kubernetes_config_map" "aws_auth":
  62: resource "kubernetes_config_map" "aws_auth" {

@pit

pit commented May 14, 2020

my fault, missed provider "kubernetes" resource in terraform file (see usage example https://github.com/terraform-aws-modules/terraform-aws-eks#usage-example)

@JamesDowning

my fault, missed provider "kubernetes" resource in terraform file (see usage example https://github.com/terraform-aws-modules/terraform-aws-eks#usage-example)

Same mistake, apologies I didn't realise the provider block was required.

@darrenfurr

I'm experiencing the same issue with the latest module: 12.0.0

configmaps "aws-auth" already exists

  on .terraform/modules/kubernetes/terraform-aws-eks-12.0.0/aws_auth.tf line 62, in resource "kubernetes_config_map" "aws_auth":
  62: resource "kubernetes_config_map" "aws_auth" {

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "~> 1.11"
}

I am using the provider already & have manage_aws_auth = true

What am I doing wrong here?

@darrenfurr

darrenfurr commented May 26, 2020

FYI - Just a follow-up on this...I am not using managed nodes & was attempting to add a fargate profile to the same EKS cluster, so that is what was causing my error.

Once I removed the fargate profile & IAM role everything worked with the latest version (12.0.0)

@ibratoev

Same issue here with using the aws_eks_fargate_profile.

@ibratoev

Looking at the codebase, if I add a dependency in the aws_eks_fargate_profile to the module output config_map_aws_auth - it should work...

@kushalgangan

Using v12.0.0 with worker_groups_launch_template and getting below error

Error: configmaps "aws-auth" already exists

@darrenfurr

@ibratoev - did you just add a depends_on=aws_eks_fargate_profile.this to the module?

Or is there a dependencies variable for the module OR is this a PR?

@thpang

thpang commented Jun 10, 2020

Having the same issue with v12.1.0

Error: configmaps "aws-auth" already exists

  on .terraform/modules/eks/terraform-aws-eks-12.1.0/aws_auth.tf line 64, in resource "kubernetes_config_map" "aws_auth":
  64: resource "kubernetes_config_map" "aws_auth" {

This is on a new cluster being created.

@privomark

Just a note from my experience (even though I'm not using the module) if you place the aws_auth resources between the building of the cluster and of the node groups, it works.

My dependency chain "Create Cluster --> Create Auth --> Create Node Groups"
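
The chain above can be sketched in plain Terraform, outside the module. This is only a minimal sketch under stated assumptions, not the module's implementation: the resource names and the three variables are hypothetical, and it assumes the kubernetes provider is already configured against the new cluster.

```hcl
# Hypothetical inputs; none of these names come from the thread.
variable "cluster_role_arn" { type = string }
variable "node_role_arn" { type = string }
variable "subnet_ids" { type = list(string) }

# 1. Create the cluster.
resource "aws_eks_cluster" "this" {
  name     = "example"
  role_arn = var.cluster_role_arn

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}

# 2. Create aws-auth while no node group exists yet, so Terraform owns it.
resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = yamlencode([
      {
        rolearn  = var.node_role_arn
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      },
    ])
  }

  depends_on = [aws_eks_cluster.this]
}

# 3. Create the node group only after the ConfigMap exists.
resource "aws_eks_node_group" "this" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = "example"
  node_role_arn   = var.node_role_arn
  subnet_ids      = var.subnet_ids

  scaling_config {
    desired_size = 1
    max_size     = 3
    min_size     = 1
  }

  depends_on = [kubernetes_config_map.aws_auth]
}
```

The explicit depends_on on the node group is what enforces the ordering; without it Terraform is free to create the node group and the ConfigMap in parallel.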

@thpang

thpang commented Jun 24, 2020

I have found that if you already have a Kubernetes cluster and a ~/.kube/config file pointing to it, the aws-auth ConfigMap is set up there and not in your new AWS EKS cluster. The code does not check whether it's the right cluster; it simply assumes that the current kube config is correct, which is strange given that the cluster is still being created.

My solution was to remove the unwanted aws-auth entity from my other cluster and temporarily remove the kube config file while creating the AWS EKS cluster. After that, all seemed fine.

This seems like one of those use cases where no one considered someone managing several clusters and already having a kube config file pointing to a running system.

@spikewang

One more note regarding the same error. We had pinned our EKS module to an older version, v7.0.0. Once we upgraded to v12.1.0, the same thing happened to us, since our clusters had already existed for a while anyway.

@xoanmi

xoanmi commented Jul 1, 2020

Same error, solved by configuring the kubernetes provider to point to my cluster:

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "~> 1.11"
}

Basic example reference: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/basic/main.tf

@venomwaqar

venomwaqar commented Jul 17, 2020

What's the correct answer for this? I don't understand.
I have followed what @xoanmi has said, still I am having the same issue.

Terraform v0.12.28
+ provider.aws v2.70.0
+ provider.kubernetes v1.11.3
+ provider.null v2.1.2

@esteban1983cl

esteban1983cl commented Jul 18, 2020

Same issue here and I have included the basic example reference:

Module version 12.2.0
Terraform v0.12.28

  • provider.aws v2.70.0
  • provider.external v1.2.0
  • provider.kubernetes v1.11.3
  • provider.local v1.4.0
  • provider.null v2.1.2
  • provider.random v2.3.0
  • provider.template v2.1.2

I'm using worker_groups_launch_template configuration

data "aws_eks_cluster" "swatops_cluster" {
  count = local.swatops_create_cluster ? 1 : 0
  name  = module.swatops_eks.cluster_id
}

data "aws_eks_cluster_auth" "swatops_cluster" {
  count = local.swatops_create_cluster ? 1 : 0
  name  = module.swatops_eks.cluster_id
}

# In case of not creating the cluster, this will be an incompletely configured, unused provider, which poses no problem.
provider "kubernetes" {
  host                   = element(concat(data.aws_eks_cluster.swatops_cluster[*].endpoint, list("")), 0)
  cluster_ca_certificate = base64decode(element(concat(data.aws_eks_cluster.swatops_cluster[*].certificate_authority.0.data, list("")), 0))
  token                  = element(concat(data.aws_eks_cluster_auth.swatops_cluster[*].token, list("")), 0)
  load_config_file       = false
  version                = "~> 1.11"
}

I have other clusters created with a much older version of this module (5.0.0). Maybe the configuration from my other cluster is overlapping with this one?

@esteban1983cl

This error happens when I use Linux but not when I use macOS.

@esteban1983cl

OK, I solved my issue using this configuration:

data "aws_eks_cluster" "green_cluster" {
  count = local.green_create_cluster ? 1 : 0
  name  = module.green_eks.cluster_id
}

data "aws_eks_cluster_auth" "green_cluster" {
  count = local.green_create_cluster ? 1 : 0
  name  = module.green_eks.cluster_id
}

# In case of not creating the cluster, this will be an incompletely configured, unused provider, which poses no problem.
provider "kubernetes" {
  alias                  = "green"
  host                   = element(concat(data.aws_eks_cluster.green_cluster[*].endpoint, list("")), 0)
  cluster_ca_certificate = base64decode(element(concat(data.aws_eks_cluster.green_cluster[*].certificate_authority.0.data, list("")), 0))
  token                  = element(concat(data.aws_eks_cluster_auth.green_cluster[*].token, list("")), 0)
  load_config_file       = false
  version                = "~> 1.11"
}

module "green_eks" {
  source                                = "terraform-aws-modules/eks/aws"
  version                               = "12.2.0"
  providers = {
    kubernetes = kubernetes.green
  }
  create_eks                            = local.green.create_eks
  cluster_name                          = local.green.cluster_name
  cluster_version                       = local.green.cluster_version
  write_kubeconfig                      = local.green.write_kubeconfig
  iam_path                              = local.green.iam_path
  map_roles                             = local.green.map_roles
  map_users                             = local.green.map_users
  subnets                               = local.green.subnets
  vpc_id                                = local.green.vpc_id
  attach_worker_cni_policy              = local.green.attach_worker_cni_policy
  workers_group_defaults                = local.green.workers_group_defaults
  worker_additional_security_group_ids  = [aws_security_group.green_worker_management.id]
  workers_additional_policies           = [aws_iam_policy.green_default_cluster_policy.arn]
  worker_groups_launch_template         = local.green.worker_groups_launch_template
  enable_irsa                           = local.green.enable_irsa
  tags                                  = local.green.tags
}

I hope it serves someone
bye

@ivanmartos

Maybe my experience will help someone. I had an existing EKS cluster created using module version v6.0.1, and I had also prefilled the map_roles variable. After updating to module version 12.2.0, I started seeing this error:

Error: configmaps "aws-auth" already exists

I had correctly set up the kubernetes provider, but the issue was still there.

I resolved it very easily - I imported the aws-auth config map manually

terraform import module.MODULE_NAME.kubernetes_config_map.aws_auth[0] kube-system/aws-auth

that solved it for me

It looks like the possibility of a pre-existing aws-auth config map did not come to mind when this resource was added :)

@nonai

nonai commented Aug 6, 2020

Thank you @ivanmartos , it worked for me.

For my case, I was having this config:
provider "kubernetes" {
  load_config_file = true
  version          = "~> 1.9"
}

The first TF run went fine without errors. However, from the next run onward, it threw this error:
module.bjn_eks_indigo.module.eks.kubernetes_config_map.aws_auth[0]: Creating...
Error: configmaps "aws-auth" already exists

@philicious
Contributor

What might help solve at least some people's problems is to make sure that:

  • you have a provider "kubernetes" block as seen in the examples, so Terraform can authenticate and talk to the new cluster
  • if you have multiple clusters in the same file, you also have multiple kubernetes providers, each with an alias, and you reference the correct one in each eks module, like this:
  providers = {
    kubernetes = kubernetes.eks-unmanaged
  } 

so that TF doesn't use the wrong one (or the default one) and see a configmap that truly exists, but in the wrong cluster.
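
A fuller version of the two-cluster setup might look like the sketch below. It is illustrative only: the names blue and green and the two variables are hypothetical, the module arguments shown are not a complete cluster definition, and load_config_file is assumed to be available because the thread uses the 1.x kubernetes provider.

```hcl
# Hypothetical inputs shared by both clusters.
variable "vpc_id" { type = string }
variable "subnets" { type = list(string) }

# One pair of data sources and one aliased provider per cluster.
data "aws_eks_cluster" "blue" {
  name = module.blue_eks.cluster_id
}

data "aws_eks_cluster_auth" "blue" {
  name = module.blue_eks.cluster_id
}

provider "kubernetes" {
  alias                  = "blue"
  host                   = data.aws_eks_cluster.blue.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.blue.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.blue.token
  load_config_file       = false # never fall back to ~/.kube/config
}

data "aws_eks_cluster" "green" {
  name = module.green_eks.cluster_id
}

data "aws_eks_cluster_auth" "green" {
  name = module.green_eks.cluster_id
}

provider "kubernetes" {
  alias                  = "green"
  host                   = data.aws_eks_cluster.green.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.green.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.green.token
  load_config_file       = false
}

# Each module instance is handed the provider that points at its own cluster.
module "blue_eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "12.2.0"

  providers = {
    kubernetes = kubernetes.blue
  }

  cluster_name = "blue"
  subnets      = var.subnets
  vpc_id       = var.vpc_id
}

module "green_eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "12.2.0"

  providers = {
    kubernetes = kubernetes.green
  }

  cluster_name = "green"
  subnets      = var.subnets
  vpc_id       = var.vpc_id
}
```

Because neither provider is the default (unaliased) one, a module that is missing its providers map should fail loudly rather than silently writing aws-auth into the wrong cluster.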

@charandas

charandas commented Sep 28, 2020

I am doing everything as prescribed: using aliases, using the kubernetes provider, et cetera. I run into this error whenever I start a cluster with manage_aws_auth = false and then, at a later date, try to enable it.

I have a theory about what the issue is. This provider didn't create the configmap, but AWS EKS must have some background jobs that do. When one starts a cluster without letting the module create the aws-auth configmap, there is state drift from Terraform's perspective, and we start seeing this issue.

One can simply run kubectl delete cm aws-auth -n kube-system. Of course, only do this if you are willing to lose any configuration therein, because after that only the cluster bootstrapper can kubectl into the cluster. Since Terraform is my bootstrapper, deleting doesn't cause me any pain.

@soorajpv286

I'm facing the same issue with the code snippet below, which adds custom users to aws-auth while creating an EKS cluster with Terraform:

provider "aws" {
  region = "us-east-1"
}

data "aws_eks_cluster" "cluster" {
  name = module.my-cluster.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.my-cluster.cluster_id
}

module "my-cluster" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = "my-cluster"
  cluster_version = "1.17"
  subnets         = ["subnet_1", "subnet_2"]
  vpc_id          = "vpc_id"

  worker_groups = [
    {
      instance_type = "m4.large"
      asg_max_size  = 1
    }
  ]
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

resource "kubernetes_config_map" "aws_auth_configmap" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = <<YAML
- rolearn: ${aws_iam_role.eks_kubectl_role.arn}
  username: system:node:{{EC2PrivateDNSName}}
  groups:
    - system:bootstrappers
    - system:nodes
- rolearn: ${aws_iam_role.eks_kubectl_role.arn}
  username: kubectl-access-user
  groups:
    - system:masters
YAML
  }
}

@IronCore864

IronCore864 commented Nov 4, 2020

I think the problem is, with managed nodegroup, aws-auth configmap is already created, and terraform Kubernetes provider resource kubernetes_config_map does not support "upsert".
There is probably no way to fix this at the moment except removing this resource and then updating the aws-auth configmap manually after cluster/nodegroup creation is done.
Am I wrong? Did I miss something here?

@instantlinux

instantlinux commented Nov 10, 2020

I just launched using 13.1.0 of terraform-aws-modules/eks/aws, using a colleague's configuration which specified manage_aws_auth=false. Google led me straight here after I was unable to access or fix the new cluster. This is a painful bug because it takes forever to iterate on EKS create/destroy. +1 for someone fixing it, thanks!

@instantlinux

A colleague working with me tonight came up with steps-to-reproduce this issue with 13.1.0:

  • Specify one or more node_groups as part of your module "eks" resource definition
  • Invoke terraform apply
  • Attempt to terraform apply a kubernetes_config_map with the name aws-auth in namespace kube-system

You will then see the error described in this issue's description. Quoting my colleague's explanation about what's apparently happening under the hood with EKS:

When you use managed node groups, AWS takes care of provisioning EC2 instances for your cluster and (this is key) joining them to your cluster. In order for this to work the EC2 node groups need to be told what IAM role to assume so they can join the cluster. This is derived from the aws-auth configmap, and creating node groups forces the creation of the configmap (setting manage_aws_auth to false has no effect)

We want to control the aws-auth configmap in the jobs repo for other reasons.

This is problematic for us (and for no doubt many others who encounter this ticket in a google search) because the API key and secret used by terraform is often (especially if it's the TFE cloud service) different from the one a user might have access to when invoking aws eks cli commands at a shell prompt. You're locked out with no way into your new cluster.

@barryib
Member

barryib commented Nov 12, 2020

I'm probably missing something. I have run the MNG example about 5 times to start managed node groups, without errors.

A colleague working with me tonight came up with steps-to-reproduce this issue with 13.1.0:

  • Specify one or more node_groups as part of your module "eks" resource definition
  • Invoke terraform apply
  • Attempt to terraform apply a kubernetes_config_map with the name aws-auth in namespace kube-system

What do you mean by your third point? Are you doing another terraform apply with your own kubernetes_config_map configuration? Or are you referring to https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/aws_auth.tf#L65 when manage_aws_auth=true?

When you use managed node groups, AWS takes care of provisioning EC2 instances for your cluster and (this is key) joining them to your cluster. In order for this to work the EC2 node groups need to be told what IAM role to assume so they can join the cluster. This is derived from the aws-auth configmap, and creating node groups forces the creation of the configmap (setting manage_aws_auth to false has no effect)

Yes. AWS creates the aws-auth configmap for managed node groups. That's why you have to ensure the correct dependency order during resource creation:

  1. Create your cluster
  2. Create your aws-auth configmap
  3. When everything is setup, create your node groups

For the record, there was probably a race condition with dependencies before v12.2.0, but it should be solved by #867. That PR adds an explicit dependency on the aws-auth configmap. That means Terraform will start creating managed node groups only once the aws-auth configmap has been created by kubernetes_config_map.

FWIW, there are 2 ways to manage the aws-auth configmap:

  1. Let the module do that for you with manage_aws_auth=true (default). Since we use kubernetes provider, you have to configure it.
  2. Manage it yourself and set manage_aws_auth=false.

In both cases, if you use the kubernetes provider, don't forget that Terraform can't manage existing resources that aren't in its state. So you have to ensure that the configmap doesn't exist (this is the case for a new cluster), or you have to import it first (if it already exists). See https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/config_map#import.
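
For readers on a newer Terraform than the versions discussed in this thread, the import can also be declared in configuration rather than run as a one-off CLI command. The sketch below assumes Terraform 1.5 or later (where import blocks were introduced) and a hypothetical module name eks; the resource address and ID are the ones used by the terraform import command quoted elsewhere in this thread.

```hcl
# Assumes Terraform >= 1.5; import blocks do not exist in 0.12.
# "eks" is a hypothetical module name; use the name of your own module block.
import {
  to = module.eks.kubernetes_config_map.aws_auth[0]
  id = "kube-system/aws-auth"
}
```

On the next plan, Terraform should show the existing ConfigMap being imported and then updated in place, instead of trying to create it.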

@instantlinux

instantlinux commented Nov 13, 2020

I'll repeat: if you just give it a tf definition, that includes a node group, you can reproduce the problem (regardless of whether or how you set the manage_aws_auth flag). As a workaround, it should not be necessary to do a 3-step manual procedure "create your cluster", then "create your configmap", then "create your node groups". TF is supposed to be able to automate such procedures.

module "eks" {
  source                        = "terraform-aws-modules/eks/aws"
  version                       = "13.1.0"
  providers = {
    aws     = aws.ash
    aws.ash = aws.ash
  }

  cluster_name                  = "${local.cluster_name}-${local.env}"
  cluster_create_security_group = false
  cluster_iam_role_name         = aws_iam_role.k8s_control_plane.name
  cluster_log_kms_key_id        = aws_kms_key.dev_ash_k8s.arn
  cluster_security_group_id     = aws_security_group.k8s-control_plane.id
  create_eks                    = true
  subnets = concat(data.terraform_remote_state.dev_network.outputs.dev_ash_vpc["preferred_subnets"],
  data.terraform_remote_state.dev_network.outputs.dev_ash_vpc["preferred_public_subnets"])
  worker_create_security_group = false
  vpc_id                       = data.terraform_remote_state.dev_network.outputs.dev_ash_vpc["vpc_id"]

  node_groups_defaults = {
    ami_type         = "AL2_x86_64"
    desired_capacity = 1
    iam_role_arn     = aws_iam_role.k8s_nodes.arn
    min_capacity = 1
    subnets      = data.terraform_remote_state.dev_network.outputs.dev_ash_vpc["preferred_subnets"]
  }

  node_groups = {
    nodes1 = {
      disk_size     = 20
      max_capacity  = 3
      instance_type = "t3.medium"
    }
  }
}

@barryib
Member

barryib commented Nov 13, 2020

I'll repeat: if you just give it a tf definition, that includes a node group, you can reproduce the problem (regardless of whether or how you set the manage_aws_auth flag). As a workaround, it should not be necessary to do a 3-step manual procedure "create your cluster", then "create your configmap", then "create your node groups". TF is supposed to be able to automate such procedures.

What you're calling a 3-step manual procedure is already done by this module. I just explained step by step what this module does for worker groups, managed node groups and fargate profiles:

  1. the module creates the cluster
  2. the module creates the aws-auth configmap if manage_aws_auth is true
  3. the module creates worker groups or MNG or fargate profiles

My english is probably not very good, but I think that the meaning is there.

@barryib
Member

barryib commented Nov 13, 2020

Plus, if you re-read my previous comment: if you set manage_aws_auth to false, you have to manage the aws-auth configmap yourself before the MNG creation. Otherwise AWS creates it during the MNG creation (in this case, you can't use the kubernetes provider directly; you have to import the configmap first or use kubectl).

@schollii

schollii commented Nov 20, 2020

I can confirm that when using managed node group (MNG), setting manage_aws_auth to false does not seem to have any effect: an aws-auth configmap ends up being created anyways. This makes sense because an auth config map is required in order for instances to join the node group. This is not the case for non-managed worker nodes.

Once I deleted the configmap from the cluster, it worked (this should be safe for a non-prod EKS cluster with node groups, since the map will be recreated within a minute, but see @barryib's post below; he makes a good point):

$ kubectl delete cm aws-auth -n kube-system --kubeconfig path/to/kubeconfig
configmap "aws-auth" deleted

Then terraform apply succeeded. There was no need to delete it from terraform state.

If instead I delete the cm from the terraform state (eg terraform state rm module.YOUR_MODULE.data.aws_eks_cluster_auth.cluster[0]), then terraform refresh, terraform state list showed that it came back -- the k8s provider saw it in the cluster but not in the state file so fixed that.

@barryib
Member

barryib commented Nov 20, 2020

@schollii deleting the configmap can be dangerous, because you can lose access to your cluster.

In that situation, I'd suggest importing the configmap instead: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/config_map#import and #852 (comment)

$ terraform import module.MODULE_NAME.kubernetes_config_map.aws_auth[0] kube-system/aws-auth

See also #852 (comment), because Terraform works that way

if you use the kubernetes provider, don't forget that Terraform can't manage existing resources that aren't in its state. So you have to ensure that the configmap doesn't exist (this is the case for a new cluster), or you have to import it first (if it already exists)

@theothermike

FYI I filed this in the AWS Provider github which may fix it: hashicorp/terraform-provider-aws#17333

@schollii

schollii commented Feb 3, 2021

Thanks @barryib the import worked for me, safer than deleting the configmap and perhaps better success rate than deleting the tf resource from state (eg this did not work for me).

@Eslamanwar

Prepare for production downtime due to this bug.
If you have multiple clusters configured in your kubeconfig, it will delete aws_auth, leaving all your nodes in an unhealthy state.
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/v11.1.0/node_groups.tf#L12
This should not be configured like that.

@francoisfaubert

francoisfaubert commented Mar 17, 2021

  providers = {
    kubernetes = kubernetes.eks-unmanaged
  } 

Starting from that code sample and changing it for my use-case fixed both this issue and the symptoms described in #699 for me after upgrading.

If you see a lot of tickets created around these topics, then I would humbly suggest adding an example of configuring multiple providers to the project's documentation. (Hopefully it's not just that I kept missing information that is already somewhere.)

@barryib
Member

barryib commented May 28, 2021

@francoisfaubert you're probably right. Feel free to improve the docs. I'll be happy to review it.

@harsh-cldcvr

Can anyone please help me?

I'm using the following to create the EKS cluster:
main.tf

locals {
  namespace = "test"
  stage     = "dev"

  kubernetes_version         = "1.20"
  tags                       = merge(module.label.tags, tomap({ "kubernetes.io/cluster/${module.label.id}" = "shared" }))
  eks_worker_ami_name_filter = "amazon-eks-node-${local.kubernetes_version}*"
  public_subnets_additional_tags = {
    "kubernetes.io/role/elb" : 1
  }
  private_subnets_additional_tags = {
    "kubernetes.io/role/internal-elb" : 1
  }
}

module "label" {
  source     = "cloudposse/label/null"
  version    = "0.24.1"
  attributes = ["${local.namespace}-${local.stage}-cluster"]
}

module "eks_cluster" {
  source  = "cloudposse/eks-cluster/aws"
  version = "0.39.0"

  namespace = local.namespace
  stage     = local.stage
  tags      = local.tags

  region                       = "REGION"
  vpc_id                       = data.terraform_remote_state.vpc.outputs.vpc.vpc_id
  subnet_ids                   = concat(module.subnets.private_subnet_ids, module.subnets.public_subnet_ids)
  kubernetes_version           = local.kubernetes_version
  local_exec_interpreter       = ["/bin/sh", "-c"]
  oidc_provider_enabled        = true
  enabled_cluster_log_types    = ["api"]
  cluster_log_retention_period = 7

  cluster_encryption_config_enabled                         = true
  cluster_encryption_config_kms_key_id                      = ""
  cluster_encryption_config_kms_key_enable_key_rotation     = true
  cluster_encryption_config_kms_key_deletion_window_in_days = 30
  cluster_encryption_config_kms_key_policy                  = null
  cluster_encryption_config_resources                       = ["secrets"]

  kubernetes_config_map_ignore_role_changes = var.kubernetes_config_map_ignore_role_changes
  workers_role_arns                         = var.existing_worker_role_arns
  
}

module "eks_node_group" {
  source  = "cloudposse/eks-node-group/aws"
  version = "0.20.0"

  namespace = local.namespace
  stage     = local.stage
  tags      = local.tags

  subnet_ids = module.subnets.private_subnet_ids
  cluster_name = (var.existing_cluster_name == "" ?
    module.eks_cluster.eks_cluster_id
  : var.existing_cluster_name)
  instance_types = var.instance_types
  desired_size   = var.desired_size
  min_size       = var.min_size
  max_size       = var.max_size
  disk_size      = var.vm_pd_ssd_size
}

mydata.tf

data "null_data_source" "wait_for_cluster_and_kubernetes_configmap" {
  inputs = {
    cluster_name             = module.eks_cluster.eks_cluster_id
    kubernetes_config_map_id = module.eks_cluster.kubernetes_config_map_id
  }
}

data "aws_eks_cluster" "cluster" {
  name = (var.existing_cluster_name == "" ?
    module.eks_cluster.eks_cluster_id
  : var.existing_cluster_name)
}

data "aws_eks_cluster_auth" "eks" {
  name = (var.existing_cluster_name == "" ?   data.null_data_source.wait_for_cluster_and_kubernetes_configmap.outputs["cluster_name"]
  : var.existing_cluster_name)
}

data "terraform_remote_state" "xyz" {
  backend = "s3"

  config = {
    bucket   = "BUCKET-name"
    key      = "state-key"
    region   = "REGION"
  }
}

variables.tf

variable "kubernetes_config_map_ignore_role_changes" {
  type        = bool
  default     = true
}

variable "existing_cluster_name" {
  type        = string
  default     = ""
}

I am able to create the cluster, but I get this error when re-applying some changes:

Error: configmaps "aws-auth" already exists
│ 
│  with module.eks_cluster.kubernetes_config_map.aws_auth_ignore_changes[0],
│  on .terraform/modules/eks_cluster/auth.tf line 83, in resource "kubernetes_config_map" "aws_auth_ignore_changes":
│  83: resource "kubernetes_config_map" "aws_auth_ignore_changes" {

Note: when re-applying, I update the existing_cluster_name value in variables.tf to the name of the cluster.

Has anyone faced this error before?

Thanks in advance.

@philomory

@harsh-cldcvr It looks like you're using the cloudposse EKS module, not this module. You might want to try the issue tracker in that repository instead.

@daroga0002
Contributor

I have gone through this issue, and it seems the main problem is a configuration where you create multiple clusters with IRSA management from a single codebase, which comes down to a wrong kubernetes provider setup.

Could somebody confirm my understanding?

@stale

stale bot commented Oct 9, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 9, 2021
@stale

stale bot commented Oct 16, 2021

This issue has been automatically closed because it has not had recent activity since being marked as stale.

@stale stale bot closed this as completed Oct 16, 2021
@xpaulnim

xpaulnim commented Dec 3, 2021

@daroga0002 that was the problem in my case. The solution was to use multiple kubernetes providers (one per EKS cluster), each aliased and using its own set of aws_eks_cluster and aws_eks_cluster_auth data sources.

@CodebashingDevOps

CodebashingDevOps commented Dec 21, 2021

Is there any workaround for this? One cluster is online; the second is created, its node group is created, and then it says the configmap already exists...
I reset the configmap manually on both clusters, but when running terraform again, only the first cluster is modified with the roles, even though the plan says cluster v2 will be updated:

# module.cluster_2.kubernetes_config_map.aws_auth[0] will be updated in-place
~ resource "kubernetes_config_map" "aws_auth" {
        binary_data = {}
      ~ data        = {
            "mapAccounts" = <<~EOT
                - "123456789"
            EOT
          ~ "mapRoles"    = <<~EOT
                - "groups":
                  - "system:bootstrappers"
                  - "system:nodes"
              -   "rolearn": "arn:aws:iam::123456789:role/role-cluster1"
              +   "rolearn": "arn:aws:iam::123456789:role/role-cluster2"
                  "username": "system:node:{{EC2PrivateDNSName}}"
             

I confirmed that role-cluster2 is present in cluster2, but nevertheless it's still looking at the first one and seeing a misconfiguration.

@theothermike

theothermike commented Dec 21, 2021

We faced a similar problem at Square: how do we change aws-auth, since it's provisioned by EKS through Terraform?

The solution was not to use TF. Instead, we invented a ConfigMapMergeController (CMMC) that manages the aws-auth configmap from other source configmaps: essentially, it collates a bunch of source configmaps with certain annotations, dedupes and sorts them, and updates the aws-auth configmap with the result.

We used it to add/remove business services, where we used kubernetes namespaces as source/inputs for the mapRoles and mapUsers. That way we could use TF (or flux/argocd) to provision the new namespace, and a configmap INSIDE the namespace, and the CMMC would pick this up and merge it into aws-auth. If we delete the namespace, the configmap would go away, and then the CMMC would prune it from aws-auth as well.

We had also intended to use this to do blue/green EKS cluster transitions, which I believe would solve your problem here.

https://github.com/cashapp/cmmc

@Maxwell2022

If you still face this issue after creating the EKS cluster, and you want to have full control from Terraform, then you can solve it with the following:

Note: we are using Fargate

# Cluster authorization access level
resource "kubernetes_config_map_v1_data" "aws-auth" {
  force = true

  data = {
    "mapRoles" = templatefile("${path.module}/data/role-map-config.tftpl", {
      execution_role_arn = var.execution_role_arn
      admin_role_arn     = var.admin_role_arn
    })
  }

  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }
}

@github-actions

github-actions bot commented Nov 8, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 8, 2022