
Invalid count argument #690

Open
tvvignesh opened this issue Sep 26, 2020 · 19 comments
Assignees
Labels
bug (Something isn't working), triaged (Scoped and ready for work), upstream (Work required on Terraform core or provider), v0.13 (Terraform v0.13 issue)

Comments

@tvvignesh

Hi. I tried setting up a GKE private cluster (safer-cluster-update-variant), and whenever I make a mistake (accidentally giving the wrong image name or machine type, and so on), the apply fails (not detected in plan), which is understandable.

But if I fix the issue and run plan and apply again, I get this:

(screenshot of the "Error: Invalid count argument" output omitted)

It has been discussed here:
hashicorp/terraform#21450
hashicorp/terraform#12570

but I am not able to work out how to get past this.

I do understand that it is happening because Terraform is not able to find any node pool in the cluster from which it can determine the count. If I go to .terraform/modules/global_gke.gke.gcloud_wait_for_cluster/main.tf I can see the block where the issue is:

resource "null_resource" "module_depends_on" {
  count = length(var.module_depends_on) > 0 ? 1 : 0

  triggers = {
    value = length(var.module_depends_on)
  }
}

Currently I am deleting the cluster every time and re-creating it from scratch. How can I avoid doing that and just fix this issue? Thanks.

@bharathkkb
Member

Hi @tvvignesh
Could you let me know which versions of Terraform and the GKE module you are running? If you are not on version = "~> 11.1.0", could you try that?

@tvvignesh
Author

@bharathkkb Hi. Running Terraform v0.13.2, GKE 1.18.6-gke.4801, and the latest version of this module.

@bharathkkb
Member

@tvvignesh could you provide your config? I can try to reproduce it.

@tvvignesh
Author

@bharathkkb Sure. This would be the relevant portion of the config. Kindly replace the vars where necessary.

module "global_gke" {
  source = "../modules/safer-cluster-update-variant"

  description                     = "My Cluster"
  project_id                      = module.global_enabled_google_apis.project_id
  name                            = var.global_cluster_name
  region                          = var.global_region
  network                         = module.global_vpc.network_name
  subnetwork                      = module.global_vpc.subnets_names[0]
  horizontal_pod_autoscaling      = true
  enable_vertical_pod_autoscaling = true
  enable_pod_security_policy      = true
  http_load_balancing             = true
  gce_pd_csi_driver               = true
  monitoring_service              = "none"
  logging_service                 = "none"
  release_channel                 = "RAPID"
  enable_shielded_nodes           = true
  ip_range_pods                   = module.global_vpc.subnets_secondary_ranges[0].*.range_name[0]
  ip_range_services               = module.global_vpc.subnets_secondary_ranges[0].*.range_name[1]
  master_authorized_networks = [{
    cidr_block   = "${module.global_bastion.ip_address}/32"
    display_name = "Global Bastion Host"
  }]
  grant_registry_access = true
  node_pools = [
    {
      name            = "global-pool-1"
      machine_type    = "n1-standard-4"
      min_count       = 1
      max_count       = 20
      local_ssd_count = 0
      disk_size_gb    = 30
      disk_type       = "pd-ssd"
      image_type      = "UBUNTU_CONTAINERD"
      auto_repair     = true
      auto_upgrade    = true
      node_metadata   = "GKE_METADATA_SERVER"
      service_account = "${var.global_sa}"
      preemptible     = false
    }
  ]
}

@halkyon

halkyon commented Sep 29, 2020

Having the exact same issue as well. Seems to only happen when you've made an error, and once it gets in this state you can't terraform destroy to start again either.

@morgante
Contributor

@halkyon What was the error you made? Reproducing this will likely require us to see your broken config.

@halkyon

halkyon commented Oct 1, 2020

@morgante Here you go: https://github.com/halkyon/gke-beta-private-cluster-example

Using Terraform v0.13.4.

Change the values in terraform.tfvars to your liking, and do a terraform init && terraform apply to provision a new cluster. Now change the machine_type value in the node_pools variable in terraform.tfvars to something invalid, then terraform apply again, and you'll get an error as expected. Now fix that back up to e2-medium or another valid type, and terraform apply again. This error is shown:

Error: Invalid count argument

  on .terraform/modules/gke.gcloud_delete_default_kube_dns_configmap/main.tf line 63, in resource "null_resource" "module_depends_on":
  63:   count = length(var.module_depends_on) > 0 ? 1 : 0

The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.

Hope this helps!
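
For reference, the tfvars edit that triggers the broken state is just something like this (pool name and attribute values are illustrative, not the exact contents of the repo above):

node_pools = [
  {
    name         = "default-node-pool"
    machine_type = "e2-medium-2" # invalid machine type: this apply fails
    min_count    = 1
    max_count    = 3
  }
]

Reverting machine_type to e2-medium and running terraform apply again is the step that produces the error above.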

@bharathkkb bharathkkb self-assigned this Oct 1, 2020
@mspinassi-medallia

Exact same issue here.

@bharathkkb
Member

I was able to reproduce this with 0.13.4; it seems that after the node pool config errors out, TF is unable to resolve [for pool in google_container_node_pool.pools : pool.name] at plan time. I'll do some more digging for a fix and see if it's just 0.13.4 or all of 0.13.x.

Works as intended with 0.12.29.
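
For context, the wiring inside the cluster module looks roughly like this (a trimmed paraphrase, not the exact source; the other arguments the gcloud helper module requires are omitted):

module "gcloud_wait_for_cluster" {
  source = "terraform-google-modules/gcloud/google"

  # After the failed node pool apply, these pool names are not known at plan
  # time, so the helper's count = length(var.module_depends_on) > 0 ? 1 : 0
  # cannot be evaluated during the refresh/plan, and 0.13.x errors out.
  module_depends_on = [for pool in google_container_node_pool.pools : pool.name]
}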

@bharathkkb bharathkkb added the v0.13 (Terraform v0.13 issue) label Oct 7, 2020
@innovia

innovia commented Oct 8, 2020

Any updates? This happens to me too with 0.13.4, after upgrading the node pool.

@innovia

innovia commented Oct 14, 2020

What's the status on this? It is easy to replicate: put in an invalid machine type, say e2-medium-2, and the apply fails; after that every run hits this error, as if the module is in a bad state.

Can you please fix this?

@morgante
Contributor

Since this is working in Terraform 0.12.x but not in 0.13.x, I'm inclined to believe this is a Terraform Core issue. We can attempt to work around it, but it's not a high priority when Core should be fixing it.

@bharathkkb
Member

I was able to create a minimal repro which works with 0.12.x but not with 0.13.4. I will open an issue in core.
A workaround seems to be to run terraform apply -refresh=false, which bypasses the initial refresh that throws this error.

@github-actions

github-actions bot commented Jan 5, 2021

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the Stale label Jan 5, 2021
@bharathkkb bharathkkb removed the Stale label Jan 7, 2021
@github-actions

github-actions bot commented Mar 8, 2021

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

@github-actions github-actions bot added the Stale label Mar 8, 2021
@morgante morgante added the bug (Something isn't working) and triaged (Scoped and ready for work) labels and removed the Stale label Mar 8, 2021
@AlexBulankou
Contributor

I'm getting this issue with terraform:0.14.7, during the terraform plan phase:

Error: Invalid count argument

  on .terraform/modules/config_sync.configsync_operator.k8sop_manifest/main.tf line 57, in resource "random_id" "cache":
  57:   count = (! local.skip_download) ? 1 : 0

The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.

Any suggestions on the workaround?

@morgante
Contributor

@AlexBulankou Is this for a fresh deploy? What does your module configuration look like?

@AlexBulankou
Contributor

Yes, this is a fresh deploy: module config.

@AlexBulankou
Contributor

To follow up, the workaround for me was to go back to terraform:0.12.29.
