kops 1.18 validate cluster - empty NODE STATUS table #9100

Closed
fred-vogt opened this issue May 8, 2020 · 16 comments
fred-vogt commented May 8, 2020

Noticed this while testing the latest 1.18.0 alpha and its TF 12 output.

kops validate cluster --name=...  --state=s3://...
Validating cluster ...

INSTANCE GROUPS
NAME			ROLE	MACHINETYPE	MIN	MAX	SUBNETS
master-us-west-2a	Master	c5d.large	1	1	us-west-2a
master-us-west-2b	Master	c5d.large	1	1	us-west-2b
master-us-west-2c	Master	c5d.large	1	1	us-west-2c
nodes-us-west-2a	Node	r5dn.xlarge	1	1	us-west-2a
nodes-us-west-2b	Node	r5dn.xlarge	1	1	us-west-2b
nodes-us-west-2c	Node	r5dn.xlarge	1	1	us-west-2c

NODE STATUS
NAME	ROLE	READY

Your cluster ... is ready

1. What kops version are you running? The command kops version, will display
this information.

$ kops version
Version 1.18.0-alpha.3 (git-27aab12b2)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:48:36Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

kops validate cluster --name=...  --state=s3://...

5. What happened after the commands executed?

NODE STATUS table has no detail rows

6. What did you expect to happen?

Node status table would have an entry for each EC2 instance in the cluster

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

The node list was successfully queried for:
GET https://.../api/v1/nodes 200 OK in 86 milliseconds
Response Headers:
    Audit-Id: 0e1b9784-1725-4e78-85b3-d1977689243c
    Content-Type: application/json
    Date: Fri, 08 May 2020 18:05:49 GMT

9. Anything else we need to know?

$ kubectl get nodes
NAME                                          STATUS   ROLES    AGE     VERSION
ip-10-16-101-19.us-west-2.compute.internal    Ready    node     3h14m   v1.18.2
ip-10-16-108-253.us-west-2.compute.internal   Ready    master   3h26m   v1.18.2
ip-10-16-39-239.us-west-2.compute.internal    Ready    master   3h26m   v1.18.2
ip-10-16-54-131.us-west-2.compute.internal    Ready    node     3h14m   v1.18.2
ip-10-16-65-211.us-west-2.compute.internal    Ready    node     3h14m   v1.18.2
ip-10-16-72-148.us-west-2.compute.internal    Ready    master   3h26m   v1.18.2
@fred-vogt
Author

I can provide a cluster manifest or the response body from /api/v1/nodes if that is helpful.

@olemarkus
Member

More information would be useful. I cannot reproduce this.

@johngmyers
Member

Assuming it's not a case of every instance being skipped with the message "ignoring node with role", the only other case I can see is if awsCloudImplementation.GetCloudGroups() found zero non-bastion ASGs.

It looks like cluster validation might be blind to missing ASGs.
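To illustrate the hypothesis above, here is a minimal Go sketch of how a node table built only from discovered cloud groups could come out empty without any failure being reported. The types `CloudGroup` and `Result`, the function `validate`, and the `checkMissing` switch are all hypothetical names made up for this example; this is not the kops implementation.

```go
package main

import "fmt"

// CloudGroup is a hypothetical stand-in for an autoscaling group and
// the nodes that registered from it.
type CloudGroup struct {
	InstanceGroup string   // instance group this cloud group backs
	NodeNames     []string // nodes that belong to the group
}

// Result is a hypothetical validation result: node table rows plus failures.
type Result struct {
	Nodes    []string
	Failures []string
}

// validate fills the node table by walking the discovered cloud groups,
// so a node is only ever seen through a group. When checkMissing is true
// it also reports instance groups that have no cloud group at all.
func validate(instanceGroups []string, groups map[string]CloudGroup, checkMissing bool) Result {
	var r Result
	for _, g := range groups {
		r.Nodes = append(r.Nodes, g.NodeNames...)
	}
	if checkMissing {
		for _, ig := range instanceGroups {
			if _, ok := groups[ig]; !ok {
				r.Failures = append(r.Failures,
					fmt.Sprintf("instance group %q has no cloud group", ig))
			}
		}
	}
	return r
}

func main() {
	igs := []string{"master-us-west-2a", "nodes-us-west-2a"}

	// Zero groups discovered and no missing-group check:
	// empty table, no failures, so the cluster looks "ready".
	blind := validate(igs, nil, false)
	fmt.Println(len(blind.Nodes), len(blind.Failures))

	// Same input with the check enabled: one failure per instance group.
	strict := validate(igs, nil, true)
	fmt.Println(len(strict.Nodes), len(strict.Failures))
}
```

With zero groups discovered, the first call yields no rows and no failures, which would match the "ready" output with an empty NODE STATUS table; the second call shows one way a missing-group check could surface the problem.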

@johngmyers
Member

/assign

@fred-vogt
Author

@olemarkus, @johngmyers - thanks for getting on this so fast.

Some extra info: this is set up with mostly pre-created AWS resources and TF 12 output.
The cluster full name is {cluster}-{account}.kops.{domain} -> 1-18a-dev.kops.cicdenv.com.
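For reference, a small Go sketch of the naming involved. It assumes the name splits as cluster `1-18a`, account `dev`, domain `cicdenv.com` (the split is a guess), and it derives ASG and Terraform resource names using the patterns visible in the Terraform output posted later in this thread. The helper names `clusterName`, `asgName`, and `terraformName` are hypothetical and only mirror those observed strings; they are not kops functions.

```go
package main

import (
	"fmt"
	"strings"
)

// clusterName follows the {cluster}-{account}.kops.{domain} pattern.
func clusterName(cluster, account, domain string) string {
	return fmt.Sprintf("%s-%s.kops.%s", cluster, account, domain)
}

// asgName mirrors the ASG "name" values seen in the Terraform output:
// masters get a ".masters." infix, nodes do not.
func asgName(instanceGroup, cluster string, master bool) string {
	if master {
		return instanceGroup + ".masters." + cluster
	}
	return instanceGroup + "." + cluster
}

// terraformName mirrors the Terraform resource labels, where dots in the
// ASG name are replaced with dashes.
func terraformName(name string) string {
	return strings.ReplaceAll(name, ".", "-")
}

func main() {
	c := clusterName("1-18a", "dev", "cicdenv.com")
	fmt.Println(c)
	fmt.Println(asgName("nodes-us-west-2a", c, false))
	fmt.Println(terraformName(asgName("master-us-west-2a", c, true)))
}
```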

I'll start the cluster back up and add more details shortly.

@fred-vogt
Author

[Screenshot: kops-9100-ASGs-2020-05-10 23-40-38]
[Screenshot: kops-9100-ec2-instances-2020-05-10 23-42-29]

@fred-vogt
Author

TF 12 output from kops+update -output terraform:

locals {
  cluster_name                 = "1-18a-dev.kops.cicdenv.com"
  master_autoscaling_group_ids = [aws_autoscaling_group.master-us-west-2a-masters-1-18a-dev-kops-cicdenv-com.id, aws_autoscaling_group.master-us-west-2b-masters-1-18a-dev-kops-cicdenv-com.id, aws_autoscaling_group.master-us-west-2c-masters-1-18a-dev-kops-cicdenv-com.id]
  master_security_group_ids    = [aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, "sg-01b7dec4801ca3963", "sg-01b7dec4801ca3963", "sg-01b7dec4801ca3963"]
  node_autoscaling_group_ids   = [aws_autoscaling_group.nodes-us-west-2a-1-18a-dev-kops-cicdenv-com.id, aws_autoscaling_group.nodes-us-west-2b-1-18a-dev-kops-cicdenv-com.id, aws_autoscaling_group.nodes-us-west-2c-1-18a-dev-kops-cicdenv-com.id]
  node_security_group_ids      = [aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, "sg-085fcc2a8cfe305e5", "sg-085fcc2a8cfe305e5", "sg-085fcc2a8cfe305e5"]
  node_subnet_ids              = ["subnet-0230bf84c0ca70352", "subnet-0d9574ecb5a5ac0ee", "subnet-0f984b3662e26d4e6"]
  region                       = "us-west-2"
  subnet_ids                   = ["subnet-0230bf84c0ca70352", "subnet-094d995cd873d3728", "subnet-0af21c86ed70071c8", "subnet-0d9574ecb5a5ac0ee", "subnet-0f984b3662e26d4e6", "subnet-0fe6e7b15f31088c5"]
  subnet_us-west-2a_id         = "subnet-0230bf84c0ca70352"
  subnet_us-west-2b_id         = "subnet-0d9574ecb5a5ac0ee"
  subnet_us-west-2c_id         = "subnet-0f984b3662e26d4e6"
  subnet_utility-us-west-2a_id = "subnet-0fe6e7b15f31088c5"
  subnet_utility-us-west-2b_id = "subnet-094d995cd873d3728"
  subnet_utility-us-west-2c_id = "subnet-0af21c86ed70071c8"
  vpc_id                       = "vpc-05e996360e6d81043"
}

output "cluster_name" {
  value = "1-18a-dev.kops.cicdenv.com"
}

output "master_autoscaling_group_ids" {
  value = [aws_autoscaling_group.master-us-west-2a-masters-1-18a-dev-kops-cicdenv-com.id, aws_autoscaling_group.master-us-west-2b-masters-1-18a-dev-kops-cicdenv-com.id, aws_autoscaling_group.master-us-west-2c-masters-1-18a-dev-kops-cicdenv-com.id]
}

output "master_security_group_ids" {
  value = [aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, "sg-01b7dec4801ca3963", "sg-01b7dec4801ca3963", "sg-01b7dec4801ca3963"]
}

output "node_autoscaling_group_ids" {
  value = [aws_autoscaling_group.nodes-us-west-2a-1-18a-dev-kops-cicdenv-com.id, aws_autoscaling_group.nodes-us-west-2b-1-18a-dev-kops-cicdenv-com.id, aws_autoscaling_group.nodes-us-west-2c-1-18a-dev-kops-cicdenv-com.id]
}

output "node_security_group_ids" {
  value = [aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, "sg-085fcc2a8cfe305e5", "sg-085fcc2a8cfe305e5", "sg-085fcc2a8cfe305e5"]
}

output "node_subnet_ids" {
  value = ["subnet-0230bf84c0ca70352", "subnet-0d9574ecb5a5ac0ee", "subnet-0f984b3662e26d4e6"]
}

output "region" {
  value = "us-west-2"
}

output "subnet_ids" {
  value = ["subnet-0230bf84c0ca70352", "subnet-094d995cd873d3728", "subnet-0af21c86ed70071c8", "subnet-0d9574ecb5a5ac0ee", "subnet-0f984b3662e26d4e6", "subnet-0fe6e7b15f31088c5"]
}

output "subnet_us-west-2a_id" {
  value = "subnet-0230bf84c0ca70352"
}

output "subnet_us-west-2b_id" {
  value = "subnet-0d9574ecb5a5ac0ee"
}

output "subnet_us-west-2c_id" {
  value = "subnet-0f984b3662e26d4e6"
}

output "subnet_utility-us-west-2a_id" {
  value = "subnet-0fe6e7b15f31088c5"
}

output "subnet_utility-us-west-2b_id" {
  value = "subnet-094d995cd873d3728"
}

output "subnet_utility-us-west-2c_id" {
  value = "subnet-0af21c86ed70071c8"
}

output "vpc_id" {
  value = "vpc-05e996360e6d81043"
}

provider "aws" {
  region = "us-west-2"
}

resource "aws_autoscaling_attachment" "master-us-west-2a-masters-1-18a-dev-kops-cicdenv-com" {
  autoscaling_group_name = aws_autoscaling_group.master-us-west-2a-masters-1-18a-dev-kops-cicdenv-com.id
  elb                    = aws_elb.api-1-18a-dev-kops-cicdenv-com.id
}

resource "aws_autoscaling_attachment" "master-us-west-2b-masters-1-18a-dev-kops-cicdenv-com" {
  autoscaling_group_name = aws_autoscaling_group.master-us-west-2b-masters-1-18a-dev-kops-cicdenv-com.id
  elb                    = aws_elb.api-1-18a-dev-kops-cicdenv-com.id
}

resource "aws_autoscaling_attachment" "master-us-west-2c-masters-1-18a-dev-kops-cicdenv-com" {
  autoscaling_group_name = aws_autoscaling_group.master-us-west-2c-masters-1-18a-dev-kops-cicdenv-com.id
  elb                    = aws_elb.api-1-18a-dev-kops-cicdenv-com.id
}

resource "aws_autoscaling_group" "master-us-west-2a-masters-1-18a-dev-kops-cicdenv-com" {
  enabled_metrics      = ["GroupDesiredCapacity", "GroupInServiceInstances", "GroupMaxSize", "GroupMinSize", "GroupPendingInstances", "GroupStandbyInstances", "GroupTerminatingInstances", "GroupTotalInstances"]
  launch_configuration = aws_launch_configuration.master-us-west-2a-masters-1-18a-dev-kops-cicdenv-com.id
  max_size             = 1
  metrics_granularity  = "1Minute"
  min_size             = 1
  name                 = "master-us-west-2a.masters.1-18a-dev.kops.cicdenv.com"
  tag {
    key                 = "KubernetesCluster"
    propagate_at_launch = true
    value               = "1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "Name"
    propagate_at_launch = true
    value               = "master-us-west-2a.masters.1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "k8s.io/cluster-autoscaler/node-template/label/kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "master-us-west-2a"
  }
  tag {
    key                 = "k8s.io/role/master"
    propagate_at_launch = true
    value               = "1"
  }
  tag {
    key                 = "kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "master-us-west-2a"
  }
  tag {
    key                 = "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com"
    propagate_at_launch = true
    value               = "owned"
  }
  vpc_zone_identifier = ["subnet-0230bf84c0ca70352"]
}

resource "aws_autoscaling_group" "master-us-west-2b-masters-1-18a-dev-kops-cicdenv-com" {
  enabled_metrics      = ["GroupDesiredCapacity", "GroupInServiceInstances", "GroupMaxSize", "GroupMinSize", "GroupPendingInstances", "GroupStandbyInstances", "GroupTerminatingInstances", "GroupTotalInstances"]
  launch_configuration = aws_launch_configuration.master-us-west-2b-masters-1-18a-dev-kops-cicdenv-com.id
  max_size             = 1
  metrics_granularity  = "1Minute"
  min_size             = 1
  name                 = "master-us-west-2b.masters.1-18a-dev.kops.cicdenv.com"
  tag {
    key                 = "KubernetesCluster"
    propagate_at_launch = true
    value               = "1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "Name"
    propagate_at_launch = true
    value               = "master-us-west-2b.masters.1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "k8s.io/cluster-autoscaler/node-template/label/kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "master-us-west-2b"
  }
  tag {
    key                 = "k8s.io/role/master"
    propagate_at_launch = true
    value               = "1"
  }
  tag {
    key                 = "kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "master-us-west-2b"
  }
  tag {
    key                 = "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com"
    propagate_at_launch = true
    value               = "owned"
  }
  vpc_zone_identifier = ["subnet-0d9574ecb5a5ac0ee"]
}

resource "aws_autoscaling_group" "master-us-west-2c-masters-1-18a-dev-kops-cicdenv-com" {
  enabled_metrics      = ["GroupDesiredCapacity", "GroupInServiceInstances", "GroupMaxSize", "GroupMinSize", "GroupPendingInstances", "GroupStandbyInstances", "GroupTerminatingInstances", "GroupTotalInstances"]
  launch_configuration = aws_launch_configuration.master-us-west-2c-masters-1-18a-dev-kops-cicdenv-com.id
  max_size             = 1
  metrics_granularity  = "1Minute"
  min_size             = 1
  name                 = "master-us-west-2c.masters.1-18a-dev.kops.cicdenv.com"
  tag {
    key                 = "KubernetesCluster"
    propagate_at_launch = true
    value               = "1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "Name"
    propagate_at_launch = true
    value               = "master-us-west-2c.masters.1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "k8s.io/cluster-autoscaler/node-template/label/kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "master-us-west-2c"
  }
  tag {
    key                 = "k8s.io/role/master"
    propagate_at_launch = true
    value               = "1"
  }
  tag {
    key                 = "kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "master-us-west-2c"
  }
  tag {
    key                 = "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com"
    propagate_at_launch = true
    value               = "owned"
  }
  vpc_zone_identifier = ["subnet-0f984b3662e26d4e6"]
}

resource "aws_autoscaling_group" "nodes-us-west-2a-1-18a-dev-kops-cicdenv-com" {
  enabled_metrics      = ["GroupDesiredCapacity", "GroupInServiceInstances", "GroupMaxSize", "GroupMinSize", "GroupPendingInstances", "GroupStandbyInstances", "GroupTerminatingInstances", "GroupTotalInstances"]
  launch_configuration = aws_launch_configuration.nodes-us-west-2a-1-18a-dev-kops-cicdenv-com.id
  max_size             = 1
  metrics_granularity  = "1Minute"
  min_size             = 1
  name                 = "nodes-us-west-2a.1-18a-dev.kops.cicdenv.com"
  tag {
    key                 = "KubernetesCluster"
    propagate_at_launch = true
    value               = "1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "Name"
    propagate_at_launch = true
    value               = "nodes-us-west-2a.1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "k8s.io/cluster-autoscaler/node-template/label/kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "nodes-us-west-2a"
  }
  tag {
    key                 = "k8s.io/role/node"
    propagate_at_launch = true
    value               = "1"
  }
  tag {
    key                 = "kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "nodes-us-west-2a"
  }
  tag {
    key                 = "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com"
    propagate_at_launch = true
    value               = "owned"
  }
  vpc_zone_identifier = ["subnet-0230bf84c0ca70352"]
}

resource "aws_autoscaling_group" "nodes-us-west-2b-1-18a-dev-kops-cicdenv-com" {
  enabled_metrics      = ["GroupDesiredCapacity", "GroupInServiceInstances", "GroupMaxSize", "GroupMinSize", "GroupPendingInstances", "GroupStandbyInstances", "GroupTerminatingInstances", "GroupTotalInstances"]
  launch_configuration = aws_launch_configuration.nodes-us-west-2b-1-18a-dev-kops-cicdenv-com.id
  max_size             = 1
  metrics_granularity  = "1Minute"
  min_size             = 1
  name                 = "nodes-us-west-2b.1-18a-dev.kops.cicdenv.com"
  tag {
    key                 = "KubernetesCluster"
    propagate_at_launch = true
    value               = "1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "Name"
    propagate_at_launch = true
    value               = "nodes-us-west-2b.1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "k8s.io/cluster-autoscaler/node-template/label/kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "nodes-us-west-2b"
  }
  tag {
    key                 = "k8s.io/role/node"
    propagate_at_launch = true
    value               = "1"
  }
  tag {
    key                 = "kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "nodes-us-west-2b"
  }
  tag {
    key                 = "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com"
    propagate_at_launch = true
    value               = "owned"
  }
  vpc_zone_identifier = ["subnet-0d9574ecb5a5ac0ee"]
}

resource "aws_autoscaling_group" "nodes-us-west-2c-1-18a-dev-kops-cicdenv-com" {
  enabled_metrics      = ["GroupDesiredCapacity", "GroupInServiceInstances", "GroupMaxSize", "GroupMinSize", "GroupPendingInstances", "GroupStandbyInstances", "GroupTerminatingInstances", "GroupTotalInstances"]
  launch_configuration = aws_launch_configuration.nodes-us-west-2c-1-18a-dev-kops-cicdenv-com.id
  max_size             = 1
  metrics_granularity  = "1Minute"
  min_size             = 1
  name                 = "nodes-us-west-2c.1-18a-dev.kops.cicdenv.com"
  tag {
    key                 = "KubernetesCluster"
    propagate_at_launch = true
    value               = "1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "Name"
    propagate_at_launch = true
    value               = "nodes-us-west-2c.1-18a-dev.kops.cicdenv.com"
  }
  tag {
    key                 = "k8s.io/cluster-autoscaler/node-template/label/kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "nodes-us-west-2c"
  }
  tag {
    key                 = "k8s.io/role/node"
    propagate_at_launch = true
    value               = "1"
  }
  tag {
    key                 = "kops.k8s.io/instancegroup"
    propagate_at_launch = true
    value               = "nodes-us-west-2c"
  }
  tag {
    key                 = "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com"
    propagate_at_launch = true
    value               = "owned"
  }
  vpc_zone_identifier = ["subnet-0f984b3662e26d4e6"]
}

resource "aws_ebs_volume" "a-etcd-events-1-18a-dev-kops-cicdenv-com" {
  availability_zone = "us-west-2a"
  encrypted         = true
  kms_key_id        = "arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9"
  size              = 20
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "a.etcd-events.1-18a-dev.kops.cicdenv.com"
    "k8s.io/etcd/events"                               = "a/a,b,c"
    "k8s.io/role/master"                               = "1"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  type = "gp2"
}

resource "aws_ebs_volume" "a-etcd-main-1-18a-dev-kops-cicdenv-com" {
  availability_zone = "us-west-2a"
  encrypted         = true
  kms_key_id        = "arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9"
  size              = 20
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "a.etcd-main.1-18a-dev.kops.cicdenv.com"
    "k8s.io/etcd/main"                                 = "a/a,b,c"
    "k8s.io/role/master"                               = "1"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  type = "gp2"
}

resource "aws_ebs_volume" "b-etcd-events-1-18a-dev-kops-cicdenv-com" {
  availability_zone = "us-west-2b"
  encrypted         = true
  kms_key_id        = "arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9"
  size              = 20
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "b.etcd-events.1-18a-dev.kops.cicdenv.com"
    "k8s.io/etcd/events"                               = "b/a,b,c"
    "k8s.io/role/master"                               = "1"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  type = "gp2"
}

resource "aws_ebs_volume" "b-etcd-main-1-18a-dev-kops-cicdenv-com" {
  availability_zone = "us-west-2b"
  encrypted         = true
  kms_key_id        = "arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9"
  size              = 20
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "b.etcd-main.1-18a-dev.kops.cicdenv.com"
    "k8s.io/etcd/main"                                 = "b/a,b,c"
    "k8s.io/role/master"                               = "1"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  type = "gp2"
}

resource "aws_ebs_volume" "c-etcd-events-1-18a-dev-kops-cicdenv-com" {
  availability_zone = "us-west-2c"
  encrypted         = true
  kms_key_id        = "arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9"
  size              = 20
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "c.etcd-events.1-18a-dev.kops.cicdenv.com"
    "k8s.io/etcd/events"                               = "c/a,b,c"
    "k8s.io/role/master"                               = "1"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  type = "gp2"
}

resource "aws_ebs_volume" "c-etcd-main-1-18a-dev-kops-cicdenv-com" {
  availability_zone = "us-west-2c"
  encrypted         = true
  kms_key_id        = "arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9"
  size              = 20
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "c.etcd-main.1-18a-dev.kops.cicdenv.com"
    "k8s.io/etcd/main"                                 = "c/a,b,c"
    "k8s.io/role/master"                               = "1"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  type = "gp2"
}

resource "aws_elb" "api-1-18a-dev-kops-cicdenv-com" {
  cross_zone_load_balancing = false
  health_check {
    healthy_threshold   = 2
    interval            = 10
    target              = "SSL:443"
    timeout             = 5
    unhealthy_threshold = 2
  }
  idle_timeout = 300
  internal     = true
  listener {
    instance_port      = 443
    instance_protocol  = "TCP"
    lb_port            = 443
    lb_protocol        = "TCP"
    ssl_certificate_id = ""
  }
  name            = "api-1-18a-dev-kops-cicden-kq68md"
  security_groups = [aws_security_group.api-elb-1-18a-dev-kops-cicdenv-com.id, "sg-05648b405f7e40884"]
  subnets         = ["subnet-0230bf84c0ca70352", "subnet-0d9574ecb5a5ac0ee", "subnet-0f984b3662e26d4e6"]
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "api.1-18a-dev.kops.cicdenv.com"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
}

resource "aws_key_pair" "kubernetes-1-18a-dev-kops-cicdenv-com-1ef6296b4cef0a1953af0bf62629a1bd" {
  key_name   = "kubernetes.1-18a-dev.kops.cicdenv.com-1e:f6:29:6b:4c:ef:0a:19:53:af:0b:f6:26:29:a1:bd"
  public_key = file("${path.module}/data/aws_key_pair_kubernetes.1-18a-dev.kops.cicdenv.com-1ef6296b4cef0a1953af0bf62629a1bd_public_key")
}

resource "aws_launch_configuration" "master-us-west-2a-masters-1-18a-dev-kops-cicdenv-com" {
  associate_public_ip_address = false
  enable_monitoring           = true
  ephemeral_block_device {
    device_name  = "/dev/sdc"
    virtual_name = "ephemeral0"
  }
  iam_instance_profile = "kops-master"
  image_id             = "ami-0a5a445878d40f4e8"
  instance_type        = "c5d.large"
  key_name             = aws_key_pair.kubernetes-1-18a-dev-kops-cicdenv-com-1ef6296b4cef0a1953af0bf62629a1bd.id
  lifecycle {
    create_before_destroy = true
  }
  name_prefix = "master-us-west-2a.masters.1-18a-dev.kops.cicdenv.com-"
  root_block_device {
    delete_on_termination = true
    volume_size           = 100
    volume_type           = "gp2"
  }
  security_groups = [aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, "sg-01b7dec4801ca3963"]
  user_data       = file("${path.module}/data/aws_launch_configuration_master-us-west-2a.masters.1-18a-dev.kops.cicdenv.com_user_data")
}

resource "aws_launch_configuration" "master-us-west-2b-masters-1-18a-dev-kops-cicdenv-com" {
  associate_public_ip_address = false
  enable_monitoring           = true
  ephemeral_block_device {
    device_name  = "/dev/sdc"
    virtual_name = "ephemeral0"
  }
  iam_instance_profile = "kops-master"
  image_id             = "ami-0a5a445878d40f4e8"
  instance_type        = "c5d.large"
  key_name             = aws_key_pair.kubernetes-1-18a-dev-kops-cicdenv-com-1ef6296b4cef0a1953af0bf62629a1bd.id
  lifecycle {
    create_before_destroy = true
  }
  name_prefix = "master-us-west-2b.masters.1-18a-dev.kops.cicdenv.com-"
  root_block_device {
    delete_on_termination = true
    volume_size           = 100
    volume_type           = "gp2"
  }
  security_groups = [aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, "sg-01b7dec4801ca3963"]
  user_data       = file("${path.module}/data/aws_launch_configuration_master-us-west-2b.masters.1-18a-dev.kops.cicdenv.com_user_data")
}

resource "aws_launch_configuration" "master-us-west-2c-masters-1-18a-dev-kops-cicdenv-com" {
  associate_public_ip_address = false
  enable_monitoring           = true
  ephemeral_block_device {
    device_name  = "/dev/sdc"
    virtual_name = "ephemeral0"
  }
  iam_instance_profile = "kops-master"
  image_id             = "ami-0a5a445878d40f4e8"
  instance_type        = "c5d.large"
  key_name             = aws_key_pair.kubernetes-1-18a-dev-kops-cicdenv-com-1ef6296b4cef0a1953af0bf62629a1bd.id
  lifecycle {
    create_before_destroy = true
  }
  name_prefix = "master-us-west-2c.masters.1-18a-dev.kops.cicdenv.com-"
  root_block_device {
    delete_on_termination = true
    volume_size           = 100
    volume_type           = "gp2"
  }
  security_groups = [aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id, "sg-01b7dec4801ca3963"]
  user_data       = file("${path.module}/data/aws_launch_configuration_master-us-west-2c.masters.1-18a-dev.kops.cicdenv.com_user_data")
}

resource "aws_launch_configuration" "nodes-us-west-2a-1-18a-dev-kops-cicdenv-com" {
  associate_public_ip_address = false
  enable_monitoring           = true
  ephemeral_block_device {
    device_name  = "/dev/sdc"
    virtual_name = "ephemeral0"
  }
  iam_instance_profile = "kops-node"
  image_id             = "ami-0a5a445878d40f4e8"
  instance_type        = "r5dn.xlarge"
  key_name             = aws_key_pair.kubernetes-1-18a-dev-kops-cicdenv-com-1ef6296b4cef0a1953af0bf62629a1bd.id
  lifecycle {
    create_before_destroy = true
  }
  name_prefix = "nodes-us-west-2a.1-18a-dev.kops.cicdenv.com-"
  root_block_device {
    delete_on_termination = true
    volume_size           = 100
    volume_type           = "gp2"
  }
  security_groups = [aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, "sg-085fcc2a8cfe305e5"]
  user_data       = file("${path.module}/data/aws_launch_configuration_nodes-us-west-2a.1-18a-dev.kops.cicdenv.com_user_data")
}

resource "aws_launch_configuration" "nodes-us-west-2b-1-18a-dev-kops-cicdenv-com" {
  associate_public_ip_address = false
  enable_monitoring           = true
  ephemeral_block_device {
    device_name  = "/dev/sdc"
    virtual_name = "ephemeral0"
  }
  iam_instance_profile = "kops-node"
  image_id             = "ami-0a5a445878d40f4e8"
  instance_type        = "r5dn.xlarge"
  key_name             = aws_key_pair.kubernetes-1-18a-dev-kops-cicdenv-com-1ef6296b4cef0a1953af0bf62629a1bd.id
  lifecycle {
    create_before_destroy = true
  }
  name_prefix = "nodes-us-west-2b.1-18a-dev.kops.cicdenv.com-"
  root_block_device {
    delete_on_termination = true
    volume_size           = 100
    volume_type           = "gp2"
  }
  security_groups = [aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, "sg-085fcc2a8cfe305e5"]
  user_data       = file("${path.module}/data/aws_launch_configuration_nodes-us-west-2b.1-18a-dev.kops.cicdenv.com_user_data")
}

resource "aws_launch_configuration" "nodes-us-west-2c-1-18a-dev-kops-cicdenv-com" {
  associate_public_ip_address = false
  enable_monitoring           = true
  ephemeral_block_device {
    device_name  = "/dev/sdc"
    virtual_name = "ephemeral0"
  }
  iam_instance_profile = "kops-node"
  image_id             = "ami-0a5a445878d40f4e8"
  instance_type        = "r5dn.xlarge"
  key_name             = aws_key_pair.kubernetes-1-18a-dev-kops-cicdenv-com-1ef6296b4cef0a1953af0bf62629a1bd.id
  lifecycle {
    create_before_destroy = true
  }
  name_prefix = "nodes-us-west-2c.1-18a-dev.kops.cicdenv.com-"
  root_block_device {
    delete_on_termination = true
    volume_size           = 100
    volume_type           = "gp2"
  }
  security_groups = [aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id, "sg-085fcc2a8cfe305e5"]
  user_data       = file("${path.module}/data/aws_launch_configuration_nodes-us-west-2c.1-18a-dev.kops.cicdenv.com_user_data")
}

resource "aws_route53_record" "api-1-18a-dev-kops-cicdenv-com" {
  alias {
    evaluate_target_health = false
    name                   = aws_elb.api-1-18a-dev-kops-cicdenv-com.dns_name
    zone_id                = aws_elb.api-1-18a-dev-kops-cicdenv-com.zone_id
  }
  name    = "api.1-18a-dev.kops.cicdenv.com"
  type    = "A"
  zone_id = "/hostedzone/Z09422972460V9DOBEONQ"
}

resource "aws_security_group_rule" "all-master-to-master" {
  from_port                = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  source_security_group_id = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  to_port                  = 0
  type                     = "ingress"
}

resource "aws_security_group_rule" "all-master-to-node" {
  from_port                = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id
  source_security_group_id = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  to_port                  = 0
  type                     = "ingress"
}

resource "aws_security_group_rule" "all-node-to-node" {
  from_port                = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id
  source_security_group_id = aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id
  to_port                  = 0
  type                     = "ingress"
}

resource "aws_security_group_rule" "api-elb-egress" {
  cidr_blocks       = ["0.0.0.0/0"]
  from_port         = 0
  protocol          = "-1"
  security_group_id = aws_security_group.api-elb-1-18a-dev-kops-cicdenv-com.id
  to_port           = 0
  type              = "egress"
}

resource "aws_security_group_rule" "https-api-elb-0-0-0-0--0" {
  cidr_blocks       = ["0.0.0.0/0"]
  from_port         = 443
  protocol          = "tcp"
  security_group_id = aws_security_group.api-elb-1-18a-dev-kops-cicdenv-com.id
  to_port           = 443
  type              = "ingress"
}

resource "aws_security_group_rule" "https-elb-to-master" {
  from_port                = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  source_security_group_id = aws_security_group.api-elb-1-18a-dev-kops-cicdenv-com.id
  to_port                  = 443
  type                     = "ingress"
}

resource "aws_security_group_rule" "icmp-pmtu-api-elb-0-0-0-0--0" {
  cidr_blocks       = ["0.0.0.0/0"]
  from_port         = 3
  protocol          = "icmp"
  security_group_id = aws_security_group.api-elb-1-18a-dev-kops-cicdenv-com.id
  to_port           = 4
  type              = "ingress"
}

resource "aws_security_group_rule" "master-egress" {
  cidr_blocks       = ["0.0.0.0/0"]
  from_port         = 0
  protocol          = "-1"
  security_group_id = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  to_port           = 0
  type              = "egress"
}

resource "aws_security_group_rule" "node-egress" {
  cidr_blocks       = ["0.0.0.0/0"]
  from_port         = 0
  protocol          = "-1"
  security_group_id = aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id
  to_port           = 0
  type              = "egress"
}

resource "aws_security_group_rule" "node-to-master-tcp-1-2379" {
  from_port                = 1
  protocol                 = "tcp"
  security_group_id        = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  source_security_group_id = aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id
  to_port                  = 2379
  type                     = "ingress"
}

resource "aws_security_group_rule" "node-to-master-tcp-2382-4000" {
  from_port                = 2382
  protocol                 = "tcp"
  security_group_id        = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  source_security_group_id = aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id
  to_port                  = 4000
  type                     = "ingress"
}

resource "aws_security_group_rule" "node-to-master-tcp-4003-65535" {
  from_port                = 4003
  protocol                 = "tcp"
  security_group_id        = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  source_security_group_id = aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id
  to_port                  = 65535
  type                     = "ingress"
}

resource "aws_security_group_rule" "node-to-master-udp-1-65535" {
  from_port                = 1
  protocol                 = "udp"
  security_group_id        = aws_security_group.masters-1-18a-dev-kops-cicdenv-com.id
  source_security_group_id = aws_security_group.nodes-1-18a-dev-kops-cicdenv-com.id
  to_port                  = 65535
  type                     = "ingress"
}

resource "aws_security_group" "api-elb-1-18a-dev-kops-cicdenv-com" {
  description = "Security group for api ELB"
  name        = "api-elb.1-18a-dev.kops.cicdenv.com"
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "api-elb.1-18a-dev.kops.cicdenv.com"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  vpc_id = "vpc-05e996360e6d81043"
}

resource "aws_security_group" "masters-1-18a-dev-kops-cicdenv-com" {
  description = "Security group for masters"
  name        = "masters.1-18a-dev.kops.cicdenv.com"
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "masters.1-18a-dev.kops.cicdenv.com"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  vpc_id = "vpc-05e996360e6d81043"
}

resource "aws_security_group" "nodes-1-18a-dev-kops-cicdenv-com" {
  description = "Security group for nodes"
  name        = "nodes.1-18a-dev.kops.cicdenv.com"
  tags = {
    "KubernetesCluster"                                = "1-18a-dev.kops.cicdenv.com"
    "Name"                                             = "nodes.1-18a-dev.kops.cicdenv.com"
    "kubernetes.io/cluster/1-18a-dev.kops.cicdenv.com" = "owned"
  }
  vpc_id = "vpc-05e996360e6d81043"
}

terraform {
  required_version = ">= 0.12.0"
}

@fred-vogt
Author

fred-vogt commented May 11, 2020

@johngmyers - I can look through the kops Go sources on Monday.

Running kops validate with -v 10 didn't produce any relevant messages, as far as I can tell.

I stopped all instances after creating this issue and have just relaunched them; same result.

The validate output still has an empty node status table; the cluster seems fine otherwise.

KUBECONFIG=... kops validate cluster --name=1-18a-dev.kops.cicdenv.com --state=s3://kops.cicdenv.com
Validating cluster 1-18a-dev.kops.cicdenv.com

INSTANCE GROUPS
NAME			ROLE	MACHINETYPE	MIN	MAX	SUBNETS
master-us-west-2a	Master	c5d.large	1	1	us-west-2a
master-us-west-2b	Master	c5d.large	1	1	us-west-2b
master-us-west-2c	Master	c5d.large	1	1	us-west-2c
nodes-us-west-2a	Node	r5dn.xlarge	1	1	us-west-2a
nodes-us-west-2b	Node	r5dn.xlarge	1	1	us-west-2b
nodes-us-west-2c	Node	r5dn.xlarge	1	1	us-west-2c

NODE STATUS
NAME	ROLE	READY

Your cluster 1-18a-dev.kops.cicdenv.com is ready
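
For anyone who wants to catch this condition in a script, here is a minimal sketch that parses the validate output and flags the case where instance groups are listed but the NODE STATUS table has no rows. It assumes the text layout shown above (a section title, one column-header row, data rows, then a blank line). The names `parse_validate_output` and `looks_suspicious` are illustrative and are not part of kops.

```python
def parse_validate_output(text):
    """Split `kops validate cluster` text output into its tables.

    Returns a dict mapping a section title (e.g. "NODE STATUS") to a list of
    data rows, each row being a list of whitespace-separated cells.
    Assumes the layout seen in this issue, not a documented format.
    """
    sections = {}
    current = None
    header_seen = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in ("INSTANCE GROUPS", "NODE STATUS"):
            current = stripped
            sections[current] = []
            header_seen = False
        elif current is not None:
            if not stripped:
                current = None       # a blank line ends the table
            elif not header_seen:
                header_seen = True   # skip the column-header row
            else:
                sections[current].append(stripped.split())
    return sections


def looks_suspicious(text):
    """True when instance groups are listed but the node table is empty."""
    tables = parse_validate_output(text)
    return bool(tables.get("INSTANCE GROUPS")) and not tables.get("NODE STATUS")


# Shortened copy of the output above (two of the six instance groups).
SAMPLE = """\
Validating cluster 1-18a-dev.kops.cicdenv.com

INSTANCE GROUPS
NAME\t\t\tROLE\tMACHINETYPE\tMIN\tMAX\tSUBNETS
master-us-west-2a\tMaster\tc5d.large\t1\t1\tus-west-2a
nodes-us-west-2a\tNode\tr5dn.xlarge\t1\t1\tus-west-2a

NODE STATUS
NAME\tROLE\tREADY

Your cluster 1-18a-dev.kops.cicdenv.com is ready
"""

if __name__ == "__main__":
    print("suspicious:", looks_suspicious(SAMPLE))
```

A check like this only inspects the printed text; it says nothing about why the table is empty.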

@fred-vogt
Author

fred-vogt commented May 11, 2020

I looked at the code a little just now:

The "NODE STATUS" table is printed in the cmd module.

The items in ValidationCluster.Nodes seem to be set in ValidationCluster::validateNodes().

@fred-vogt
Author

Cluster manifest:

---
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: 1-18a-dev.kops.cicdenv.com
spec:
  addons:
  - manifest: s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/addons/custom-channel.yaml
  api:
    loadBalancer:
      type: Internal
      additionalSecurityGroups: [sg-05648b405f7e40884]
  authentication:
    aws:
      image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-iam-authenticator:v0.4.0
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com
  dnsZone: Z09422972460V9DOBEONQ
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-west-2a
      name: a
      encryptedVolume: true
      kmsKeyId: arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9
    - instanceGroup: master-us-west-2b
      name: b
      encryptedVolume: true
      kmsKeyId: arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9
    - instanceGroup: master-us-west-2c
      name: c
      encryptedVolume: true
      kmsKeyId: arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9
    name: main
  - etcdMembers:
    - instanceGroup: master-us-west-2a
      name: a
      encryptedVolume: true
      kmsKeyId: arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9
    - instanceGroup: master-us-west-2b
      name: b
      encryptedVolume: true
      kmsKeyId: arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9
    - instanceGroup: master-us-west-2c
      name: c
      encryptedVolume: true
      kmsKeyId: arn:aws:kms:us-west-2:977594567050:key/40375c75-582c-4572-94cb-c6fd87e5bfa9
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeDNS:
    provider: CoreDNS
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubelet:
    anonymousAuth: false
  kubernetesVersion: 1.18.2
  masterInternalName: api.internal.1-18a-dev.kops.cicdenv.com
  masterPublicName: api.1-18a-dev.kops.cicdenv.com
  networkCIDR: 10.16.0.0/16
  networkID: vpc-05e996360e6d81043
  networking:
    canal: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  subnets:
  - cidr: 10.16.32.0/19
    id: subnet-0230bf84c0ca70352
    name: us-west-2a
    type: Private
    zone: us-west-2a
  - cidr: 10.16.64.0/19
    id: subnet-0d9574ecb5a5ac0ee
    name: us-west-2b
    type: Private
    zone: us-west-2b
  - cidr: 10.16.96.0/19
    id: subnet-0f984b3662e26d4e6
    name: us-west-2c
    type: Private
    zone: us-west-2c
  - cidr: 10.16.0.0/22
    id: subnet-0fe6e7b15f31088c5
    name: utility-us-west-2a
    type: Utility
    zone: us-west-2a
  - cidr: 10.16.4.0/22
    id: subnet-094d995cd873d3728
    name: utility-us-west-2b
    type: Utility
    zone: us-west-2b
  - cidr: 10.16.8.0/22
    id: subnet-0af21c86ed70071c8
    name: utility-us-west-2c
    type: Utility
    zone: us-west-2c
  topology:
    dns:
      type: Private
    masters: private
    nodes: private
  docker:
    version: "19.03.8"

---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: 1-18a-dev.kops.cicdenv.com
  name: master-us-west-2a
spec:
  image: ami-0a5a445878d40f4e8
  machineType: c5d.large
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-west-2a
  role: Master
  rootVolumeSize: 100
  iam:
    profile: arn:aws:iam::977594567050:instance-profile/kops-master
  additionalSecurityGroups: [sg-01b7dec4801ca3963]
  subnets:
  - us-west-2a
  cloudLabels:

  detailedInstanceMonitoring: true
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: 1-18a-dev.kops.cicdenv.com
  name: master-us-west-2b
spec:
  image: ami-0a5a445878d40f4e8
  machineType: c5d.large
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-west-2b
  role: Master
  rootVolumeSize: 100
  iam:
    profile: arn:aws:iam::977594567050:instance-profile/kops-master
  additionalSecurityGroups: [sg-01b7dec4801ca3963]
  subnets:
  - us-west-2b
  cloudLabels:

  detailedInstanceMonitoring: true
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: 1-18a-dev.kops.cicdenv.com
  name: master-us-west-2c
spec:
  image: ami-0a5a445878d40f4e8
  machineType: c5d.large
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-west-2c
  role: Master
  rootVolumeSize: 100
  iam:
    profile: arn:aws:iam::977594567050:instance-profile/kops-master
  additionalSecurityGroups: [sg-01b7dec4801ca3963]
  subnets:
  - us-west-2c
  cloudLabels:

  detailedInstanceMonitoring: true
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: 1-18a-dev.kops.cicdenv.com
  name: nodes-us-west-2a
spec:
  image: ami-0a5a445878d40f4e8
  machineType: r5dn.xlarge
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-us-west-2a
  role: Node
  rootVolumeSize: 100
  iam:
    profile: arn:aws:iam::977594567050:instance-profile/kops-node
  additionalSecurityGroups: [sg-085fcc2a8cfe305e5]
  subnets:
  - us-west-2a
  cloudLabels:

  detailedInstanceMonitoring: true
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: 1-18a-dev.kops.cicdenv.com
  name: nodes-us-west-2b
spec:
  image: ami-0a5a445878d40f4e8
  machineType: r5dn.xlarge
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-us-west-2b
  role: Node
  rootVolumeSize: 100
  iam:
    profile: arn:aws:iam::977594567050:instance-profile/kops-node
  additionalSecurityGroups: [sg-085fcc2a8cfe305e5]
  subnets:
  - us-west-2b
  cloudLabels:

  detailedInstanceMonitoring: true
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: 1-18a-dev.kops.cicdenv.com
  name: nodes-us-west-2c
spec:
  image: ami-0a5a445878d40f4e8
  machineType: r5dn.xlarge
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-us-west-2c
  role: Node
  rootVolumeSize: 100
  iam:
    profile: arn:aws:iam::977594567050:instance-profile/kops-node
  additionalSecurityGroups: [sg-085fcc2a8cfe305e5]
  subnets:
  - us-west-2c
  cloudLabels:

  detailedInstanceMonitoring: true

@fred-vogt
Author

fred-vogt commented May 11, 2020

Debug output - without the payloads from kube API (which are very long)

  • upup/pkg/fi/cloudup/awsup/aws_cloud.go#L538
    • ... aws_cloud.go:538] Listing all Autoscaling groups matching cluster tags
kops validate cluster --name=1-18a-dev.kops.cicdenv.com --state=s3://kops.cicdenv.com -v 10
I0511 00:55:46.584500      12 factory.go:68] state store s3://kops.cicdenv.com
...
I0511 00:55:47.057497      12 s3context.go:210] found bucket in region "us-west-2"
I0511 00:55:47.063907      12 s3fs.go:219] Reading file "s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/config"
I0511 00:55:47.290576      12 aws_cloud.go:1312] Querying EC2 for all valid zones in region "us-west-2"
I0511 00:55:47.291125      12 request_logger.go:45] AWS request: ec2/DescribeAvailabilityZones
I0511 00:55:47.439608      12 s3fs.go:256] Listing objects in S3 bucket "kops.cicdenv.com" with prefix "1-18a-dev.kops.cicdenv.com/instancegroup/"
I0511 00:55:47.569884      12 s3fs.go:284] Listed files in s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup: [s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/master-us-west-2a s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/master-us-west-2b s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/master-us-west-2c s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/nodes-us-west-2a s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/nodes-us-west-2b s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/nodes-us-west-2c]
I0511 00:55:47.569972      12 s3fs.go:219] Reading file "s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/master-us-west-2a"
I0511 00:55:47.657104      12 s3fs.go:219] Reading file "s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/master-us-west-2b"
I0511 00:55:47.766475      12 s3fs.go:219] Reading file "s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/master-us-west-2c"
I0511 00:55:47.807651      12 s3fs.go:219] Reading file "s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/nodes-us-west-2a"
I0511 00:55:47.902113      12 s3fs.go:219] Reading file "s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/nodes-us-west-2b"
I0511 00:55:47.990313      12 s3fs.go:219] Reading file "s3://kops.cicdenv.com/1-18a-dev.kops.cicdenv.com/instancegroup/nodes-us-west-2c"
Validating cluster 1-18a-dev.kops.cicdenv.com

I0511 00:55:48.071384      12 validate_cluster.go:119] instance group: kops.InstanceGroupSpec{Role:"Master", Image:"ami-0a5a445878d40f4e8", MinSize:(*int32)(0xc0007df63c), MaxSize:(*int32)(0xc0007df630), MachineType:"c5d.large", RootVolumeSize:(*int32)(0xc0007df658), RootVolumeType:(*string)(nil), RootVolumeIops:(*int32)(nil), RootVolumeOptimization:(*bool)(nil), RootVolumeDeleteOnTermination:(*bool)(nil), Volumes:[]*kops.VolumeSpec(nil), VolumeMounts:[]*kops.VolumeMountSpec(nil), Subnets:[]string{"us-west-2a"}, Zones:[]string(nil), Hooks:[]kops.HookSpec(nil), MaxPrice:(*string)(nil), SpotDurationInMinutes:(*int64)(nil), AssociatePublicIP:(*bool)(nil), AdditionalSecurityGroups:[]string{"sg-01b7dec4801ca3963"}, CloudLabels:map[string]string(nil), NodeLabels:map[string]string{"kops.k8s.io/instancegroup":"master-us-west-2a"}, FileAssets:[]kops.FileAssetSpec(nil), Tenancy:"", Kubelet:(*kops.KubeletConfigSpec)(nil), Taints:[]string(nil), MixedInstancesPolicy:(*kops.MixedInstancesPolicySpec)(nil), AdditionalUserData:[]kops.UserData(nil), SuspendProcesses:[]string(nil), ExternalLoadBalancers:[]kops.LoadBalancer(nil), DetailedInstanceMonitoring:(*bool)(0xc0007df60c), IAM:(*kops.IAMProfileSpec)(0xc000572da8), SecurityGroupOverride:(*string)(nil), InstanceProtection:(*bool)(nil), SysctlParameters:[]string(nil), RollingUpdate:(*kops.RollingUpdate)(nil)}

I0511 00:55:48.071478      12 validate_cluster.go:119] instance group: kops.InstanceGroupSpec{Role:"Master", Image:"ami-0a5a445878d40f4e8", MinSize:(*int32)(0xc00081458c), MaxSize:(*int32)(0xc000814580), MachineType:"c5d.large", RootVolumeSize:(*int32)(0xc0008145a8), RootVolumeType:(*string)(nil), RootVolumeIops:(*int32)(nil), RootVolumeOptimization:(*bool)(nil), RootVolumeDeleteOnTermination:(*bool)(nil), Volumes:[]*kops.VolumeSpec(nil), VolumeMounts:[]*kops.VolumeMountSpec(nil), Subnets:[]string{"us-west-2b"}, Zones:[]string(nil), Hooks:[]kops.HookSpec(nil), MaxPrice:(*string)(nil), SpotDurationInMinutes:(*int64)(nil), AssociatePublicIP:(*bool)(nil), AdditionalSecurityGroups:[]string{"sg-01b7dec4801ca3963"}, CloudLabels:map[string]string(nil), NodeLabels:map[string]string{"kops.k8s.io/instancegroup":"master-us-west-2b"}, FileAssets:[]kops.FileAssetSpec(nil), Tenancy:"", Kubelet:(*kops.KubeletConfigSpec)(nil), Taints:[]string(nil), MixedInstancesPolicy:(*kops.MixedInstancesPolicySpec)(nil), AdditionalUserData:[]kops.UserData(nil), SuspendProcesses:[]string(nil), ExternalLoadBalancers:[]kops.LoadBalancer(nil), DetailedInstanceMonitoring:(*bool)(0xc000814554), IAM:(*kops.IAMProfileSpec)(0xc00000e4f0), SecurityGroupOverride:(*string)(nil), InstanceProtection:(*bool)(nil), SysctlParameters:[]string(nil), RollingUpdate:(*kops.RollingUpdate)(nil)}

I0511 00:55:48.071619      12 validate_cluster.go:119] instance group: kops.InstanceGroupSpec{Role:"Master", Image:"ami-0a5a445878d40f4e8", MinSize:(*int32)(0xc000814c1c), MaxSize:(*int32)(0xc000814c10), MachineType:"c5d.large", RootVolumeSize:(*int32)(0xc000814c38), RootVolumeType:(*string)(nil), RootVolumeIops:(*int32)(nil), RootVolumeOptimization:(*bool)(nil), RootVolumeDeleteOnTermination:(*bool)(nil), Volumes:[]*kops.VolumeSpec(nil), VolumeMounts:[]*kops.VolumeMountSpec(nil), Subnets:[]string{"us-west-2c"}, Zones:[]string(nil), Hooks:[]kops.HookSpec(nil), MaxPrice:(*string)(nil), SpotDurationInMinutes:(*int64)(nil), AssociatePublicIP:(*bool)(nil), AdditionalSecurityGroups:[]string{"sg-01b7dec4801ca3963"}, CloudLabels:map[string]string(nil), NodeLabels:map[string]string{"kops.k8s.io/instancegroup":"master-us-west-2c"}, FileAssets:[]kops.FileAssetSpec(nil), Tenancy:"", Kubelet:(*kops.KubeletConfigSpec)(nil), Taints:[]string(nil), MixedInstancesPolicy:(*kops.MixedInstancesPolicySpec)(nil), AdditionalUserData:[]kops.UserData(nil), SuspendProcesses:[]string(nil), ExternalLoadBalancers:[]kops.LoadBalancer(nil), DetailedInstanceMonitoring:(*bool)(0xc000814bd4), IAM:(*kops.IAMProfileSpec)(0xc00000e670), SecurityGroupOverride:(*string)(nil), InstanceProtection:(*bool)(nil), SysctlParameters:[]string(nil), RollingUpdate:(*kops.RollingUpdate)(nil)}

I0511 00:55:48.071841      12 validate_cluster.go:119] instance group: kops.InstanceGroupSpec{Role:"Node", Image:"ami-0a5a445878d40f4e8", MinSize:(*int32)(0xc000834198), MaxSize:(*int32)(0xc000834188), MachineType:"r5dn.xlarge", RootVolumeSize:(*int32)(0xc0008341c4), RootVolumeType:(*string)(nil), RootVolumeIops:(*int32)(nil), RootVolumeOptimization:(*bool)(nil), RootVolumeDeleteOnTermination:(*bool)(nil), Volumes:[]*kops.VolumeSpec(nil), VolumeMounts:[]*kops.VolumeMountSpec(nil), Subnets:[]string{"us-west-2a"}, Zones:[]string(nil), Hooks:[]kops.HookSpec(nil), MaxPrice:(*string)(nil), SpotDurationInMinutes:(*int64)(nil), AssociatePublicIP:(*bool)(nil), AdditionalSecurityGroups:[]string{"sg-085fcc2a8cfe305e5"}, CloudLabels:map[string]string(nil), NodeLabels:map[string]string{"kops.k8s.io/instancegroup":"nodes-us-west-2a"}, FileAssets:[]kops.FileAssetSpec(nil), Tenancy:"", Kubelet:(*kops.KubeletConfigSpec)(nil), Taints:[]string(nil), MixedInstancesPolicy:(*kops.MixedInstancesPolicySpec)(nil), AdditionalUserData:[]kops.UserData(nil), SuspendProcesses:[]string(nil), ExternalLoadBalancers:[]kops.LoadBalancer(nil), DetailedInstanceMonitoring:(*bool)(0xc000834144), IAM:(*kops.IAMProfileSpec)(0xc0001f4240), SecurityGroupOverride:(*string)(nil), InstanceProtection:(*bool)(nil), SysctlParameters:[]string(nil), RollingUpdate:(*kops.RollingUpdate)(nil)}

I0511 00:55:48.071944      12 validate_cluster.go:119] instance group: kops.InstanceGroupSpec{Role:"Node", Image:"ami-0a5a445878d40f4e8", MinSize:(*int32)(0xc000815608), MaxSize:(*int32)(0xc0008155f8), MachineType:"r5dn.xlarge", RootVolumeSize:(*int32)(0xc000815644), RootVolumeType:(*string)(nil), RootVolumeIops:(*int32)(nil), RootVolumeOptimization:(*bool)(nil), RootVolumeDeleteOnTermination:(*bool)(nil), Volumes:[]*kops.VolumeSpec(nil), VolumeMounts:[]*kops.VolumeMountSpec(nil), Subnets:[]string{"us-west-2b"}, Zones:[]string(nil), Hooks:[]kops.HookSpec(nil), MaxPrice:(*string)(nil), SpotDurationInMinutes:(*int64)(nil), AssociatePublicIP:(*bool)(nil), AdditionalSecurityGroups:[]string{"sg-085fcc2a8cfe305e5"}, CloudLabels:map[string]string(nil), NodeLabels:map[string]string{"kops.k8s.io/instancegroup":"nodes-us-west-2b"}, FileAssets:[]kops.FileAssetSpec(nil), Tenancy:"", Kubelet:(*kops.KubeletConfigSpec)(nil), Taints:[]string(nil), MixedInstancesPolicy:(*kops.MixedInstancesPolicySpec)(nil), AdditionalUserData:[]kops.UserData(nil), SuspendProcesses:[]string(nil), ExternalLoadBalancers:[]kops.LoadBalancer(nil), DetailedInstanceMonitoring:(*bool)(0xc0008155b4), IAM:(*kops.IAMProfileSpec)(0xc00000e6d8), SecurityGroupOverride:(*string)(nil), InstanceProtection:(*bool)(nil), SysctlParameters:[]string(nil), RollingUpdate:(*kops.RollingUpdate)(nil)}

I0511 00:55:48.072035      12 validate_cluster.go:119] instance group: kops.InstanceGroupSpec{Role:"Node", Image:"ami-0a5a445878d40f4e8", MinSize:(*int32)(0xc000834bb8), MaxSize:(*int32)(0xc000834b98), MachineType:"r5dn.xlarge", RootVolumeSize:(*int32)(0xc000834be4), RootVolumeType:(*string)(nil), RootVolumeIops:(*int32)(nil), RootVolumeOptimization:(*bool)(nil), RootVolumeDeleteOnTermination:(*bool)(nil), Volumes:[]*kops.VolumeSpec(nil), VolumeMounts:[]*kops.VolumeMountSpec(nil), Subnets:[]string{"us-west-2c"}, Zones:[]string(nil), Hooks:[]kops.HookSpec(nil), MaxPrice:(*string)(nil), SpotDurationInMinutes:(*int64)(nil), AssociatePublicIP:(*bool)(nil), AdditionalSecurityGroups:[]string{"sg-085fcc2a8cfe305e5"}, CloudLabels:map[string]string(nil), NodeLabels:map[string]string{"kops.k8s.io/instancegroup":"nodes-us-west-2c"}, FileAssets:[]kops.FileAssetSpec(nil), Tenancy:"", Kubelet:(*kops.KubeletConfigSpec)(nil), Taints:[]string(nil), MixedInstancesPolicy:(*kops.MixedInstancesPolicySpec)(nil), AdditionalUserData:[]kops.UserData(nil), SuspendProcesses:[]string(nil), ExternalLoadBalancers:[]kops.LoadBalancer(nil), DetailedInstanceMonitoring:(*bool)(0xc000834b34), IAM:(*kops.IAMProfileSpec)(0xc0001f42a0), SecurityGroupOverride:(*string)(nil), InstanceProtection:(*bool)(nil), SysctlParameters:[]string(nil), RollingUpdate:(*kops.RollingUpdate)(nil)}

I0511 00:55:48.082240      12 loader.go:375] Config loaded from file:  /home/terraform/cicdenv/terraform/kops/clusters/1-18a/cluster/dev/kops-admin.kubeconfig
I0511 00:55:48.086603      12 loader.go:375] Config loaded from file:  /home/terraform/cicdenv/terraform/kops/clusters/1-18a/cluster/dev/kops-admin.kubeconfig
I0511 00:55:48.089139      12 round_trippers.go:423] curl -k -v -XGET  -H "User-Agent: kops/v0.0.0 (linux/amd64) kubernetes/$Format" -H "Accept: application/json, */*" -H "Authorization: Basic YWRtaW46aGlJNUZRc21hU0p6eXRIYmUxSUt1elp0QTVGcVUybkk=" 'https://api.1-18a-dev.kops.cicdenv.com/api/v1/nodes'
I0511 00:55:48.179507      12 round_trippers.go:443] GET https://api.1-18a-dev.kops.cicdenv.com/api/v1/nodes 200 OK in 90 milliseconds
I0511 00:55:48.179688      12 round_trippers.go:449] Response Headers:
I0511 00:55:48.179849      12 round_trippers.go:452]     Audit-Id: d939ecdb-f5cc-425f-a481-0889fd16a0f6
I0511 00:55:48.179927      12 round_trippers.go:452]     Content-Type: application/json
I0511 00:55:48.180006      12 round_trippers.go:452]     Date: Mon, 11 May 2020 07:55:48 GMT
...
I0511 00:55:48.231822      12 aws_cloud.go:538] Listing all Autoscaling groups matching cluster tags
I0511 00:55:48.232089      12 request_logger.go:45] AWS request: autoscaling/DescribeTags
...
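
With -v 10 the output is long, so it can help to narrow it to the source files of interest. The sketch below assumes the usual klog header layout seen in the lines above (severity letter, mmdd date, time with microseconds, process/thread ID, then `file:line]` and the message). `filter_by_source` is an illustrative helper, not something kops provides.

```python
import re

# Matches a klog header such as:
#   I0511 00:55:48.231822      12 aws_cloud.go:538] Listing all ...
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<pid>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)


def filter_by_source(log_text, source_files):
    """Return (file, line, message) for log lines emitted by the given files.

    Lines that do not look like klog output (for example "...") are skipped.
    """
    wanted = set(source_files)
    hits = []
    for raw in log_text.splitlines():
        match = KLOG_LINE.match(raw)
        if match and match.group("file") in wanted:
            hits.append(
                (match.group("file"), int(match.group("line")), match.group("message"))
            )
    return hits


# Three lines copied from the debug output above, plus an elision marker.
SAMPLE_LOG = """\
I0511 00:55:48.179507      12 round_trippers.go:443] GET https://api.1-18a-dev.kops.cicdenv.com/api/v1/nodes 200 OK in 90 milliseconds
I0511 00:55:48.231822      12 aws_cloud.go:538] Listing all Autoscaling groups matching cluster tags
I0511 00:55:48.232089      12 request_logger.go:45] AWS request: autoscaling/DescribeTags
...
"""

if __name__ == "__main__":
    for file_name, line_no, message in filter_by_source(
        SAMPLE_LOG, ["aws_cloud.go", "request_logger.go"]
    ):
        print(f"{file_name}:{line_no}  {message}")
```

Filtering on `aws_cloud.go` and `request_logger.go` keeps the autoscaling-related lines and drops the API round-trip noise.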

@fred-vogt
Author

This is 100% reproducible for this cluster.
Going to spin it down for now.

I'll have to create a custom kops build with extra logging to determine what is going on.

@johngmyers
Member

Now that #9118 has merged, it would be helpful if you could try a build off current master to verify that you're now getting the new validation failures. If so, the question then becomes why FindAutoscalingGroups isn't returning any ASGs.
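
To illustrate the hypothesis in this comment, here is a toy model, not kops code, of a validator that only looks at nodes by walking the autoscaling groups returned from the cloud. If that lookup returns nothing, the node table stays empty and no failures are recorded, so the result still reads as "ready". All names here (`CloudGroup`, `ValidationResult`, `validate`, the instance ID `i-0abc`) are hypothetical, and the `check_missing_groups` flag is only a stand-in for stricter failure reporting; it is not a description of what #9118 changed.

```python
from dataclasses import dataclass, field


@dataclass
class CloudGroup:
    """Hypothetical stand-in for one autoscaling group seen via the cloud API."""
    name: str
    instance_ids: list


@dataclass
class ValidationResult:
    nodes: list = field(default_factory=list)     # rows for the NODE STATUS table
    failures: list = field(default_factory=list)  # messages that fail validation


def validate(expected_groups, cloud_groups, registered_nodes, check_missing_groups):
    """Toy model: nodes are only examined by walking the cloud groups.

    registered_nodes maps an instance ID to (node name, ready flag).
    """
    result = ValidationResult()
    found = {group.name: group for group in cloud_groups}
    for name in expected_groups:
        group = found.get(name)
        if group is None:
            # Without this check a missing group is silently skipped.
            if check_missing_groups:
                result.failures.append(f"instance group {name!r} not found in cloud")
            continue
        for instance_id in group.instance_ids:
            node = registered_nodes.get(instance_id)
            if node is None:
                result.failures.append(f"instance {instance_id!r} has not joined the cluster")
                continue
            node_name, ready = node
            result.nodes.append((node_name, ready))
            if not ready:
                result.failures.append(f"node {node_name!r} is not ready")
    return result


if __name__ == "__main__":
    expected = ["nodes-us-west-2a"]
    nodes = {"i-0abc": ("ip-10-16-41-207.us-west-2.compute.internal", True)}
    lenient = validate(expected, [], nodes, check_missing_groups=False)
    print("nodes:", lenient.nodes, "failures:", lenient.failures)
    strict = validate(expected, [], nodes, check_missing_groups=True)
    print("nodes:", strict.nodes, "failures:", strict.failures)
```

In the lenient case both lists come back empty, which matches the symptom reported here: an empty table alongside a "ready" verdict. Whether kops actually behaved this way is exactly what testing with a build of master would confirm.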

@fred-vogt
Author

@johngmyers - Thanks. I'll test with master tomorrow after walking the dog.

@fred-vogt
Author

Hm... today I turned the cluster back on and now the node status table populates:

kops version
Version 1.18.0-alpha.3 (git-27aab12b2)

kops validate cluster --name=... --state=s3://...
Validating cluster ...

INSTANCE GROUPS
NAME			ROLE	MACHINETYPE	MIN	MAX	SUBNETS
master-us-west-2a	Master	c5d.large	1	1	us-west-2a
master-us-west-2b	Master	c5d.large	1	1	us-west-2b
master-us-west-2c	Master	c5d.large	1	1	us-west-2c
nodes-us-west-2a	Node	r5dn.xlarge	1	1	us-west-2a
nodes-us-west-2b	Node	r5dn.xlarge	1	1	us-west-2b
nodes-us-west-2c	Node	r5dn.xlarge	1	1	us-west-2c

NODE STATUS
NAME						ROLE	READY
ip-10-16-104-142.us-west-2.compute.internal	master	True
ip-10-16-110-69.us-west-2.compute.internal	node	True
ip-10-16-36-160.us-west-2.compute.internal	master	True
ip-10-16-41-207.us-west-2.compute.internal	node	True
ip-10-16-67-175.us-west-2.compute.internal	master	True
ip-10-16-84-184.us-west-2.compute.internal	node	True

Your cluster ... is ready

@johngmyers - I have a master build ready to test with your updates as well.

This might have something to do with using an "admin" kubeconfig or a "user" kubeconfig (aws-iam-authenticator).
I'll try creating some new clusters and update this again.

@fred-vogt
Author

@johngmyers - closing for now. I can't reproduce this anymore. If I see it again I'll test with a build of master.
