static pods not visible in kubectl get pods #9937

Closed
zetaab opened this issue Sep 15, 2020 · 4 comments · Fixed by #9941
Labels
kind/bug: Categorizes issue or PR as related to a bug.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@zetaab
Member

zetaab commented Sep 15, 2020

1. What kops version are you running? The command kops version will display this information.

master

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

1.18.3 (also tried 1.19.1, but the pods are still not visible)

3. What cloud provider are you using?

openstack / aws

4. What commands did you run? What is the simplest way to reproduce this issue?

I am trying to upgrade clusters from 1.18.8 to 1.19.1 using the latest kops master. After terminating each master one by one, I can no longer see the Kubernetes critical components in kubectl get pods -n kube-system.

5. What happened after the commands executed?

% kubectl get pods
NAME                                        READY   STATUS    RESTARTS   AGE
audit-webhook-deployment-2jsrh              1/1     Running   0          49m
audit-webhook-deployment-9wl6z              1/1     Running   0          10m
audit-webhook-deployment-rjjg9              1/1     Running   0          22m
calico-kube-controllers-5495b6bf54-flj4l    1/1     Running   0          14h
calico-node-f9b9g                           1/1     Running   0          14h
calico-node-hv6gd                           1/1     Running   0          11m
calico-node-n8frz                           1/1     Running   0          49m
calico-node-sfjpf                           1/1     Running   0          23m
calico-node-v4vtb                           1/1     Running   0          14h
calico-node-wf56h                           1/1     Running   0          14h
coredns-579dc57f59-54gf4                    1/1     Running   0          14h
coredns-579dc57f59-jd4hz                    1/1     Running   0          14h
coredns-autoscaler-6cc7676775-h2n2m         1/1     Running   0          14h
csi-cinder-controllerplugin-0               5/5     Running   0          14h
csi-cinder-nodeplugin-5p2tb                 2/2     Running   0          14h
csi-cinder-nodeplugin-89qdz                 2/2     Running   0          14h
csi-cinder-nodeplugin-c778f                 2/2     Running   0          23m
csi-cinder-nodeplugin-dk4c6                 2/2     Running   0          14h
csi-cinder-nodeplugin-qm8nn                 2/2     Running   0          49m
csi-cinder-nodeplugin-xmwnw                 2/2     Running   0          11m
dns-controller-5bf6f8c946-psq5c             1/1     Running   0          14m
falco-audit-deployment-85f89f589-krk97      1/1     Running   0          7d
kops-autoscaler-openstack-c578b6fb5-b86hp   1/1     Running   4          14h
kops-controller-hxz67                       1/1     Running   1          49m
kops-controller-qhg8p                       1/1     Running   0          22m
kops-controller-v2xdd                       1/1     Running   0          10m
kube-proxy-nodes-z1-1-rofa1-k8s-local       1/1     Running   0          23d
kube-proxy-nodes-z2-1-rofa1-k8s-local       1/1     Running   0          7d
kube-proxy-nodes-z3-1-rofa1-k8s-local       1/1     Running   0          5d
modsec-audit-deployment-c4fdc5b5-djxs8      1/1     Running   0          14h
namespace-controller-6646b86568-s8qgx       1/1     Running   0          7d
networkpolicy-controller-6bc596f89d-gw4nl   1/1     Running   0          5d
openstack-cloud-provider-htzkn              1/1     Running   0          23m
openstack-cloud-provider-tdnj7              1/1     Running   0          11m
openstack-cloud-provider-zn7cn              1/1     Running   1          49m

If I try to execute a rolling update, the kops cluster validation fails:

0915 07:44:37.374604      93 instancegroups.go:442] Cluster did not pass validation, will retry in "30s": master "master-zone-3-1-1-rofa1-k8s-local" is missing kube-apiserver pod, master "master-zone-3-1-1-rofa1-k8s-local" is missing kube-controller-manager pod, master "master-zone-3-1-1-rofa1-k8s-local" is missing kube-scheduler pod, master "master-zone-1-1-1-rofa1-k8s-local" is missing kube-apiserver pod, master "master-zone-1-1-1-rofa1-k8s-local" is missing kube-controller-manager pod, master "master-zone-1-1-1-rofa1-k8s-local" is missing kube-scheduler pod, master "master-zone-2-1-1-rofa1-k8s-local" is missing kube-apiserver pod, master "master-zone-2-1-1-rofa1-k8s-local" is missing kube-controller-manager pod, master "master-zone-2-1-1-rofa1-k8s-local" is missing kube-scheduler pod.

To me it looks like all of the pods defined in /etc/kubernetes/manifests on each master are missing.
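
For reference, a quick way to confirm the discrepancy (a sketch; the node name is taken from the validation output above, adjust for your cluster):

# on the master: the static pod manifests exist on disk
ls /etc/kubernetes/manifests

# from a workstation: no mirror pods are registered for that node in the API
kubectl get pods -n kube-system --field-selector spec.nodeName=master-zone-1-1-1-rofa1-k8s-local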

6. What did you expect to happen?

I expect all of these pods to be visible under kube-system.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?

@zetaab zetaab added the kind/bug Categorizes issue or PR as related to a bug. label Sep 15, 2020
@zetaab
Member Author

zetaab commented Sep 15, 2020

https://gist.github.com/zetaab/24f8d1402a11123a09fc35fe4168ec55

So the static pods are there on disk (apiserver, scheduler, controller-manager, the etcd-managers, ...), but they are not visible in the API.

We also see this happening in all new clusters that we create automatically in our e2e tests. If a cluster is created with 1.17/1.18 and upgraded to 1.19, the static pods go missing from the API.

The same thing happens on AWS as well (we run e2e tests there too).

I0915 06:53:16.351163 70 instancegroups.go:442] Cluster did not pass validation, will retry in "30s": master "ip-172-20-35-23.eu-north-1.compute.internal" is missing kube-apiserver pod, master "ip-172-20-35-23.eu-north-1.compute.internal" is missing kube-controller-manager pod, master "ip-172-20-35-23.eu-north-1.compute.internal" is missing kube-scheduler pod.
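
One way to dig further is to check the kubelet logs on an affected master for mirror pod errors (a sketch, assuming the standard kubelet systemd unit that kops sets up):

journalctl -u kubelet | grep -i "mirror pod"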

@zetaab zetaab added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Sep 15, 2020
@zetaab zetaab changed the title kube-system critical pods not visible in kubectl get pods static pods not visible in kubectl get pods Sep 15, 2020
@zetaab
Member Author

zetaab commented Sep 15, 2020

I can see this in the logs:

Sep 15 08:31:35 master-zone-1-1-1-rofa1-k8s-local kubelet[3315]: E0915 08:31:35.760761 3315 kubelet.go:1576] Failed creating a mirror pod for "etcd-manager-main-master-zone-1-1-1-rofa1-k8s-local_kube-system(7974b24d667835b08eceeb7e48c06d7c)": pods "etcd-manager-main-master-zone-1-1-1-rofa1-k8s-local" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]

So it looks like when PSP is turned on, the static pods are not visible in kubectl get pods -n kube-system.
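
For context: the kubelet can only register these mirror pods if a PodSecurityPolicy that permits their host-level settings exists and is usable by the node identities. A rough sketch of such a policy is below (illustrative only, with a made-up name; not necessarily what #9941 ends up doing):

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: kube-system-static-pods   # hypothetical name, for illustration
spec:
  privileged: true                # static pods run privileged containers
  hostNetwork: true               # and use the host network
  hostPID: true                   # and the host PID namespace
  hostPorts:
  - min: 0
    max: 65535
  volumes:
  - hostPath                      # the manifests mount host paths
  - emptyDir
  - secret
  - configMap
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny

Besides the policy itself, a ClusterRole granting the use verb on it would have to be bound to system:nodes (or the individual kubelet users) so that mirror pod creation passes PSP admission.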

@zetaab zetaab added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Sep 15, 2020
@zetaab
Member Author

zetaab commented Sep 15, 2020

Reproduction steps (the cluster can also be created on OpenStack):

~/go/bin/kops create cluster \
  --cloud aws \
  --name testing.k8s.local \
  --state ${KOPS_STATE_STORE} \
  --zones=eu-north-1a \
  --master-count=1 \
  --node-count=1

~/go/bin/kops edit cluster testing.k8s.local

add the following:

  kubeAPIServer:
    appendAdmissionPlugins:
    - PodSecurityPolicy

~/go/bin/kops update cluster testing.k8s.local --yes

then wait a few minutes
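
Once the cluster is up, the symptom should be reproducible; for example (a sketch of what to check):

kops validate cluster --name testing.k8s.local
# reports each master as missing the kube-apiserver, kube-controller-manager and kube-scheduler pods

kubectl get pods -n kube-system | grep kube-apiserver
# returns nothing, even though the manifest exists under /etc/kubernetes/manifests on the master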
