static pods not visible in kubectl get pods #9937

Closed
zetaab opened this issue Sep 15, 2020 · 4 comments · Fixed by #9941
Labels
kind/bug: Categorizes issue or PR as related to a bug.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@zetaab
Member

zetaab commented Sep 15, 2020

1. What kops version are you running? The command kops version will display this information.

master

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

1.18.3 (also tried 1.19.1, but the pods are still not visible)

3. What cloud provider are you using?

openstack / aws

4. What commands did you run? What is the simplest way to reproduce this issue?

I am trying to upgrade clusters from 1.18.8 to 1.19.1 using the latest kops master. After terminating each master one by one, I can no longer see the Kubernetes critical components in kubectl get pods -n kube-system.

5. What happened after the commands executed?

% kubectl get pods
NAME                                        READY   STATUS    RESTARTS   AGE
audit-webhook-deployment-2jsrh              1/1     Running   0          49m
audit-webhook-deployment-9wl6z              1/1     Running   0          10m
audit-webhook-deployment-rjjg9              1/1     Running   0          22m
calico-kube-controllers-5495b6bf54-flj4l    1/1     Running   0          14h
calico-node-f9b9g                           1/1     Running   0          14h
calico-node-hv6gd                           1/1     Running   0          11m
calico-node-n8frz                           1/1     Running   0          49m
calico-node-sfjpf                           1/1     Running   0          23m
calico-node-v4vtb                           1/1     Running   0          14h
calico-node-wf56h                           1/1     Running   0          14h
coredns-579dc57f59-54gf4                    1/1     Running   0          14h
coredns-579dc57f59-jd4hz                    1/1     Running   0          14h
coredns-autoscaler-6cc7676775-h2n2m         1/1     Running   0          14h
csi-cinder-controllerplugin-0               5/5     Running   0          14h
csi-cinder-nodeplugin-5p2tb                 2/2     Running   0          14h
csi-cinder-nodeplugin-89qdz                 2/2     Running   0          14h
csi-cinder-nodeplugin-c778f                 2/2     Running   0          23m
csi-cinder-nodeplugin-dk4c6                 2/2     Running   0          14h
csi-cinder-nodeplugin-qm8nn                 2/2     Running   0          49m
csi-cinder-nodeplugin-xmwnw                 2/2     Running   0          11m
dns-controller-5bf6f8c946-psq5c             1/1     Running   0          14m
falco-audit-deployment-85f89f589-krk97      1/1     Running   0          7d
kops-autoscaler-openstack-c578b6fb5-b86hp   1/1     Running   4          14h
kops-controller-hxz67                       1/1     Running   1          49m
kops-controller-qhg8p                       1/1     Running   0          22m
kops-controller-v2xdd                       1/1     Running   0          10m
kube-proxy-nodes-z1-1-rofa1-k8s-local       1/1     Running   0          23d
kube-proxy-nodes-z2-1-rofa1-k8s-local       1/1     Running   0          7d
kube-proxy-nodes-z3-1-rofa1-k8s-local       1/1     Running   0          5d
modsec-audit-deployment-c4fdc5b5-djxs8      1/1     Running   0          14h
namespace-controller-6646b86568-s8qgx       1/1     Running   0          7d
networkpolicy-controller-6bc596f89d-gw4nl   1/1     Running   0          5d
openstack-cloud-provider-htzkn              1/1     Running   0          23m
openstack-cloud-provider-tdnj7              1/1     Running   0          11m
openstack-cloud-provider-zn7cn              1/1     Running   1          49m

If I try to execute a rolling update, the kops cluster validation fails:

0915 07:44:37.374604      93 instancegroups.go:442] Cluster did not pass validation, will retry in "30s": master "master-zone-3-1-1-rofa1-k8s-local" is missing kube-apiserver pod, master "master-zone-3-1-1-rofa1-k8s-local" is missing kube-controller-manager pod, master "master-zone-3-1-1-rofa1-k8s-local" is missing kube-scheduler pod, master "master-zone-1-1-1-rofa1-k8s-local" is missing kube-apiserver pod, master "master-zone-1-1-1-rofa1-k8s-local" is missing kube-controller-manager pod, master "master-zone-1-1-1-rofa1-k8s-local" is missing kube-scheduler pod, master "master-zone-2-1-1-rofa1-k8s-local" is missing kube-apiserver pod, master "master-zone-2-1-1-rofa1-k8s-local" is missing kube-controller-manager pod, master "master-zone-2-1-1-rofa1-k8s-local" is missing kube-scheduler pod.

To me it looks like all of the pods defined in /etc/kubernetes/manifests on each master are missing.
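
For reference, a quick way to confirm the discrepancy (a sketch; the node name is taken from the validation output above, adjust for your cluster):

# on the master: the static pod manifests exist on disk
ls /etc/kubernetes/manifests

# from a workstation: no mirror pods are registered for that node in the API
kubectl get pods -n kube-system --field-selector spec.nodeName=master-zone-1-1-1-rofa1-k8s-local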

6. What did you expect to happen?

I expect all of these pods to be visible under kube-system.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?

@zetaab zetaab added the kind/bug Categorizes issue or PR as related to a bug. label Sep 15, 2020
@zetaab
Member Author

zetaab commented Sep 15, 2020

https://gist.github.com/zetaab/24f8d1402a11123a09fc35fe4168ec55

So the static pods are there on disk (apiserver, scheduler, controller-manager, the etcd-managers, ...), but they are not visible in the API.

We also see this happening in all new clusters that we create automatically in our e2e tests. If a cluster is created with 1.17/1.18 and upgraded to 1.19, the static pods go missing from the API.

The same thing happens on AWS as well (we run e2e tests there too).

I0915 06:53:16.351163 70 instancegroups.go:442] Cluster did not pass validation, will retry in "30s": master "ip-172-20-35-23.eu-north-1.compute.internal" is missing kube-apiserver pod, master "ip-172-20-35-23.eu-north-1.compute.internal" is missing kube-controller-manager pod, master "ip-172-20-35-23.eu-north-1.compute.internal" is missing kube-scheduler pod.
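
One way to dig further is to check the kubelet logs on an affected master for mirror pod errors (a sketch, assuming the standard kubelet systemd unit that kops sets up):

journalctl -u kubelet | grep -i "mirror pod"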

@zetaab zetaab added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Sep 15, 2020
@zetaab zetaab changed the title kube-system critical pods not visible in kubectl get pods static pods not visible in kubectl get pods Sep 15, 2020
@zetaab
Member Author

zetaab commented Sep 15, 2020

I can see this in the logs:

Sep 15 08:31:35 master-zone-1-1-1-rofa1-k8s-local kubelet[3315]: E0915 08:31:35.760761 3315 kubelet.go:1576] Failed creating a mirror pod for "etcd-manager-main-master-zone-1-1-1-rofa1-k8s-local_kube-system(7974b24d667835b08eceeb7e48c06d7c)": pods "etcd-manager-main-master-zone-1-1-1-rofa1-k8s-local" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]

So it looks like when PSP is turned on, the static pods are not visible in kubectl get pods -n kube-system.
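
For context: the kubelet can only register these mirror pods if a PodSecurityPolicy that permits their host-level settings exists and is usable by the node identities. A rough sketch of such a policy is below (illustrative only, with a made-up name; not necessarily what #9941 ends up doing):

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: kube-system-static-pods   # hypothetical name, for illustration
spec:
  privileged: true                # static pods run privileged containers
  hostNetwork: true               # and use the host network
  hostPID: true                   # and the host PID namespace
  hostPorts:
  - min: 0
    max: 65535
  volumes:
  - hostPath                      # the manifests mount host paths
  - emptyDir
  - secret
  - configMap
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny

Besides the policy itself, a ClusterRole granting the use verb on it would have to be bound to system:nodes (or the individual kubelet users) so that mirror pod creation passes PSP admission.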

@zetaab zetaab added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Sep 15, 2020
@zetaab
Member Author

zetaab commented Sep 15, 2020

Reproduction steps (the cluster can also be created on OpenStack):

~/go/bin/kops create cluster \
  --cloud aws \
  --name testing.k8s.local \
  --state ${KOPS_STATE_STORE} \
  --zones=eu-north-1a \
  --master-count=1 \
  --node-count=1

~/go/bin/kops edit cluster testing.k8s.local

add the following:

  kubeAPIServer:
    appendAdmissionPlugins:
    - PodSecurityPolicy

~/go/bin/kops update cluster testing.k8s.local --yes

then wait a few minutes
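
Once the cluster is up, the symptom should be reproducible; for example (a sketch of what to check):

kops validate cluster --name testing.k8s.local
# reports each master as missing the kube-apiserver, kube-controller-manager and kube-scheduler pods

kubectl get pods -n kube-system | grep kube-apiserver
# returns nothing, even though the manifest exists under /etc/kubernetes/manifests on the master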
