Skip to content
This repository has been archived by the owner on May 16, 2023. It is now read-only.

metricbeat - kubernetes event.module gets "connection refused" accessing /stats/summary endpoint on EKS #300

Closed
robertgates55 opened this issue Sep 24, 2019 · 5 comments · Fixed by #310

Comments

@robertgates55
Copy link

Chart version:
metricbeat 7.3.2

Kubernetes version:
1.13

Kubernetes provider: E.g. GKE (Google Kubernetes Engine)
AWS EKS

Helm Version:
v2.14.3 (client & server)

helm get release output

REVISION: 3
RELEASED: Tue Sep 24 09:43:08 2019
CHART: metricbeat-7.3.2
USER-SUPPLIED VALUES:
extraEnvs:
- name: ELASTICSEARCH_USERNAME
  valueFrom:
    secretKeyRef:
      key: username
      name: elastic-credentials
- name: ELASTICSEARCH_PASSWORD
  valueFrom:
    secretKeyRef:
      key: password
      name: elastic-credentials
metricbeatConfig:
  kube-state-metrics-metricbeat.yml: |
    metricbeat.modules:
    - module: kubernetes
      enabled: true
      metricsets:
        - state_node
        - state_deployment
        - state_replicaset
        - state_pod
        - state_container
      period: 10s
      hosts: ["${KUBE_STATE_METRICS_HOSTS}"]
    output.elasticsearch:
      username: '${ELASTICSEARCH_USERNAME}'
      password: '${ELASTICSEARCH_PASSWORD}'
      protocol: https
      hosts: ["elasticsearch.qa.duco.services:443"]
  metricbeat.yml: |
    system:
      hostfs: /hostfs
    metricbeat.modules:
    - module: kubernetes
      metricsets:
        - container
        - node
        - pod
        - system
        - volume
      period: 30s
      host: "${NODE_NAME}"
      hosts: ["http://${HOSTNAME}:10250"]
      processors:
      - add_kubernetes_metadata:
          in_cluster: true
    - module: kubernetes
      enabled: true
      metricsets:
        - event
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5
        by_memory: 5
    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
    output.elasticsearch:
      username: '${ELASTICSEARCH_USERNAME}'
      password: '${ELASTICSEARCH_PASSWORD}'
      protocol: https
      hosts: ["elasticsearch.qa.duco.services:443"]

COMPUTED VALUES:
affinity: {}
extraEnvs:
- name: ELASTICSEARCH_USERNAME
  valueFrom:
    secretKeyRef:
      key: username
      name: elastic-credentials
- name: ELASTICSEARCH_PASSWORD
  valueFrom:
    secretKeyRef:
      key: password
      name: elastic-credentials
extraVolumeMounts: ""
extraVolumes: ""
fullnameOverride: ""
hostPathRoot: /var/lib
image: docker.elastic.co/beats/metricbeat
imagePullPolicy: IfNotPresent
imagePullSecrets: []
imageTag: 7.3.2
kube-state-metrics:
  affinity: {}
  collectors:
    certificatesigningrequests: true
    configmaps: true
    cronjobs: true
    daemonsets: true
    deployments: true
    endpoints: true
    horizontalpodautoscalers: true
    ingresses: true
    jobs: true
    limitranges: true
    namespaces: true
    nodes: true
    persistentvolumeclaims: true
    persistentvolumes: true
    poddisruptionbudgets: true
    pods: true
    replicasets: true
    replicationcontrollers: true
    resourcequotas: true
    secrets: true
    services: true
    statefulsets: true
  global: {}
  hostNetwork: false
  image:
    pullPolicy: IfNotPresent
    repository: quay.io/coreos/kube-state-metrics
    tag: v1.6.0
  nodeSelector: {}
  podAnnotations: {}
  podSecurityPolicy:
    annotations: {}
    enabled: false
  prometheus:
    monitor:
      additionalLabels: {}
      enabled: false
      namespace: ""
  prometheusScrape: true
  rbac:
    create: true
  replicas: 1
  securityContext:
    enabled: true
    fsGroup: 65534
    runAsUser: 65534
  service:
    loadBalancerIP: ""
    nodePort: 0
    port: 8080
    type: ClusterIP
  serviceAccount:
    create: true
    imagePullSecrets: []
  tolerations: []
livenessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
managedServiceAccount: true
metricbeatConfig:
  kube-state-metrics-metricbeat.yml: |
    metricbeat.modules:
    - module: kubernetes
      enabled: true
      metricsets:
        - state_node
        - state_deployment
        - state_replicaset
        - state_pod
        - state_container
      period: 10s
      hosts: ["${KUBE_STATE_METRICS_HOSTS}"]
    output.elasticsearch:
      username: '${ELASTICSEARCH_USERNAME}'
      password: '${ELASTICSEARCH_PASSWORD}'
      protocol: https
      hosts: ["elasticsearch.qa.duco.services:443"]
  metricbeat.yml: |
    system:
      hostfs: /hostfs
    metricbeat.modules:
    - module: kubernetes
      metricsets:
        - container
        - node
        - pod
        - system
        - volume
      period: 30s
      host: "${NODE_NAME}"
      hosts: ["https://${HOSTNAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      ssl.verification_mode: none
      processors:
      - add_kubernetes_metadata:
          in_cluster: true
    - module: kubernetes
      enabled: true
      metricsets:
        - event
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5
        by_memory: 5
    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
    output.elasticsearch:
      username: '${ELASTICSEARCH_USERNAME}'
      password: '${ELASTICSEARCH_PASSWORD}'
      protocol: https
      hosts: ["elasticsearch.qa.duco.services:443"]
nameOverride: ""
nodeSelector: {}
podAnnotations: {}
podSecurityContext:
  privileged: false
  runAsUser: 0
readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
replicas: 1
resources:
  limits:
    cpu: 1000m
    memory: 200Mi
  requests:
    cpu: 100m
    memory: 100Mi
secretMounts: []
serviceAccount: ""
terminationGracePeriod: 30
tolerations: []
updateStrategy: RollingUpdate

HOOKS:
MANIFEST:

---
# Source: metricbeat/templates/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-metricbeat-config
  labels:
    app: "metricbeat-metricbeat"
    chart: "metricbeat-7.3.2"
    heritage: "Tiller"
    release: "metricbeat"
data:
  kube-state-metrics-metricbeat.yml: |
    metricbeat.modules:
    - module: kubernetes
      enabled: true
      metricsets:
        - state_node
        - state_deployment
        - state_replicaset
        - state_pod
        - state_container
      period: 10s
      hosts: ["${KUBE_STATE_METRICS_HOSTS}"]
    output.elasticsearch:
      username: '${ELASTICSEARCH_USERNAME}'
      password: '${ELASTICSEARCH_PASSWORD}'
      protocol: https
      hosts: ["elasticsearch.qa.duco.services:443"]
    
  metricbeat.yml: |
    system:
      hostfs: /hostfs
    metricbeat.modules:
    - module: kubernetes
      metricsets:
        - container
        - node
        - pod
        - system
        - volume
      period: 30s
      host: "${NODE_NAME}"
      hosts: ["https://${HOSTNAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      ssl.verification_mode: none
      processors:
      - add_kubernetes_metadata:
          in_cluster: true
    - module: kubernetes
      enabled: true
      metricsets:
        - event
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5
        by_memory: 5
    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib)($|/)'
    output.elasticsearch:
      username: '${ELASTICSEARCH_USERNAME}'
      password: '${ELASTICSEARCH_PASSWORD}'
      protocol: https
      hosts: ["elasticsearch.qa.duco.services:443"]
---
# Source: metricbeat/charts/kube-state-metrics/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: kube-state-metrics
    chart: kube-state-metrics-1.6.0
    heritage: Tiller
    release: metricbeat
  name: metricbeat-kube-state-metrics
imagePullSecrets:
  []
---
# Source: metricbeat/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat-metricbeat
  labels:
    app: "metricbeat-metricbeat"
    chart: "metricbeat-7.3.2"
    heritage: "Tiller"
    release: "metricbeat"
---
# Source: metricbeat/charts/kube-state-metrics/templates/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  labels:
    app: kube-state-metrics
    chart: kube-state-metrics-1.6.0
    heritage: Tiller
    release: metricbeat
  name: metricbeat-kube-state-metrics
rules:

- apiGroups: ["certificates.k8s.io"]
  resources:
  - certificatesigningrequests
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["list", "watch"]

- apiGroups: ["batch"]
  resources:
  - cronjobs
  verbs: ["list", "watch"]

- apiGroups: ["extensions", "apps"]
  resources:
  - daemonsets
  verbs: ["list", "watch"]

- apiGroups: ["extensions", "apps"]
  resources:
  - deployments
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - endpoints
  verbs: ["list", "watch"]

- apiGroups: ["autoscaling"]
  resources:
  - horizontalpodautoscalers
  verbs: ["list", "watch"]

- apiGroups: ["extensions"]
  resources:
  - ingresses
  verbs: ["list", "watch"]

- apiGroups: ["batch"]
  resources:
  - jobs
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - limitranges
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - namespaces
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - nodes
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - persistentvolumeclaims
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - persistentvolumes
  verbs: ["list", "watch"]

- apiGroups: ["policy"]
  resources:
    - poddisruptionbudgets
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - pods
  verbs: ["list", "watch"]

- apiGroups: ["extensions", "apps"]
  resources:
  - replicasets
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - replicationcontrollers
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - resourcequotas
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - secrets
  verbs: ["list", "watch"]

- apiGroups: [""]
  resources:
  - services
  verbs: ["list", "watch"]

- apiGroups: ["apps"]
  resources:
  - statefulsets
  verbs: ["list", "watch"]
---
# Source: metricbeat/templates/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: metricbeat-metricbeat-cluster-role
  labels:
    app: "metricbeat-metricbeat"
    chart: "metricbeat-7.3.2"
    heritage: "Tiller"
    release: "metricbeat"
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  - events
  verbs:
  - get
  - list
  - watch
---
# Source: metricbeat/charts/kube-state-metrics/templates/clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  labels:
    app: kube-state-metrics
    chart: kube-state-metrics-1.6.0
    heritage: Tiller
    release: metricbeat
  name: metricbeat-kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metricbeat-kube-state-metrics
subjects:
- kind: ServiceAccount
  name: metricbeat-kube-state-metrics
  namespace: kube-system
---
# Source: metricbeat/templates/clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: metricbeat-metricbeat-cluster-role-binding
  labels:
    app: "metricbeat-metricbeat"
    chart: "metricbeat-7.3.2"
    heritage: "Tiller"
    release: "metricbeat"
roleRef:
  kind: ClusterRole
  name: metricbeat-metricbeat-cluster-role
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: metricbeat-metricbeat
  namespace: kube-system
---
# Source: metricbeat/charts/kube-state-metrics/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: metricbeat-kube-state-metrics
  labels:
    app: kube-state-metrics
    chart: "kube-state-metrics-1.6.0"
    release: "metricbeat"
    heritage: "Tiller"
  annotations:
    prometheus.io/scrape: 'true'
spec:
  type: "ClusterIP"
  ports:
  - name: "http"
    protocol: TCP
    port: 8080
    targetPort: 8080
  selector:
    app: kube-state-metrics
    release: metricbeat
---
# Source: metricbeat/templates/daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat-metricbeat
  labels:
    app: "metricbeat-metricbeat"
    chart: "metricbeat-7.3.2"
    heritage: "Tiller"
    release: "metricbeat"
spec:
  selector:
    matchLabels:
      app: "metricbeat-metricbeat"
      release: "metricbeat"
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      annotations:
        
        configChecksum: df66e12867fdb6d46e1b454cb7e3f91438913ba0bb9a287f77e9dbd68e32791
      name: "metricbeat-metricbeat"
      labels:
        app: "metricbeat-metricbeat"
        chart: "metricbeat-7.3.2"
        heritage: "Tiller"
        release: "metricbeat"
    spec:
      serviceAccountName: metricbeat-metricbeat
      terminationGracePeriodSeconds: 30
      volumes:
      - name: metricbeat-config
        configMap:
          defaultMode: 0600
          name: metricbeat-metricbeat-config
      - name: data
        hostPath:
          path: /var/lib/metricbeat-metricbeat-kube-system-data
          type: DirectoryOrCreate
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varrundockersock
        hostPath:
          path: /var/run/docker.sock
      containers:
      - name: "metricbeat"
        image: "docker.elastic.co/beats/metricbeat:7.3.2"
        imagePullPolicy: "IfNotPresent"
        args:
        - "-e"
        - "-E"
        - "http.enabled=true"
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              curl --fail 127.0.0.1:5066
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
          
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              metricbeat test output
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
          
        resources:
          limits:
            cpu: 1000m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
          
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: ELASTICSEARCH_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: elastic-credentials
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: elastic-credentials
        
        securityContext:
          privileged: false
          runAsUser: 0
          
        volumeMounts:
        - name: metricbeat-config
          mountPath: /usr/share/metricbeat/kube-state-metrics-metricbeat.yml
          readOnly: true
          subPath: kube-state-metrics-metricbeat.yml
        - name: metricbeat-config
          mountPath: /usr/share/metricbeat/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: data
          mountPath: /usr/share/metricbeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        # Necessary when using autodiscovery; avoid mounting it otherwise
        # See: https://www.elastic.co/guide/en/beats/metricbeat/master/configuration-autodiscover.html
        - name: varrundockersock
          mountPath: /var/run/docker.sock
          readOnly: true
---
# Source: metricbeat/charts/kube-state-metrics/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metricbeat-kube-state-metrics
  labels:
    app: kube-state-metrics
    chart: "kube-state-metrics-1.6.0"
    release: "metricbeat"
    heritage: "Tiller"
spec:
  selector:
    matchLabels:
      app: kube-state-metrics
  replicas: 1
  template:
    metadata:
      labels:
        app: kube-state-metrics
        release: "metricbeat"
    spec:
      hostNetwork: false
      serviceAccountName: metricbeat-kube-state-metrics
      securityContext:
        fsGroup: 65534
        runAsUser: 65534
      containers:
      - name: kube-state-metrics
        args:

        - --collectors=certificatesigningrequests


        - --collectors=configmaps


        - --collectors=cronjobs


        - --collectors=daemonsets


        - --collectors=deployments


        - --collectors=endpoints


        - --collectors=horizontalpodautoscalers


        - --collectors=ingresses


        - --collectors=jobs


        - --collectors=limitranges


        - --collectors=namespaces


        - --collectors=nodes


        - --collectors=persistentvolumeclaims


        - --collectors=persistentvolumes


        - --collectors=poddisruptionbudgets


        - --collectors=pods


        - --collectors=replicasets


        - --collectors=replicationcontrollers


        - --collectors=resourcequotas


        - --collectors=secrets


        - --collectors=services


        - --collectors=statefulsets


        imagePullPolicy: IfNotPresent
        image: "quay.io/coreos/kube-state-metrics:v1.6.0"
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 5
        resources:
            null
---
# Source: metricbeat/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: 'metricbeat-metricbeat-metrics'
  labels:
    app: 'metricbeat-metricbeat-metrics'
    chart: 'metricbeat-7.3.2'
    heritage: 'Tiller'
    release: 'metricbeat'
spec:
  replicas: 1
  selector:
    matchLabels:
      app: 'metricbeat-metricbeat-metrics'
      chart: 'metricbeat-7.3.2'
      heritage: 'Tiller'
      release: 'metricbeat'
  template:
    metadata:
      annotations:
        
        configChecksum: df66e12867fdb6d46e1b454cb7e3f91438913ba0bb9a287f77e9dbd68e32791
      labels:
        app: 'metricbeat-metricbeat-metrics'
        chart: 'metricbeat-7.3.2'
        heritage: 'Tiller'
        release: 'metricbeat'
    spec:
      serviceAccountName: metricbeat-metricbeat
      terminationGracePeriodSeconds: 30
      volumes:
      - name: metricbeat-config
        configMap:
          defaultMode: 0600
          name: metricbeat-metricbeat-config
      containers:
      - name: "metricbeat"
        image: "docker.elastic.co/beats/metricbeat:7.3.2"
        imagePullPolicy: "IfNotPresent"
        args:
          - "-c"
          - "/usr/share/metricbeat/kube-state-metrics-metricbeat.yml"
          - "-e"
          - "-E"
          - "http.enabled=true"
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              curl --fail 127.0.0.1:5066
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              metricbeat test output
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
          
        resources:
          limits:
            cpu: 1000m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
          
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: KUBE_STATE_METRICS_HOSTS
          value: "$(METRICBEAT_KUBE_STATE_METRICS_SERVICE_HOST):$(METRICBEAT_KUBE_STATE_METRICS_SERVICE_PORT_HTTP)"
        - name: ELASTICSEARCH_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: elastic-credentials
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: elastic-credentials
        
        securityContext:
          privileged: false
          runAsUser: 0
          
        volumeMounts:
        - name: metricbeat-config
          mountPath: /usr/share/metricbeat/kube-state-metrics-metricbeat.yml
          readOnly: true
          subPath: kube-state-metrics-metricbeat.yml
        - name: metricbeat-config
          mountPath: /usr/share/metricbeat/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml

Describe the bug:
I'm getting "connection refused" errors when the metricbeat module attempts to query from the /stats/summary endpoint.

Basic out of the box configuration is this:

    - module: kubernetes
      metricsets:
        - container
        - node
        - pod
        - system
        - volume
      period: 30s
      host: "${NODE_NAME}"
      hosts: ["http://${NODE_NAME}:10250"]
      processors:
      - add_kubernetes_metadata:
          in_cluster: true

But I've also tried various configurations/combinations

Given that the :10250 ro endpoint is marked as deprecated (and likely thus not being exposed by my EKS cluster), what is the recommended configuration to scrape these metrics?

@robertgates55
Copy link
Author

Additional info:
Changing the hosts to https rather than http like this:

    - module: kubernetes
      metricsets:
        - container
        - node
        - pod
        - system
        - volume
      period: 30s
      host: "${NODE_NAME}"
      hosts: ["https://${NODE_NAME}:10250"]
      processors:
      - add_kubernetes_metadata:
          in_cluster: true

Gives error making http request: Get https://ip-10-208-92-73.eu-west-1.compute.internal:10250/stats/summary: x509: certificate signed by unknown authority

So I updated the config to:

- module: kubernetes
          metricsets:
            - container
            - node
            - pod
            - system
            - volume
          period: 30s
          host: "${NODE_NAME}"
          hosts: ["https://${NODE_NAME}:10250"]
          ssl.verification_mode: none
          processors:
          - add_kubernetes_metadata:
              in_cluster: true

Which results in HTTP error 401 in : 401 Unauthorized

So I updated the config to:

- module: kubernetes
          metricsets:
            - container
            - node
            - pod
            - system
            - volume
          period: 30s
          host: "${NODE_NAME}"
          hosts: ["https://${NODE_NAME}:10250"]
          ssl.verification_mode: none
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          processors:
          - add_kubernetes_metadata:
              in_cluster: true

Which gives HTTP error 403 in : 403 Forbidden

@Crazybus
Copy link
Contributor

Hi @robertgates55!

Could you try the configuration suggested in elastic/beats#10937 (comment) to see if it works for you? I don't have an EKS cluster to test on (and we don't have automated tests for it).

@robertgates55
Copy link
Author

So I've tried that:

        metricbeat.modules:
        - module: kubernetes
          metricsets:
            - container
            - node
            - pod
            - system
            - volume
          period: 30s
          host: "${NODE_NAME}"
          hosts: ["https://${NODE_NAME}:10250"]
          ssl.verification_mode: none
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          ssl.certificate_authorities:
            - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

and now get:

2019-09-30T19:07:27.480Z WARN tlscommon/tls_config.go:79 SSL/TLS verifications disabled.
2019-09-30T19:07:27.829Z INFO kubernetes/watcher.go:182 kubernetes: Performing a resource sync for *v1.NodeList
2019-09-30T19:07:27.845Z ERROR kubernetes/watcher.go:185 kubernetes: Performing a resource sync err kubernetes api: Failure 403 nodes is forbidden: User "system:serviceaccount:kube-system:metricbeat-metricbeat" cannot list resource "nodes" in API group "" at the cluster scope for *v1.NodeList
2019-09-30T19:07:27.845Z WARN util/kubernetes.go:288 Error starting Kubernetes watcher: kubernetes api: Failure 403 nodes is forbidden: User "system:serviceaccount:kube-system:metricbeat-metricbeat" cannot list resource "nodes" in API group "" at the cluster scope
2019-09-30T19:07:27.845Z WARN tlscommon/tls_config.go:79 SSL/TLS verifications disabled.

So is it that the helm chart needs to grant more permissions to the serviceaccount?

@xellsys
Copy link

xellsys commented Oct 1, 2019

Hi, I believe I have encountered the same issue. I think there are some missing rights indicated in the logs:

ERROR kubernetes/watcher.go:185 kubernetes: Performing a resource sync err kubernetes api: Failure 403 deployments.apps is forbidden: User "system:serviceaccount:metricbeat:metricbeat-metricbeat" cannot list resource "deployments" in API group "apps" at the cluster scope for *v1beta1.DeploymentList
...
ERROR kubernetes/watcher.go:185 kubernetes: Performing a resource sync err kubernetes api: Failure 403 nodes is forbidden: User "system:serviceaccount:metricbeat:metricbeat-metricbeat" cannot list resource "nodes" in API group "" at the cluster scope for *v1.NodeList
...
ERROR kubernetes/watcher.go:185 kubernetes: Performing a resource sync err kubernetes api: Failure 403 replicasets.extensions is forbidden: User "system:serviceaccount:metricbeat:metricbeat-metricbeat" cannot list resource "replicasets" in API group "extensions" at the cluster scope for *v1beta1.ReplicaSetList

I have adapted the clusterrole to the following and do not receive error messages anymore:

- apiGroups:
  - "extensions"
  - "apps"
  - ""
  resources:
  - namespaces
  - pods
  - events
  - deployments
  - nodes
  - replicasets
  verbs:
  - get
  - list
  - watch

Also I do now see more info in the events, eg the key "kubernetes.node._module.labels.kubernetes_io/arch" is populated.

@Crazybus
Copy link
Contributor

Crazybus commented Oct 2, 2019

Thanks for all the information added here! There are indeed some resources missing and the rules are currently hardcoded. I just opened #310 which makes them configurable and adds more by default.

@jmlrt jmlrt closed this as completed in #310 Oct 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants