diff --git a/docs/en/observability/monitor-k8s/diagnose-k8s-bottlenecks.asciidoc b/docs/en/observability/monitor-k8s/diagnose-k8s-bottlenecks.asciidoc new file mode 100644 index 0000000000..2742199f17 --- /dev/null +++ b/docs/en/observability/monitor-k8s/diagnose-k8s-bottlenecks.asciidoc @@ -0,0 +1,10 @@ +[discrete] +== Part 6: Diagnose bottlenecks and other issues + +[Author: TBD? PM?] + +TODO: Describe how to explore a real problem by navigating +observability UIs and dashboards. This section should showcase the power of +using our observability solution (being able to correlate logs, metrics, and +traces to solve a specific, real-world problem). The section title needs to +match whatever scenario we decide to discuss. diff --git a/docs/en/observability/monitor-k8s/images/apm-app-kubernetes-filter.png b/docs/en/observability/monitor-k8s/images/apm-app-kubernetes-filter.png new file mode 100644 index 0000000000..e63fca9952 Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/apm-app-kubernetes-filter.png differ diff --git a/docs/en/observability/monitor-k8s/images/k8s-monitoring-architecture.png b/docs/en/observability/monitor-k8s/images/k8s-monitoring-architecture.png new file mode 100644 index 0000000000..85f459382b Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/k8s-monitoring-architecture.png differ diff --git a/docs/en/observability/monitor-k8s/images/k8s-overview.png b/docs/en/observability/monitor-k8s/images/k8s-overview.png new file mode 100644 index 0000000000..27ead9616e Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/k8s-overview.png differ diff --git a/docs/en/observability/monitor-k8s/images/log-stream.png b/docs/en/observability/monitor-k8s/images/log-stream.png new file mode 100644 index 0000000000..03559061c9 Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/log-stream.png differ diff --git a/docs/en/observability/monitor-k8s/images/metadata-processors.png b/docs/en/observability/monitor-k8s/images/metadata-processors.png new file mode 100644 index 0000000000..a386746c2f Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/metadata-processors.png differ diff --git a/docs/en/observability/monitor-k8s/images/metrics-explorer.png b/docs/en/observability/monitor-k8s/images/metrics-explorer.png new file mode 100644 index 0000000000..494a8802a5 Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/metrics-explorer.png differ diff --git a/docs/en/observability/monitor-k8s/images/metrics-inventory.png b/docs/en/observability/monitor-k8s/images/metrics-inventory.png new file mode 100644 index 0000000000..478888190e Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/metrics-inventory.png differ diff --git a/docs/en/observability/monitor-k8s/images/spring-apm-app-2.png b/docs/en/observability/monitor-k8s/images/spring-apm-app-2.png new file mode 100644 index 0000000000..d83e21cf79 Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/spring-apm-app-2.png differ diff --git a/docs/en/observability/monitor-k8s/images/system-overview.png b/docs/en/observability/monitor-k8s/images/system-overview.png new file mode 100644 index 0000000000..778b526ecf Binary files /dev/null and b/docs/en/observability/monitor-k8s/images/system-overview.png differ diff --git a/docs/en/observability/monitor-k8s/monitor-k8s-application-performance.asciidoc b/docs/en/observability/monitor-k8s/monitor-k8s-application-performance.asciidoc new file mode 100644 index 
0000000000..67561bd083 --- /dev/null +++ b/docs/en/observability/monitor-k8s/monitor-k8s-application-performance.asciidoc @@ -0,0 +1,221 @@ +[discrete] +[[monitor-kubernetes-application-performance]] +== Part 3: Monitor application performance + +Quickly triage and troubleshoot application performance problems with the help of Elastic +application performance monitoring (APM). + +Think of a latency spike -- APM can help you narrow the scope of your investigation to a single service. +Because you've also ingested and correlated logs and metrics, you can then link the problem to CPU and memory utilization or error log entries of a particular Kubernetes pod. + +[discrete] +=== Step 1: Set up APM Server + +Application monitoring data is streamed from your applications running in Kubernetes to APM Server, +where it is validated, processed, and transformed into {es} documents. + +There are many ways to deploy APM Server when working with Kubernetes, +but this guide assumes that you're using our hosted {ess} on {ecloud}. +If you haven't done so already, enable APM Server in the {ess-console}[{ess} console]. + +If you want to manage APM Server yourself, there are a few alternative options: + +[%collapsible] +.Expand alternatives +==== +* {eck-ref}/[Elastic Cloud on Kubernetes (ECK)] -- The Elastic recommended approach for managing +APM Server deployed with Kubernetes. +Built on the Kubernetes Operator pattern, ECK extends basic Kubernetes orchestration capabilities +to support the setup and management of APM Server on Kubernetes. + +* Deploy APM Server as a DaemonSet -- Ensure a running instance of APM Server on each node in your cluster. +Useful when all pods in a node should share a single APM Server instance. + +* Deploy APM Server as a sidecar -- For environments that should not share an APM Server, +like when directing traces from multiple applications to separate {es} clusters. + +* {apm-server-ref-v}/installing.html[Download and install APM Server] -- The classic, non-Kubernetes option. +==== + +[discrete] +=== Step 2: Save your secret token + +A {apm-server-ref-v}/secret-token.html[secret token] is used to secure communication between APM agents +and APM Server. On the {ecloud} deployment page, select *APM* and copy your APM Server secret token. +To avoid exposing the secret token, you can store it in a Kubernetes secret. For example: + +[source, cmd] +---- +kubectl create secret generic apm-secret --from-literal=ELASTIC_APM_SECRET_TOKEN=asecretpassword --namespace=kube-system <1> +---- +<1> Create the secret in the same namespace that you'll be deploying your applications in. + +If you're managing APM Server yourself, +see {apm-server-ref-v}/secret-token.html[secret token] for instructions on how to set up your secret token. + +If you are using ECK to set up APM Server, the operator automatically generates an `{APM-server-name}-apm-token` secret for you. + +[discrete] +=== Step 3: Install and configure APM Agents + +In most cases, setting up APM agents and thereby instrumenting your applications +is as easy as installing a library and adding a few lines of code. + +Select your application's language for details: + +include::{shared}/install-apm-agents-kube/widget.asciidoc[] + +[discrete] +=== Step 4: Configure Kubernetes data + +In most instances, APM agents automatically read Kubernetes data from inside the +container and send it to APM Server. +If this is not the case, or if you wish to override this data, +you can set environment variables for the agents to read. 
+These environment variables are set via the https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/#the-downward-api[Downward API]
+in your Kubernetes pod spec:
+
+[source,yml]
+----
+   # ...
+   containers:
+   - name: your-app-container
+     env:
+     # ...
+     - name: KUBERNETES_NODE_NAME
+       valueFrom:
+         fieldRef:
+           fieldPath: spec.nodeName
+     - name: KUBERNETES_POD_NAME
+       valueFrom:
+         fieldRef:
+           fieldPath: metadata.name
+     - name: KUBERNETES_NAMESPACE
+       valueFrom:
+         fieldRef:
+           fieldPath: metadata.namespace
+     - name: KUBERNETES_POD_UID
+       valueFrom:
+         fieldRef:
+           fieldPath: metadata.uid
+----
+
+The table below maps these environment variables to the APM metadata event fields:
+
+[options="header"]
+|=====
+|Environment variable |Metadata field name
+|KUBERNETES_NODE_NAME |system.kubernetes.node.name
+|KUBERNETES_POD_NAME |system.kubernetes.pod.name
+|KUBERNETES_NAMESPACE |system.kubernetes.namespace
+|KUBERNETES_POD_UID |system.kubernetes.pod.uid
+|=====
+
+[discrete]
+=== Step 5: Deploy your application
+
+APM agents are deployed with your application.
+
+[%collapsible]
+.Resource configuration file example
+====
+
+The following is a complete resource configuration file based on the previous steps.
+
+[source,yml]
+----
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: <>
+  namespace: kube-system
+  labels:
+    app: <>
+    service: <>
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: <>
+  template:
+    metadata:
+      labels:
+        app: <>
+        service: <>
+    spec:
+      dnsPolicy: ClusterFirstWithHostNet
+      volumes:
+      - name: elastic-apm-agent
+        emptyDir: {}
+      initContainers:
+      - name: elastic-java-agent
+        image: docker.elastic.co/observability/apm-agent-java:1.12.0
+        volumeMounts:
+        - mountPath: /elastic/apm/agent
+          name: elastic-apm-agent
+        command: ['cp', '-v', '/usr/agent/elastic-apm-agent.jar', '/elastic/apm/agent']
+      containers:
+      - name: <>
+        image: <>
+        volumeMounts:
+        - mountPath: /elastic/apm/agent
+          name: elastic-apm-agent
+        env:
+        - name: ELASTIC_APM_SERVER_URL
+          value: "apm-server-url-goes-here"
+        - name: ELASTIC_APM_SECRET_TOKEN
+          valueFrom:
+            secretKeyRef:
+              name: apm-secret
+              key: ELASTIC_APM_SECRET_TOKEN
+        - name: ELASTIC_APM_SERVICE_NAME
+          value: "petclinic"
+        - name: ELASTIC_APM_APPLICATION_PACKAGES
+          value: "org.springframework.samples.petclinic"
+        - name: ELASTIC_APM_ENVIRONMENT
+          value: test
+        - name: JAVA_TOOL_OPTIONS
+          value: -javaagent:/elastic/apm/agent/elastic-apm-agent.jar
+        - name: KUBERNETES_NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
+        - name: KUBERNETES_POD_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.name
+        - name: KUBERNETES_NAMESPACE
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.namespace
+        - name: KUBERNETES_POD_UID
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.uid
+----
+====
+
+[source,cmd]
+----
+kubectl apply -f demo.yml
+----
+
+[discrete]
+=== View your traces in Kibana
+
+To view your application's trace data, open Kibana and go to
+**Observability > APM**.
+
+The APM app allows you to monitor your software services and applications in real time:
+visualize detailed performance information on your services, identify and analyze errors,
+and monitor host-level and agent-specific metrics like JVM and Go runtime metrics.
+
+[role="screenshot"]
+image::images/spring-apm-app-2.png[APM app kubernetes]
+
+Having access to application-level insights with just a few clicks can drastically decrease
+the time you spend debugging errors, slow response times, and crashes.
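+
+If no data appears in the APM app, it helps to first confirm that the pod is running
+and that the agent attached to your application. A minimal sketch, assuming you
+deployed the example above to the `kube-system` namespace and named the Deployment
+`petclinic` (substitute your own names):
+
+[source,cmd]
+----
+# Confirm the pod is running and the init container completed
+kubectl --namespace=kube-system get pods
+
+# The Java agent logs a startup message; search the application logs for it
+kubectl --namespace=kube-system logs deploy/petclinic | grep -i "elastic apm"
+----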
+
+Best of all, because Kubernetes environment variables have been mapped to APM metadata events,
+you can filter your trace data by Kubernetes `namespace`, `node.name`, `pod.name`, and `pod.uid`.
+
+[role="screenshot"]
+image::images/apm-app-kubernetes-filter.png[APM app kubernetes]
diff --git a/docs/en/observability/monitor-k8s/monitor-k8s-logs.asciidoc b/docs/en/observability/monitor-k8s/monitor-k8s-logs.asciidoc
new file mode 100644
index 0000000000..2994f5b222
--- /dev/null
+++ b/docs/en/observability/monitor-k8s/monitor-k8s-logs.asciidoc
@@ -0,0 +1,357 @@
+[discrete]
+[[monitor-kubernetes-logs]]
+== Part 1: Monitor logs
+
+Collecting and analyzing logs of both Kubernetes Core components and various
+applications running on top of Kubernetes is a powerful tool for Kubernetes
+observability. Containers running within Kubernetes pods publish logs to stdout
+or stderr. These logs are written to a location known to the kubelet.
+
+To collect pod logs, all you need is {filebeat} running as a DaemonSet
+in your Kubernetes cluster. You configure {filebeat} to communicate with the
+Kubernetes API server, get the list of pods running on the current host, and
+collect the logs the pods are producing. Those logs are annotated with all the
+relevant Kubernetes metadata, such as pod ID, container name, container labels
+and annotations, and so on.
+
+[discrete]
+=== Deploy {filebeat} to collect logs
+
+To start collecting logs, deploy and run an instance of {filebeat} on each
+Kubernetes host. {filebeat} communicates with the Kubernetes API server to
+retrieve information about the pods running on the host and all the metadata
+annotations.
+
+To deploy {filebeat} to your Kubernetes cluster:
+
+[discrete]
+==== Step 1: Download the {filebeat} deployment manifest
+
+To make deployment easier, Elastic provides a YAML file that defines all the
+required deployment settings. In many cases, you can change the connection
+details and deploy with default settings to get started quickly.
+
+["source", "sh", subs="attributes"]
+----
+curl -L -O https://raw.githubusercontent.com/elastic/beats/{branch}/deploy/kubernetes/filebeat-kubernetes.yaml
+----
+
+[discrete]
+==== Step 2: Set the connection information for {es}
+
+By default {filebeat} sends events to an existing {es} deployment on `elasticsearch:9200`, if present.
+To specify a different destination, change the following settings in the
+`filebeat-kubernetes.yaml` file:
+
+[source,yaml]
+----
+env:
+- name: ELASTICSEARCH_HOST
+  value: elasticsearch
+- name: ELASTICSEARCH_PORT
+  value: "9200"
+- name: ELASTICSEARCH_USERNAME
+  value: elastic <1>
+- name: ELASTICSEARCH_PASSWORD
+  value: changeme
+- name: ELASTIC_CLOUD_ID <2>
+  value:
+- name: ELASTIC_CLOUD_AUTH <2>
+  value:
+----
+<1> This user must have the privileges required to publish events to {es}. For
+more information, see {filebeat-ref}/feature-roles.html[Grant users access to secured resources].
+<2> Use the cloud settings if you're sending data to {ess} on {ecloud}.
+
+To avoid exposing sensitive data, you can store the password in a Kubernetes
+secret instead. Note that `kubectl` expects the plain-text value and base64
+encodes it when the secret is stored. For example:
+
+["source", "sh", subs="attributes"]
+------------------------------------------------
+$ kubectl create secret generic es-secret --from-literal=password='changeme' --namespace=kube-system <1>
+------------------------------------------------
+<1> Create the secret in the namespace where you will deploy {filebeat}.
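+
+To confirm that the secret holds the value you expect, you can read it back and
+decode it. This optional sanity check uses only standard `kubectl` output:
+
+[source,shell]
+------------------------------------------------
+kubectl --namespace=kube-system get secret es-secret -o jsonpath='{.data.password}' | base64 --decode
+------------------------------------------------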
+
+To use the secret, change the `env` setting in the manifest file:
+
+[source,yaml]
+------------------------------------------------
+env:
+- name: ELASTICSEARCH_PASSWORD
+  valueFrom:
+    secretKeyRef:
+      name: es-secret
+      key: password
+------------------------------------------------
+
+[discrete]
+==== Step 3: Collect container logs
+
+To collect container logs, each {filebeat} instance needs access to the local
+log path, which is actually a log directory mounted from the host. The
+default deployment manifest already contains the configuration to do this:
+
+[source,yaml]
+------------------------------------------------
+filebeat.inputs:
+- type: container
+  paths:
+    - /var/log/containers/*.log
+------------------------------------------------
+
+With this configuration, {filebeat} can collect logs from all the files that
+exist under the `/var/log/containers/` directory.
+
+This configuration assumes you know what kinds of components are running in a
+pod and where the container logs are stored. Later you'll learn how to use
+autodiscovery to <>.
+
+[discrete]
+==== Step 4: Add metadata to events
+
+{filebeat} provides processors that you can use in your configuration to enrich
+events with metadata coming from Docker, Kubernetes, hosts, and cloud providers.
+
+The `add_kubernetes_metadata` processor is already specified in the default
+configuration. This processor adds Kubernetes and container-related metadata to
+the logs:
+
+[source,yaml]
+------------------------------------------------
+filebeat.inputs:
+- type: container
+  paths:
+    - /var/log/containers/*.log
+  processors:
+    - add_kubernetes_metadata:
+        host: ${NODE_NAME}
+        matchers:
+        - logs_path:
+            logs_path: "/var/log/containers/"
+------------------------------------------------
+
+[discrete]
+[[autodiscover-containers]]
+==== Step 5: Automatically discover container logs (use autodiscovery)
+
+In the previous steps, you learned how to collect container logs and enrich them
+with metadata. However, you can take it further by leveraging the autodiscovery
+mechanism in {filebeat}. With autodiscovery, {filebeat} can automatically
+discover what kind of components are running in a pod and apply the logging
+modules needed to capture logs for those components.
+
+NOTE: If you decide to use autodiscovery, make sure you remove or comment
+out the `filebeat.inputs` configuration.
+
+To configure autodiscovery, you can use static templates. For example, the
+following template configures {filebeat} to collect NGINX logs from any
+pod labeled as `app: nginx`.
+
+[source,yaml]
+------------------------------------------------
+filebeat.autodiscover:
+  providers:
+    - type: kubernetes
+      node: ${NODE_NAME}
+      templates:
+        - condition:
+            equals:
+              kubernetes.labels.app: "nginx"
+          config:
+            - module: nginx
+              fileset.stdout: access
+              fileset.stderr: error
+------------------------------------------------
+
+This is good, but requires advanced knowledge of the workloads running in
+Kubernetes. Each time you want to monitor something new, you'll need to
+reconfigure and restart {filebeat}.
+To avoid this, you can use hints-based autodiscovery:
+
+[source,yaml]
+------------------------------------------------
+filebeat.autodiscover:
+  providers:
+    - type: kubernetes
+      node: ${NODE_NAME}
+      hints.enabled: true
+      hints.default_config:
+        type: container
+        paths:
+          - /var/log/containers/*${data.kubernetes.container.id}.log
+------------------------------------------------
+
+Then annotate the pods accordingly:
+
+[source,yaml]
+------------------------------------------------
+apiVersion: v1
+kind: Pod
+metadata:
+  name: nginx-autodiscover
+  annotations:
+    co.elastic.logs/module: nginx
+    co.elastic.logs/fileset.stdout: access
+    co.elastic.logs/fileset.stderr: error
+------------------------------------------------
+
+With this setup, {filebeat} identifies the NGINX app and starts collecting its
+logs by using the `nginx` module.
+
+[discrete]
+==== Step 6: (optional) Drop unwanted events
+
+You can add processors to your configuration to drop unwanted
+events. For example:
+
+[source,yaml]
+------------------------------------------------
+processors:
+- drop_event:
+    when:
+      equals:
+        kubernetes.container.name: "metricbeat"
+------------------------------------------------
+
+[discrete]
+==== Step 7: Enrich events with cloud metadata and host metadata
+
+You can also enrich events with cloud and host metadata by specifying the
+`add_cloud_metadata` and `add_host_metadata` processors. These processors are
+already specified in the default configuration:
+
+[source,yaml]
+------------------------------------------------
+processors:
+- add_cloud_metadata:
+- add_host_metadata:
+------------------------------------------------
+
+[discrete]
+==== Step 8: Deploy {filebeat} as a DaemonSet on Kubernetes
+
+. If you're running {filebeat} on master nodes, check to see if the nodes use
+https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/[taints].
+Taints limit the workloads that can run on master nodes. If necessary, update
+the DaemonSet spec to include tolerations:
++
+[source,yaml]
+------------------------------------------------
+spec:
+  tolerations:
+  - key: node-role.kubernetes.io/master
+    effect: NoSchedule
+------------------------------------------------
+
+. Deploy {filebeat} to Kubernetes:
++
+["source", "sh", subs="attributes"]
+------------------------------------------------
+kubectl create -f filebeat-kubernetes.yaml
+------------------------------------------------
++
+To check the status, run:
++
+["source", "sh", subs="attributes"]
+------------------------------------------------
+$ kubectl --namespace=kube-system get ds/filebeat
+
+NAME       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE-SELECTOR   AGE
+filebeat   32        32        0       32           0           <none>          1m
+------------------------------------------------
++
+Log events should start flowing to {es}.
+
+[discrete]
+==== Red Hat OpenShift configuration
+
+If you're using Red Hat OpenShift, you need to specify additional settings in
+the manifest file and enable the container to run as privileged.
+
+// Begin collapsed section
+
+[%collapsible]
+.Click to see more
+====
+. Modify the `DaemonSet` container spec in the manifest file:
++
+[source,yaml]
+-----
+  securityContext:
+    runAsUser: 0
+    privileged: true
+-----
+
+. Grant the `filebeat` service account access to the privileged SCC:
++
+[source,shell]
+-----
+oc adm policy add-scc-to-user privileged system:serviceaccount:kube-system:filebeat
+-----
++
+This command enables the container to be privileged as an administrator for
+OpenShift.
+
+.
Override the default node selector for the `kube-system` namespace (or your +custom namespace) to allow for scheduling on any node: ++ +[source,shell] +---- +oc patch namespace kube-system -p \ +'{"metadata": {"annotations": {"openshift.io/node-selector": ""}}}' +---- ++ +This command sets the node selector for the project to an empty string. If you +don't run this command, the default node selector will skip master nodes. + +==== +// End collapsed section + +[discrete] +=== View logs in {kib} + +To view the log data collected by {filebeat}, open {kib} and go to +**Observability > Logs**. + +The https://www.elastic.co/log-monitoring[Logs app] in {kib} allows you to +search, filter, and tail all the logs collected into the {stack}. Instead of +having to ssh into different servers and tail individual files, all the logs are +available in one tool under the Logs app. + +[role="screenshot"] +image::images/log-stream.png[Logs app streaming messages collected by {filebeat}] + +Explore the Logs app: + +* Enter a keyword or text string in the search field to filter logs. +* Use the time picker or timeline view on the side to move forward and back in +time. +* Click **Stream live** to watch the logs update in front of you `tail -f` +style. +* Place your cursor over a log message to highlight it, then use the context +menu to view details or view the log message in context. + + +[discrete] +==== Out-of-the-box {kib} dashboards + +{filebeat} ships with a variety of pre-built {kib} dashboards that you can +use to visualize logs from Kubernetes Core components and applications running +on top of Kubernetes. If these dashboards are not already loaded into {kib}, you +must run the {filebeat} setup job. + +TIP: To run the setup job, install {filebeat} on any system that can connect to +the {stack}, enable the modules for the datasets you want to monitor, then run +the `setup` command. To learn how, see the +{filebeat-ref}/filebeat-installation-configuration.html[{filebeat} quick start]. + + +After loading the dashboards, navigate to **{kib} > Dashboards** +and search for the services you want to monitor, like MySQL or NGINX. + +//TODO: Add screen capture here + +Notice that modules capture more than logs. You can also use them to capture +metrics. diff --git a/docs/en/observability/monitor-k8s/monitor-k8s-metrics.asciidoc b/docs/en/observability/monitor-k8s/monitor-k8s-metrics.asciidoc new file mode 100644 index 0000000000..0eca3de5de --- /dev/null +++ b/docs/en/observability/monitor-k8s/monitor-k8s-metrics.asciidoc @@ -0,0 +1,470 @@ +[discrete] +[[monitor-kubernetes-health-and-performance-metrics]] +== Part 2: Monitor health and performance metrics + +Collecting metrics about Kubernetes clusters and the workloads running on top of +them is a key aspect of Kubernetes observability. However, collecting metrics +from Kubernetes poses some challenges. You need to collect metrics about the +resources running on "physical" machines as well as the containers and pods. +Specifically, you need to monitor the health and performance of: + +* The hosts where Kubernetes components are running. Each host produces metrics +like CPU, memory, disk utilization, and disk and network I/O. + +* Kubernetes containers, which produce their own set of metrics. + +* The applications running as Kubernetes pods, such as application servers and +databases, each producing its own set of metrics. + +Instead of using multiple technologies to collect metrics, you deploy +{metricbeat} to monitor all layers of your technology stack. 
+
+[discrete]
+=== Deploy {metricbeat} to collect metrics
+
+You'll use {metricbeat} to collect metrics from pods running in your Kubernetes
+cluster as well as metrics from the Kubernetes cluster itself.
+
+{metricbeat} modules provide a quick and easy way to pick up metrics from
+various sources and ship them to {es} as {ecs-ref}/index.html[ECS]-compatible
+events, ready to be correlated with logs, uptime, and APM data.
+
+To deploy {metricbeat} to your Kubernetes cluster:
+
+[discrete]
+==== Step 1: Download the {metricbeat} deployment manifest
+
+To make deployment easier, Elastic provides a YAML file that defines all the
+required deployment settings. In many cases, you can change the connection
+details and deploy with default settings to get started quickly.
+
+["source", "sh", subs="attributes"]
+------------------------------------------------
+curl -L -O https://raw.githubusercontent.com/elastic/beats/{branch}/deploy/kubernetes/metricbeat-kubernetes.yaml
+------------------------------------------------
+
+
+[discrete]
+==== Step 2: Set the connection information for {es}
+
+By default, {metricbeat} sends events to an existing {es} deployment, if present.
+To specify a different destination, change the following parameters in the
+`metricbeat-kubernetes.yaml` file:
+
+[source,yaml]
+----
+env:
+- name: ELASTICSEARCH_HOST
+  value: elasticsearch
+- name: ELASTICSEARCH_PORT
+  value: "9200"
+- name: ELASTICSEARCH_USERNAME
+  value: elastic <1>
+- name: ELASTICSEARCH_PASSWORD
+  value: changeme
+- name: ELASTIC_CLOUD_ID <2>
+  value:
+- name: ELASTIC_CLOUD_AUTH <2>
+  value:
+----
+<1> This user must have the privileges required to publish events to {es}. For
+more information, see {metricbeat-ref}/feature-roles.html[Grant users access to secured resources].
+<2> Use the cloud settings if you're sending data to {ess} on {ecloud}.
+
+To avoid exposing sensitive data, you can store the password in a Kubernetes
+secret instead. Note that `kubectl` expects the plain-text value and base64
+encodes it when the secret is stored. For example:
+
+["source", "sh", subs="attributes"]
+------------------------------------------------
+$ kubectl create secret generic es-secret --from-literal=password='changeme' --namespace=kube-system <1>
+------------------------------------------------
+<1> Create the secret in the namespace where you will deploy {metricbeat}.
+
+To use the secret, change the `env` setting in the manifest file:
+
+[source,yaml]
+------------------------------------------------
+env:
+- name: ELASTICSEARCH_PASSWORD
+  valueFrom:
+    secretKeyRef:
+      name: es-secret
+      key: password
+------------------------------------------------
+
+[discrete]
+==== Step 3: Mount paths
+
+{metricbeat} runs on each node in the cluster as a DaemonSet pod.
+To collect system-level metrics, key paths are mounted from the host to the pod:
+
+[source,yaml]
+------------------------------------------------
+- name: proc
+  hostPath:
+    path: /proc
+- name: cgroup
+  hostPath:
+    path: /sys/fs/cgroup
+------------------------------------------------
+
+[discrete]
+==== Step 4: Collect system metrics
+
+To collect system-level metrics from the running node, configure the `system`
+module. The metricsets that you're likely to want are already enabled in the
+manifest.
+Modify the settings as required for your environment:
+
+[source,yaml]
+------------------------------------------------
+- module: system
+  period: 10s
+  metricsets:
+    - cpu
+    - load
+    - memory
+    - network
+    - process
+    - process_summary
+------------------------------------------------
+
+[discrete]
+==== Step 5: Collect metrics from each Kubernetes node
+
+Because {metricbeat} is running on each node, you can collect metrics from the
+Kubelet API. These metrics provide important information about the state of the
+Kubernetes node, pods, containers, and other resources.
+
+To collect these metrics, configure the `kubernetes` module. The metricsets that
+you're likely to want are already enabled in the manifest. Modify the settings
+as required for your environment:
+
+[source,yaml]
+------------------------------------------------
+- module: kubernetes
+  metricsets:
+    - node
+    - system
+    - pod
+    - container
+    - volume
+------------------------------------------------
+
+These metricsets collect metrics from the Kubelet API and therefore require
+access to the corresponding endpoints. Depending on the version and
+configuration of the Kubernetes nodes, the kubelet might provide a read-only
+HTTP port (typically 10255), which is used in some configuration examples. In
+recent versions, however, this endpoint requires HTTPS access (port 10250 by
+default) and token-based authentication.
+
+[discrete]
+==== Step 6: Collect Kubernetes state metrics
+
+{metricbeat} gets some metrics from
+https://github.com/kubernetes/kube-state-metrics#usage[kube-state-metrics].
+If kube-state-metrics is not already running, deploy it now. To learn how,
+see the Kubernetes deployment
+https://github.com/kubernetes/kube-state-metrics#kubernetes-deployment[docs].
+
+To collect state metrics:
+
+. Enable metricsets that begin with the `state_` prefix. The metricsets that
+you're likely to want are already enabled in the manifest.
+
+. Set the `hosts` field to point to the kube-state-metrics service within the
+cluster.
+
+Because the kube-state-metrics service provides cluster-wide metrics, there’s no
+need to fetch them per node. To use this singleton approach, {metricbeat}
+leverages a leader election method, where one pod holds a leader lock and is
+responsible for collecting cluster-wide metrics. For more information about
+leader election settings, see
+{metricbeat-ref}/configuration-autodiscover.html[Autodiscover].
+
+[source,yaml]
+------------------------------------------------
+metricbeat.autodiscover:
+  providers:
+    - type: kubernetes
+      scope: cluster
+      node: ${NODE_NAME}
+      unique: true
+      templates:
+        - config:
+            - module: kubernetes
+              hosts: ["kube-state-metrics:8080"]
+              period: 10s
+              add_metadata: true
+              metricsets:
+                - state_node
+                - state_deployment
+                - state_daemonset
+                - state_replicaset
+                - state_pod
+                - state_container
+                - state_cronjob
+                - state_resourcequota
+                - state_statefulset
+------------------------------------------------
+
+NOTE: If your Kubernetes cluster contains a large number of large nodes, the pod
+that collects cluster-level metrics might face performance issues caused by
+resource limitations. In this case, avoid using the leader election strategy and
+instead run a dedicated, standalone {metricbeat} instance using a Deployment in
+addition to the DaemonSet.
+
+[discrete]
+==== Step 7: Collect application-specific metrics (use hint-based autodiscovery)
+
+{metricbeat} supports autodiscovery based on hints from the provider.
+The hints system looks for hints in Kubernetes pod annotations or Docker labels
+that have the prefix `co.elastic.metrics`. When a container starts, {metricbeat}
+checks for hints and launches the proper configuration. The hints tell
+{metricbeat} how to get metrics for the given container. To enable hint-based
+autodiscovery, set `hints.enabled: true`:
+
+[source,yaml]
+------------------------------------------------
+metricbeat.autodiscover:
+  providers:
+    - type: kubernetes
+      hints.enabled: true
+------------------------------------------------
+
+By annotating Kubernetes pods with annotations that carry the `co.elastic.metrics`
+prefix, you can signal {metricbeat} to collect metrics from those pods using the
+appropriate modules:
+
+[source,yaml]
+------------------------------------------------
+apiVersion: v1
+kind: Pod
+metadata:
+  name: nginx-autodiscover
+  annotations:
+    co.elastic.metrics/module: nginx
+    co.elastic.metrics/metricsets: stubstatus
+    co.elastic.metrics/hosts: '${data.host}:80'
+    co.elastic.metrics/period: 10s
+------------------------------------------------
+
+[discrete]
+==== Step 8: Collect metrics from Prometheus
+
+To extend your metrics collection, you can use the `prometheus` module to
+collect metrics from every application that runs on the cluster and exposes a
+Prometheus exporter. For instance, let's say that the cluster runs multiple
+applications that expose Prometheus metrics with the default Prometheus
+standards. Assuming these applications are annotated properly, you can define
+an extra autodiscovery provider to automatically identify the applications and
+start collecting exposed metrics by using the `prometheus` module:
+
+[source,yaml]
+------------------------------------------------
+metricbeat.autodiscover:
+  providers:
+    - type: kubernetes
+      include_annotations: ["prometheus.io.scrape"]
+      templates:
+        - condition:
+            contains:
+              kubernetes.annotations.prometheus.io/scrape: "true"
+          config:
+            - module: prometheus
+              metricsets: ["collector"]
+              hosts: "${data.host}:${data.port}"
+------------------------------------------------
+
+This configuration launches a `prometheus` module for all containers of pods
+annotated with `prometheus.io/scrape: "true"`.
+
+[discrete]
+==== Step 9: Add metadata to events
+
+{metricbeat} provides processors that you can use in your configuration to
+enrich events with metadata coming from Docker, Kubernetes, hosts, and cloud
+providers.
+
+The `add_cloud_metadata` and `add_host_metadata` processors are already
+specified in the default configuration:
+
+[source,yaml]
+------------------------------------------------
+processors:
+- add_cloud_metadata:
+- add_host_metadata:
+------------------------------------------------
+
+This metadata allows correlation of metrics with the hosts, Kubernetes pods,
+Docker containers, and cloud-provider infrastructure metadata and with other
+pieces of the observability puzzle, such as application performance monitoring
+data and logs.
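+
+For example, a single metric event enriched by these processors might carry fields
+like the following (an abbreviated, illustrative sketch; the exact fields depend on
+your cloud provider and cluster):
+
+[source,json]
+------------------------------------------------
+{
+  "cloud": { "provider": "gcp", "availability_zone": "us-central1-a" },
+  "host": { "name": "gke-demo-node-1", "os": { "family": "debian" } },
+  "kubernetes": { "namespace": "default", "pod": { "name": "nginx-autodiscover" } }
+}
+------------------------------------------------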
+
+[discrete]
+==== Step 10: Deploy {metricbeat} as a DaemonSet on Kubernetes
+
+To deploy {metricbeat} to Kubernetes, run:
+
+[source,shell]
+------------------------------------------------
+kubectl create -f metricbeat-kubernetes.yaml
+------------------------------------------------
+
+To check the status, run:
+
+[source,shell]
+------------------------------------------------
+$ kubectl --namespace=kube-system get ds/metricbeat
+
+NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE-SELECTOR   AGE
+metricbeat   32        32        0       32           0           <none>          1m
+------------------------------------------------
+
+Metrics should start flowing to {es}.
+
+//REVIEWERS: Can we add some guidance here for what to do when this doesn't
+//happen? How do users start to troubleshoot Beats running on k8s? Same comment
+//applies to log monitoring.
+
+[discrete]
+==== Red Hat OpenShift configuration
+
+If you're using Red Hat OpenShift, you need to specify additional settings in
+the manifest file and enable the container to run as privileged.
+
+// Begin collapsed section
+
+[%collapsible]
+.Click to see more
+====
+. Modify the `DaemonSet` container spec in the manifest file:
++
+[source,yaml]
+-----
+  securityContext:
+    runAsUser: 0
+    privileged: true
+-----
+
+. In the manifest file, edit the `metricbeat-daemonset-modules` ConfigMap, and
+specify the following settings under `kubernetes.yml` in the data section:
++
+[source,yaml]
+-----
+kubernetes.yml: |-
+  - module: kubernetes
+    metricsets:
+      - node
+      - system
+      - pod
+      - container
+      - volume
+    period: 10s
+    host: ${NODE_NAME}
+    hosts: ["https://${NODE_NAME}:10250"]
+    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
+    ssl.certificate_authorities:
+      - /path/to/kubelet-service-ca.crt
+-----
++
+[NOTE]
+=========================
+`kubelet-service-ca.crt` can be any CA bundle that contains the issuer of
+the certificate used in the Kubelet API. Depending on the specific installation
+of OpenShift, it can be found either in secrets or in configmaps. In some
+installations, it may be available as part of the service account secret, in
+`/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt`. If you're using
+the
+https://github.com/openshift/installer/blob/master/docs/user/gcp/install.md[OpenShift
+installer] for GCP, then the following configmap can be mounted in the {metricbeat}
+pod, and `ca-bundle.crt` can be used in `ssl.certificate_authorities`:
+
+[source,yaml]
+-----
+  Name:         kubelet-serving-ca
+  Namespace:    openshift-kube-apiserver
+  Labels:
+  Annotations:
+
+  Data
+  ====
+  ca-bundle.crt:
+-----
+=========================
+
+. Under the `metricbeat` ClusterRole, add the following resources:
++
+[source,yaml]
+-----
+- nodes/metrics
+- nodes/stats
+-----
+
+. Grant the `metricbeat` service account access to the privileged SCC:
++
+[source,shell]
+-----
+oc adm policy add-scc-to-user privileged system:serviceaccount:kube-system:metricbeat
+-----
++
+This command enables the container to be privileged as an administrator for
+OpenShift.
+
+. Override the default node selector for the `kube-system` namespace (or your
+custom namespace) to allow for scheduling on any node:
++
+[source,shell]
+----
+oc patch namespace kube-system -p \
+'{"metadata": {"annotations": {"openshift.io/node-selector": ""}}}'
+----
++
+This command sets the node selector for the project to an empty string. If you
+don't run this command, the default node selector will skip master nodes.
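++
+After patching the namespace, you can verify that the {metricbeat} pods get
+scheduled on every node, including masters. A quick check, assuming the
+`k8s-app: metricbeat` label from the default manifest:
++
+[source,shell]
+-----
+kubectl --namespace=kube-system get pods -l k8s-app=metricbeat -o wide
+-----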
+ +==== +// End collapsed section + +[discrete] +=== View performance and health metrics + +To view the performance and health metrics collected by {metricbeat}, open +{kib} and go to **Observability > Metrics**. + +On the **Inventory** page, you can switch between different views to see an +overview of the containers and pods running on Kubernetes: + +[role="screenshot"] +image::images/metrics-inventory.png[Inventory page that shows Kubernetes pods] + +On the **Metrics Explorer** page, you can group and analyze metrics for the +resources that you are monitoring. + +[role="screenshot"] +image::images/metrics-explorer.png[Metrics dashboard that shows CPU usage for Kubernetes pods] + +Notice how everywhere you go in {kib}, there is a search bar that allows you to, +you know, search for things. It’s a great way to filter views and zoom into +things when you're looking for that needle in a haystack. + +[discrete] +==== Out-of-the-box {kib} dashboards + +{metricbeat} ships with a variety of pre-built {kib} dashboards that you can +use to visualize metrics about your Kubernetes environment. If these dashboards +are not already loaded into {kib}, you must run the {metricbeat} setup job. + +TIP: To run the setup job, install {metricbeat} on any system that can connect to +the {stack}, enable the modules for the metricsets you want to monitor, then run +the `setup` command. To learn how, see the +{metricbeat-ref}/metricbeat-installation-configuration.html[{metricbeat} quick start]. + +On the Kubernetes overview dashboard, you can see an overview of all the nodes, +deployments, and pods running on your Kubernetes cluster: + +[role="screenshot"] +image::images/k8s-overview.png[Kubernetes overview dashboard] + +You can use these dashboards as they are, or as a starting point for custom +dashboards tailored to your needs. diff --git a/docs/en/observability/monitor-k8s/monitor-k8s-network.asciidoc b/docs/en/observability/monitor-k8s/monitor-k8s-network.asciidoc new file mode 100644 index 0000000000..377be9606a --- /dev/null +++ b/docs/en/observability/monitor-k8s/monitor-k8s-network.asciidoc @@ -0,0 +1,17 @@ +[discrete] +[[monitor-kubernetes-network-traffic]] +== Part 5: Monitor internal network traffic data + +[Author: TBD] + +TODO: provide a brief intro. + +[discrete] +=== Deploy {packetbeat} to capture network traffic data + +TODO: Describe how to configure and deploy Packetbeat as a daemonset. + +[discrete] +=== View network traffic data + +TODO: Describe how to use the pre-built dashboards/visualizations. diff --git a/docs/en/observability/monitor-k8s/monitor-k8s-overview.asciidoc b/docs/en/observability/monitor-k8s/monitor-k8s-overview.asciidoc new file mode 100644 index 0000000000..50869c8d99 --- /dev/null +++ b/docs/en/observability/monitor-k8s/monitor-k8s-overview.asciidoc @@ -0,0 +1,78 @@ +[discrete] +[[kubernetes-monitoring-architecture]] +== Monitoring architecture + +The {stack} provides 4 main components for monitoring Kubernetes: + +1. Lightweight agents, called {beats}, to collect observability data. Some +{beats} include pre-configured data collection modules to ease the collection +and parsing of data for common applications such as Apache, MySQL, and Kafka. + +2. APM (described later) to monitor, detect, and diagnose complex application +performance issues. + +3. {es} for storing and searching your data. + +4. {observability} apps in {kib} for visualizing and managing your observability +data. 
+ +image::images/k8s-monitoring-architecture.png[Kubernetes monitoring architecture] + + +{beats} agents are deployed to Kubernetes as DaemonSets. This deployment +architecture ensures that the agents are available to capture both system and +application-level observability data. + +**{filebeat}**: Collects logs from pods, containers, and applications running on +Kubernetes. + +{filebeat} communicates with the Kubernetes API server to retrieve information +about the pods running on the host, all the metadata annotations, and the +location of the log files. + +When autodiscovery is configured, {filebeat} automatically discovers what +kind of components are running in a pod and applies the logging modules needed +to capture logs for those components. + +**{metricbeat}**: Collects and preprocesses system and service metrics, such as +information about running processes, as well as CPU, memory, disk, and network +utilization numbers. + +Because {metricbeat} runs on each node, it can collect metrics from the Kubelet +API. These metrics provide important information about the state of the +Kubernetes nodes, pods, containers, and other resources. + +For cluster-wide metrics, {metricbeat} accesses the kube-state-metrics +service directly or gets metrics scraped by Prometheus. + +When hints-based autodiscovery is configured, {metricbeat} looks for hints +in Kubernetes pod annotations or Docker labels and launches the proper +configuration to collect application metrics. + + +**Other {beats} (not shown)**: Collect and process other types of data, such as +Uptime data and network traffic. + +[discrete] +[[beats-metadata]] +=== Metadata + +All {beats} agents provide processors for adding metadata to events. The +metadata is valuable for grouping and exploring related data. For example, when +you're analyzing container logs, you want to know the host and container name, +and you want to be able to correlate logs, metrics, and traces. + +The default deployments include processors, when needed, for enriching events +with cloud, Kubernetes, and host metadata. + +image::images/metadata-processors.png[Metadata processors for cloud, Kubernetes, and host metadata] + +Now that you have a basic understanding of the monitoring architecture, let's +learn how to deploy monitoring to your Kubernetes environment. + +[discrete] +== Before you begin + +To monitor Kubernetes, you need {es} for storing and searching your +observability data, and {kib} for visualizing and managing it. For more +information, see <>. diff --git a/docs/en/observability/monitor-k8s/monitor-k8s-uptime.asciidoc b/docs/en/observability/monitor-k8s/monitor-k8s-uptime.asciidoc new file mode 100644 index 0000000000..261b77781a --- /dev/null +++ b/docs/en/observability/monitor-k8s/monitor-k8s-uptime.asciidoc @@ -0,0 +1,17 @@ +[discrete] +[[monitor-kubernetes-uptime]] +== Part 4: Monitor uptime and availability data + +[Author: TBD] + +TODO: Provide a brief intro. + +[discrete] +=== Deploy {heartbeat} to collect uptime and availability data + +TODO: Describe how to configure and deploy Heartbeat as a daemonset. + +[discrete] +=== View uptime and availability in {kib} + +TODO: Describe how to use the Uptime app and pre-built dashboards/visualizations. 
diff --git a/docs/en/observability/monitor-k8s/monitor-k8s.asciidoc b/docs/en/observability/monitor-k8s/monitor-k8s.asciidoc
new file mode 100644
index 0000000000..b6757cdb9d
--- /dev/null
+++ b/docs/en/observability/monitor-k8s/monitor-k8s.asciidoc
@@ -0,0 +1,73 @@
+[[monitor-kubernetes]]
+= Monitor Kubernetes: Observe the health and performance of your Kubernetes deployments
+++++
+Monitor Kubernetes
+++++
+
+Applications running in a containerized environment like Kubernetes pose a
+unique monitoring challenge: how do you diagnose and resolve issues with
+hundreds of microservices on thousands (or millions) of containers, running
+in ephemeral and disposable pods?
+
+A successful Kubernetes monitoring solution has a few requirements:
+
+* Monitors all layers of your technology stack, including:
+** The host systems where Kubernetes is running.
+** Kubernetes core components, nodes, pods, and containers running within
+the cluster.
+** All of the applications and services running in Kubernetes containers.
+
+* Automatically detects and monitors services as they appear dynamically.
+
+* Provides a way to correlate related data so that you can group and explore
+related metrics, logs, and other observability data.
+
+
+[discrete]
+== What you’ll learn
+
+This guide describes how to use Elastic {observability} to observe all layers of
+your application, including the orchestration software itself:
+
+* Collect logs and metrics from Kubernetes and your applications
+* Collect trace data from applications deployed with Kubernetes
+* Centralize the data in the {stack}
+* Explore the data in real time using tailored dashboards and {observability} UIs
+
+
+This guide describes how to deploy Elastic monitoring agents as DaemonSets using
+`kubectl` and the Beats GitHub repository manifest files. For other
+deployment options, see the Kubernetes operator and custom resource definitions
+from {eck-ref}/index.html[{ecloud} on Kubernetes (ECK)] or the
+https://github.com/elastic/helm-charts/blob/master/README.md[Helm charts].
+
+include::monitor-k8s-overview.asciidoc[]
+
+include::monitor-k8s-logs.asciidoc[]
+
+include::monitor-k8s-metrics.asciidoc[]
+
+//include::monitor-k8s-uptime.asciidoc[]
+
+//include::monitor-k8s-network.asciidoc[]
+
+include::monitor-k8s-application-performance.asciidoc[]
+
+//include::diagnose-k8s-bottlenecks.asciidoc[]
+
+[discrete]
+== What’s next
+
+* Want to protect your endpoints from security threats? Try
+https://www.elastic.co/security[{elastic-sec}]. Adding endpoint protection is
+just another integration that you add to the agent policy!
+
+* Are your eyes bleary from staring at a wall of screens?
+{observability-guide}/create-alerts.html[Create alerts] and find out about
+problems while sipping your favorite beverage poolside.
+
+* Want Elastic to do the heavy lifting? Use machine learning to
+{observability-guide}/inspect-log-anomalies.html[detect anomalies].
+
+// required for tab widgets
+include::{shared}/tab-widget-code/code.asciidoc[]
diff --git a/docs/en/observability/tutorials.asciidoc b/docs/en/observability/tutorials.asciidoc
index 2e69af8ed8..f6abd9f72b 100644
--- a/docs/en/observability/tutorials.asciidoc
+++ b/docs/en/observability/tutorials.asciidoc
@@ -7,4 +7,8 @@ collect performance data, and monitor host availability and endpoints.
 * <>.
+* <>.
+
 include::monitor-java-app.asciidoc[]
+
+include::monitor-k8s/monitor-k8s.asciidoc[leveloffset=+1]
diff --git a/docs/en/shared/install-apm-agents-kube/dotnet.asciidoc b/docs/en/shared/install-apm-agents-kube/dotnet.asciidoc
index 1d28c6ced4..0e86788960 100644
--- a/docs/en/shared/install-apm-agents-kube/dotnet.asciidoc
+++ b/docs/en/shared/install-apm-agents-kube/dotnet.asciidoc
@@ -1,35 +1,57 @@
-*Download the APM agent*
+NOTE: These instructions are for .NET Core v2.2+.
+All other use cases require downloading the agent from NuGet and adding it to your application.
+See {apm-dotnet-ref-v}/setup.html[set up the Agent] for full details.
+Once agent setup is complete, jump to the *Configure the agent* section on this page.

-Add the agent packages from https://www.nuget.org/packages?q=Elastic.apm[NuGet] to your .NET application.
-There are multiple NuGet packages available for different use cases.
+*Use an init container to download and extract the agent*

-For an ASP.NET Core application with Entity Framework Core, download the
-https://www.nuget.org/packages/Elastic.Apm.NetCoreAll[Elastic.Apm.NetCoreAll] package.
-This package will automatically add every agent component to your application.
+The .NET agent automatically instruments .NET Core version 2.2 and newer without
+any application code changes.
+To do this, you'll need an
+https://kubernetes.io/docs/concepts/workloads/pods/init-containers/[init container]
+that pulls and unzips the agent release you specify:

-To minimize the number of dependencies, you can use the
-https://www.nuget.org/packages/Elastic.Apm.AspNetCore[Elastic.Apm.AspNetCore] package for just ASP.NET Core monitoring, or the
-https://www.nuget.org/packages/Elastic.Apm.EntityFrameworkCore[Elastic.Apm.EfCore] package for just Entity Framework Core monitoring.
-
-If you only want to use the public agent API for manual instrumentation, use the
-https://www.nuget.org/packages/Elastic.Apm[Elastic.Apm] package.
-
-*Add the agent to your application*
+[source,yml]
+----
+  # ...
+  spec:
+    volumes:
+    - name: elastic-apm-agent <1>
+      emptyDir: {}
+    initContainers:
+    - name: elastic-dotnet-agent
+      image: busybox
+      command: ["/bin/sh","-c"] <2>
+      args: <3>
+        - wget -qO './elastic-apm-agent/ElasticApmAgent.zip' https://github.com/elastic/apm-agent-dotnet/releases/download/1.7.0/ElasticApmAgent_1.7.0.zip;
+          cd elastic-apm-agent;
+          cat ElasticApmAgent.zip | busybox unzip -;
+      volumeMounts:
+      - mountPath: /elastic-apm-agent
+        name: elastic-apm-agent
+----
+<1> The shared volume.
+<2> Runs a shell and executes the provided `args`.
+<3> Gets the specified `apm-agent-dotnet` release and saves it to `elastic-apm-agent/ElasticApmAgent.zip`.
+Then `cd` into the directory and unzip the file's contents. Don't forget to update the GitHub URL in this
+command with the version of the agent you'd like to use.

-For an ASP.NET Core application with the `Elastic.Apm.NetCoreAll` package,
-call the `UseAllElasticApm` method in the `Configure` method within the `Startup.cs` file:
+To connect the agent to your application, point the `DOTNET_STARTUP_HOOKS` environment
+variable toward the `ElasticApmAgentStartupHook.dll` file that now exists in the
+`/elastic-apm-agent` directory of the `elastic-apm-agent` volume.

-[source,dotnet]
+[source,yml]
 ----
-public class Startup
-{
-    public void Configure(IApplicationBuilder app, IHostingEnvironment env)
-    {
-        app.UseAllElasticApm();
-        //…rest of the method
-    }
-    //…rest of the class
-}
+  # ...
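+  # The app container below mounts the same volume the init container populated,
+  # so the startup-hook DLL from the unzipped agent is available on its filesystem.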
+ containers: + - name: your-app-container + volumeMounts: + - mountPath: /elastic-apm-agent + name: elastic-apm-agent + env: + # ... + - name: DOTNET_STARTUP_HOOKS + value: "/elastic-apm-agent/ElasticApmAgentStartupHook.dll" ---- *Configure the agent* @@ -48,10 +70,13 @@ Configure the agent using environment variables: key: ELASTIC_APM_SECRET_TOKEN <2> - name: ELASTIC_APM_SERVICE_NAME value: "service-name-goes-here" <3> + - name: DOTNET_STARTUP_HOOKS <4> + value: "/elastic-apm-agent/ElasticApmAgentStartupHook.dll" ---- <1> Defaults to `http://localhost:8200` <2> Pass in `ELASTIC_APM_SECRET_TOKEN` from the `apm-secret` keystore created previously <3> Allowed characters: a-z, A-Z, 0-9, -, _, and space +<4> Explained previously and only required when using the no-code instrumentation method. *Learn more in the agent reference* diff --git a/docs/en/shared/install-apm-agents-kube/java.asciidoc b/docs/en/shared/install-apm-agents-kube/java.asciidoc index f79aa47f4d..808c26b072 100644 --- a/docs/en/shared/install-apm-agents-kube/java.asciidoc +++ b/docs/en/shared/install-apm-agents-kube/java.asciidoc @@ -66,11 +66,14 @@ Configure the agent using environment variables: value: "service-name-goes-here" <3> - name: ELASTIC_APM_APPLICATION_PACKAGES value: "org.springframework.samples.petclinic" <4> + - name: JAVA_TOOL_OPTIONS <5> + value: -javaagent:/elastic/apm/agent/elastic-apm-agent.jar ---- <1> Defaults to `http://localhost:8200` <2> Pass in `ELASTIC_APM_SECRET_TOKEN` from the `apm-secret` keystore created previously <3> Allowed characters: a-z, A-Z, 0-9, -, _, and space <4> Used to determine whether a stack trace frame is an _in-app_ frame or a _library_ frame. +<5> Explained previously *Learn more in the agent reference*