Add Kubernetes monitoring docs #151

Merged Jan 13, 2021 · 47 commits

Commits
67dbb71
Add draft outline for Kubernetes monitoring guide
dedemorton Sep 9, 2020
6d27847
Add changes from the review
dedemorton Sep 24, 2020
4a762dd
Create separate files
dedemorton Sep 24, 2020
96dd952
Add files
dedemorton Sep 24, 2020
16ea901
Make APM section owner TBD
dedemorton Sep 29, 2020
7148360
Assign "monitor logs" section to ChrsMark
dedemorton Sep 29, 2020
0de1caf
Assign metric monitoring section to ChrsMark
dedemorton Sep 29, 2020
1cb0db2
Merge branch 'master' into k8s_monitoring_docs
dedemorton Oct 8, 2020
238206d
Add logs section (#154)
ChrsMark Oct 12, 2020
521d88c
Add metrics part (#163)
ChrsMark Oct 12, 2020
da25ff8
Add discrete tag to fix build issues
dedemorton Oct 12, 2020
599b77c
Edits in progress
dedemorton Oct 15, 2020
8b464dd
Remove link to topic that no longer exists
dedemorton Oct 15, 2020
c6df8df
More edits
dedemorton Oct 27, 2020
7004542
Merge branch 'master' into k8s_monitoring_docs
dedemorton Oct 27, 2020
efda696
In-progress changes
dedemorton Oct 28, 2020
1c82a48
save changes
dedemorton Oct 28, 2020
fd5c653
Add more edits and fixes from testing
dedemorton Dec 4, 2020
90e7e69
Fix link
dedemorton Dec 4, 2020
3b8e75e
Fix broken doc build
dedemorton Dec 4, 2020
1d29ae2
Fix issues found during testing
dedemorton Dec 5, 2020
7e041c4
Add fixes from testing metrics monitoring
dedemorton Dec 8, 2020
2bbef81
Change from numbered list to sections
dedemorton Dec 8, 2020
2c76845
Add draft overview and architecture digram
dedemorton Dec 10, 2020
fb62658
Fix mistake in diagram
dedemorton Dec 10, 2020
a4c7887
Clean up icons
dedemorton Dec 10, 2020
c5ba1b4
Clean up topics
dedemorton Dec 11, 2020
8054dd1
Remove comment markers
dedemorton Dec 11, 2020
a340c8a
Move before you begin section
dedemorton Dec 11, 2020
9967a76
Add feedback from ChrsMark
dedemorton Dec 15, 2020
8b5bdbe
Update docs/en/observability/monitor-k8s/monitor-k8s-logs.asciidoc
dedemorton Dec 16, 2020
29e5269
Resolve review feedback from bmorelli25
dedemorton Dec 16, 2020
edab83c
Resolve review feedback from masci and fix para about dashboards
dedemorton Dec 16, 2020
fbe7427
Add APM to diagram
dedemorton Dec 19, 2020
54b780c
Apply suggestions from code review
dedemorton Dec 19, 2020
7f562cf
Apply suggestions from DanRoscigno
dedemorton Dec 21, 2020
255c8f8
Add changes from danroscigno's review
dedemorton Dec 22, 2020
9928e3e
docs: Add basic APM on Kubernetes steps (#301)
bmorelli25 Dec 22, 2020
cea2424
Merge branch 'master' of github.com:elastic/observability-docs into k…
bmorelli25 Dec 23, 2020
10a4ea8
docs: feedback from dan
bmorelli25 Dec 23, 2020
2b2fb98
Update docs/en/observability/monitor-k8s/monitor-k8s-application-perf…
bmorelli25 Jan 4, 2021
b975019
Update secret token based on Eyal's feedback
bmorelli25 Jan 4, 2021
aeb03fc
Update docs/en/observability/monitor-k8s/monitor-k8s-application-perf…
bmorelli25 Jan 4, 2021
81a9ed6
docs: add no-code instrumentation for dotnet
bmorelli25 Jan 5, 2021
0cf72ab
Apply additional suggestions from DanRoscigno
dedemorton Jan 12, 2021
9817904
Apply suggestions from code review
dedemorton Jan 13, 2021
dea74c3
Remove outdated todos
dedemorton Jan 13, 2021
2 changes: 1 addition & 1 deletion docs/en/observability/index.asciidoc
@@ -64,4 +64,4 @@ include::create-alerts.asciidoc[leveloffset=+1]

include::fields-reference.asciidoc[leveloffset=+1]

include::tutorials.asciidoc[]
include::tutorials.asciidoc[]
@@ -0,0 +1,10 @@
[discrete]
== Part 6: Diagnose bottlenecks and other issues

[Author: TBD? PM?]

TODO: Describe how to explore a real problem by navigating
observability UIs and dashboards. This section should showcase the power of
using our observability solution (being able to correlate logs, metrics, and
traces to solve a specific, real-world problem). The section title needs to
match whatever scenario we decide to discuss.
@@ -0,0 +1,53 @@
[discrete]
[[monitor-kubernetes-application-performance]]
== Part 5: Monitor application performance

[Author: TBD]

TODO: Describe how to use APM to monitor applications.

[discrete]
=== Set up APM Server

TODO: Describe how to set up APM server.

[discrete]
==== Through ECK

Question: Are we sure we want to cover ECK here? Can we point to the ECK docs
instead? If we try to document all the ways in all the sections, I think users
might get confused.
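
If we do keep ECK coverage here, a minimal sketch might look like the following
(this assumes the ECK operator is already installed and an Elasticsearch
resource named `elasticsearch` exists in the same namespace; the version is a
placeholder):

[source,yaml]
------------------------------------------------
apiVersion: apm.k8s.elastic.co/v1
kind: ApmServer
metadata:
  name: apm-server
spec:
  version: 7.10.0
  count: 1
  elasticsearchRef:
    name: elasticsearch  # must match an Elasticsearch resource managed by ECK
------------------------------------------------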

[discrete]
==== On cloud

TODO: Describe how to set up APM server on cloud.

[discrete]
==== Download and install

TODO: Describe how to download and install APM server from archives.
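
A likely sketch for this flow, mirroring the standard archive installation
(the version and platform below are placeholders):

["source", "sh", subs="attributes"]
------------------------------------------------
curl -L -O https://artifacts.elastic.co/downloads/apm-server/apm-server-7.10.0-linux-x86_64.tar.gz
tar xzvf apm-server-7.10.0-linux-x86_64.tar.gz
cd apm-server-7.10.0-linux-x86_64
./apm-server -e   # run in the foreground, logging to stderr
------------------------------------------------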

[discrete]
=== Set up APM Agents

TODO: Describe how to set up the agents:

Question: Can we show the setup for one type of agent, then point to related
docs for other agents?

* Java agent (see https://www.elastic.co/blog/using-elastic-apm-java-agent-on-kubernetes-k8s; a possible pod spec sketch follows this list)
* NodeJS Agent
* Python Agent
* ... and so forth
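
For instance, a minimal Java agent setup could attach the agent through JVM
options in the pod spec and point it at APM Server; the agent path, URL, and
service name below are placeholders:

[source,yaml]
------------------------------------------------
env:
  - name: JAVA_TOOL_OPTIONS           # standard JVM hook for appending options
    value: -javaagent:/elastic/apm/agent/elastic-apm-agent.jar
  - name: ELASTIC_APM_SERVER_URL      # where the agent ships its data
    value: http://apm-server:8200
  - name: ELASTIC_APM_SERVICE_NAME    # how the service appears in the APM app
    value: petclinic
------------------------------------------------

The agent jar itself can be baked into the image or copied in through an init
container (the blog post linked above walks through one approach).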


[discrete]
=== Configure

TODO: Describe how to add Kubernetes data to events by adding environment
variables to the K8s pod spec.

Question: Is there a more descriptive title that we can use for this section?
"Configure" seems a bit vague. By reading the docs, it sounds like you sometimes
need to add these variables, but it's not clear when/why you add them.
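
One concrete example that likely belongs here: exposing pod and node
information to the agent through the Kubernetes Downward API. The variable
names below follow the APM agents' Kubernetes metadata conventions, but treat
this as a sketch to verify:

[source,yaml]
------------------------------------------------
env:
  - name: KUBERNETES_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: KUBERNETES_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: KUBERNETES_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: KUBERNETES_POD_UID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid
------------------------------------------------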
319 changes: 319 additions & 0 deletions docs/en/observability/monitor-k8s/monitor-k8s-logs.asciidoc
@@ -0,0 +1,319 @@
[discrete]
[[monitor-kubernetes-logs]]
== Part 1: Monitor logs

[Author: @ChrsMark]

TODO: Provide a brief intro. We'll need to decide how much to cover here vs
the monitoring overview. Again, lots of good introductory info here:
https://www.elastic.co/blog/kubernetes-observability-tutorial-k8s-log-monitoring-and-analysis-elastic-stack


Collecting and analyzing the logs of both Kubernetes core components and the various applications running
on top of Kubernetes is a powerful tool for Kubernetes observability.
Containers running within Kubernetes pods write their logs to stdout or stderr. These logs are written to a
location known to the Kubelet.
All we need in order to collect pod logs is Filebeat running as a DaemonSet in our Kubernetes cluster.
Filebeat can be configured to communicate with the Kubernetes API server, get the list of pods running on the
current host, and collect the logs those pods are producing. The collected logs are annotated with all the relevant
Kubernetes metadata, such as pod ID, container name, container labels, and annotations.


TODO: Completely remove?
[discrete]
=== Logs you should monitor

TODO: Discuss the various logs (Kubernetes cluster, nodes, host, container logs)
that users might want to collect. I'm not sure if this deserves a separate
section or should be part of the previous section.

[discrete]
=== Deploy {filebeat} to collect logs

TODO: Describe how to configure and deploy Filebeat as a daemonset. We should
walk users through the parts of the deployment YAMLs, explain what they do, and point
out the settings they might want to change or add. Rough list of what
we need to cover:

Filebeat should be deployed and running as one instance per Kubernetes host. It communicates with the
Kubernetes API server to retrieve information about the pods running on that host, all the metadata
annotations, and the location of the log files. To deploy Filebeat on our Kubernetes cluster and start
collecting logs, we follow these steps:


* Download the Filebeat deployment manifest

To download the manifest file, run:

["source", "sh", subs="attributes"]
------------------------------------------------
curl -L -O https://raw.githubusercontent.com/elastic/beats/master/deploy/kubernetes/filebeat-kubernetes.yaml
------------------------------------------------

* Set the connection information for Elasticsearch (also describe how to create
secrets)

By default, Filebeat sends events to an existing Elasticsearch deployment,
if present. To specify a different destination, change the following parameters
in the manifest file:

[source,yaml]
------------------------------------------------
- name: ELASTICSEARCH_HOST
  value: elasticsearch
- name: ELASTICSEARCH_PORT
  value: "9200"
- name: ELASTICSEARCH_USERNAME
  value: elastic
- name: ELASTICSEARCH_PASSWORD
  value: changeme
------------------------------------------------

These settings can also be consumed from a Kubernetes Secret.

Create the Secret (`kubectl create secret` base64-encodes the literal value for
you, so pass the plain-text password rather than an already-encoded one):
["source", "sh", subs="attributes"]
------------------------------------------------
$ kubectl create secret generic es-secret --from-literal=password=changeme
------------------------------------------------


Use the secret value in Filebeat's env:
[source,yaml]
------------------------------------------------
env:
  - name: ELASTICSEARCH_PASSWORD
    valueFrom:
      secretKeyRef:
        name: es-secret
        key: password
------------------------------------------------

* Configure log collection:

** Collect Kubernetes logs (cluster, node)
TODO: Do we actually want to mention this? I don't think we have a specific module to do this,
and only `journalbeat` comes to mind for collecting the `kubelet`'s logs, for instance.
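If we do cover it, a possible Journalbeat sketch for collecting kubelet logs on
systemd-based nodes (unverified, for discussion):

[source,yaml]
------------------------------------------------
journalbeat.inputs:
  - paths: []                                # an empty list reads the default journal
    include_matches:
      - "_SYSTEMD_UNIT=kubelet.service"      # only entries from the kubelet unit
------------------------------------------------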

** Collect host logs

** Collect container logs (use inputs)
We should probably explain what the default config with the container input does,
then focus on documenting how to use autodiscovery.

To collect container logs, each Filebeat instance needs access to the local log path, which is
actually a log directory mounted from the host:
[source,yaml]
------------------------------------------------
filebeat.inputs:
- type: container
  paths:
    - /var/log/containers/*.log
------------------------------------------------

With this configuration, Filebeat can collect logs from all the files that exist
under the /var/log/containers/ directory.


* Add metadata to events. Enrich the events with metadata coming from Docker,
Kubernetes, the host, and the cloud provider.

Collecting logs from containers is good; however, adding metadata to these logs is really powerful.
To add Kubernetes and container-related metadata to the logs, we need to expand the previous configuration:

[source,yaml]
------------------------------------------------
filebeat.inputs:
- type: container
  paths:
    - /var/log/containers/*.log
  processors:
    - add_kubernetes_metadata:
        host: ${NODE_NAME}
        matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
------------------------------------------------

** Collect container logs with autodiscover

We have already managed to collect container logs and enrich them with metadata. However, we can take this
further by leveraging the autodiscover mechanism. With autodiscover, Filebeat can detect what kind of
components are running in a pod and decide which logging module to apply to the logs it is processing.

Autodiscover can be configured with static templates:
[source,yaml]
------------------------------------------------
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      templates:
        - condition:
            equals:
              kubernetes.labels.app: "nginx"
          config:
            - module: nginx
              fileset.stdout: access
              fileset.stderr: error
------------------------------------------------

With this template, we can identify any pod labeled as nginx and start collecting its logs using the respective module.

This works, but it requires knowledge of the workloads we are running, and each time we want to monitor
something new, we need to reconfigure and restart Filebeat. To avoid this, we can leverage hints-based autodiscover:

[source,yaml]
------------------------------------------------
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log
------------------------------------------------

and then annotate the Pods accordingly:

[source,yaml]
------------------------------------------------
apiVersion: v1
kind: Pod
metadata:
  name: nginx-autodiscover
  annotations:
    co.elastic.logs/module: nginx
    co.elastic.logs/fileset.stdout: access
    co.elastic.logs/fileset.stderr: error
------------------------------------------------

With this setup, Filebeat identifies the nginx app and starts collecting its logs using the nginx module.

* (Optional) Drop unwanted events

We can extend our configuration with additional processors to drop unwanted events:
[source,yaml]
------------------------------------------------
processors:
  - drop_event:
      when:
        equals:
          kubernetes.container.name: "metricbeat"
------------------------------------------------

* Enrich events with cloud and host metadata

We can add even more metadata to the events by adding the proper processors:

[source,yaml]
------------------------------------------------
processors:
  - add_cloud_metadata:
  - add_host_metadata:
------------------------------------------------

* Deploy Filebeat as a DaemonSet on Kubernetes

** Running Filebeat on master nodes

Kubernetes master nodes can use https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/[taints]
to limit the workloads that can run on them. To run Filebeat on master nodes, you may need to
update the DaemonSet spec to include the proper tolerations:

[source,yaml]
------------------------------------------------
spec:
  tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
------------------------------------------------

** Deploy

To deploy Filebeat to Kubernetes, run:

["source", "sh", subs="attributes"]
------------------------------------------------
kubectl create -f filebeat-kubernetes.yaml
------------------------------------------------

To check the status, run:

["source", "sh", subs="attributes"]
------------------------------------------------
$ kubectl --namespace=kube-system get ds/filebeat

NAME       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE-SELECTOR   AGE
filebeat   32        32        0       32           0           <none>          1m
------------------------------------------------

Log events should start flowing to Elasticsearch.
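
One quick way to verify is to count documents in the Filebeat indices (the
endpoint and credentials below are placeholders matching the example
configuration above):

["source", "sh", subs="attributes"]
------------------------------------------------
curl -u elastic:changeme 'http://elasticsearch:9200/filebeat-*/_count?pretty'
------------------------------------------------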


** Red Hat OpenShift configuration

If you are using Red Hat OpenShift, you need to specify additional settings in
the manifest file and enable the container to run as privileged.

. Modify the `DaemonSet` container spec in the manifest file:
+
[source,yaml]
-----
securityContext:
  runAsUser: 0
  privileged: true
-----

. Grant the `filebeat` service account access to the privileged SCC:
+
[source,shell]
-----
oc adm policy add-scc-to-user privileged system:serviceaccount:kube-system:filebeat
-----
+
This command enables the container to be privileged as an administrator for
OpenShift.

. Override the default node selector for the `kube-system` namespace (or your
custom namespace) to allow for scheduling on any node:
+
[source,shell]
----
oc patch namespace kube-system -p \
'{"metadata": {"annotations": {"openshift.io/node-selector": ""}}}'
----
+
This command sets the node selector for the project to an empty string. If you
don't run this command, the default node selector will skip master nodes.


[discrete]
=== View logs in {kib}

==== Using the Logs app in Kibana

TODO: Describe how to view logs in Kibana. Show how to use the Logs app and how to
set up and view pre-built dashboards and visualizations.

The https://www.elastic.co/log-monitoring[Logs app] in Kibana allows you to search, filter, and tail all the logs
collected into the Elastic Stack. Instead of having to ssh into different servers, cd into a directory,
and tail individual files, all the logs are available in one tool under the Logs app:

* Filter logs by keyword or plain text search.
* Move back and forth in time using the time picker or the timeline view on the side.
* To watch the logs update in front of you, `tail -f` style, click the Streaming button
and use highlighting to accentuate the important bit of info you are waiting to see.

TODO: Add screenshot here?


==== Out-of-the-box Kibana visualizations

When we run the filebeat-setup job, among other things, it pre-creates a set of
https://www.elastic.co/guide/en/beats/filebeat/7.8/view-kibana-dashboards.html[out-of-the-box dashboards] in Kibana.
Once our sample petclinic application is deployed, we can navigate to the out-of-the-box Filebeat
dashboards for MySQL and NGINX and see that Filebeat modules not only capture logs, but can also capture metrics
that the components log. Enabling these visualizations requires running the MySQL and NGINX components
of the example application.
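
For reference, one way to load the dashboards is a one-off `filebeat setup` run
inside one of the Filebeat pods (this assumes the DaemonSet from the standard
manifest in the `kube-system` namespace; adjust to the actual deployment):

["source", "sh", subs="attributes"]
------------------------------------------------
kubectl exec -n kube-system ds/filebeat -- filebeat setup --dashboards
------------------------------------------------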