Monitoring
benoit74 edited this page Sep 4, 2023 · 7 revisions
The technical monitoring of our infrastructure is based on:
- UptimeRobot for external monitoring of our web properties
- Grafana for monitoring of our servers
We use a free Grafana Cloud instance, available at https://kiwixorg.grafana.net/. This instance is configured only for k8s logs and metrics.
The configuration is based on https://grafana.com/docs/grafana-cloud/monitor-infrastructure/kubernetes-monitoring/configuration/config-k8s-agent-flow/.
Configuration is deployed via Helm, see https://github.com/kiwix/k8s/tree/main/grafana
Architecture:
- Grafana Cloud provides us with:
  - a Grafana instance displaying dashboards
  - a Prometheus instance: scrapes / stores metrics and responds to queries
  - a Loki instance: stores logs and responds to queries
- We host in our grafana namespace:
  - kube-state-metrics (deployment): service that listens to the Kubernetes API server and generates metrics about the state of the objects
  - opencost (deployment): measures infrastructure costs
  - prometheus-operator-crd (not used yet): operator to configure Prometheus based on k8s resources
  - prometheus-node-exporter (daemonset): runs on each k8s node and grabs metrics at the node level
  - grafana-agent (statefulset): agent grabbing metrics (from kube-state-metrics, node-exporter, kubelet, cadvisor, opencost) and sending them to Prometheus
  - grafana-agent-logs (daemonset): same binary as above, but grabbing logs (Pods + Cluster events) and sending them to Loki
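The data flow described in this list can be summarised as a small Python model. This is only an illustrative sketch: the `Collector` class, the helper functions and the short source labels (`pods`, `cluster-events`) are hypothetical names chosen for this example and are not part of the deployed Helm configuration.

```python
# Illustrative model of the pipeline described in the architecture list.
# The structure and helper names are assumptions made for this sketch,
# not the actual deployed configuration.
from dataclasses import dataclass


@dataclass(frozen=True)
class Collector:
    name: str       # workload name in the grafana namespace
    kind: str       # Kubernetes workload kind
    sources: tuple  # where the collector reads data from
    backend: str    # Grafana Cloud service receiving the data


COLLECTORS = (
    Collector(
        name="grafana-agent",
        kind="statefulset",
        sources=("kube-state-metrics", "node-exporter", "kubelet",
                 "cadvisor", "opencost"),
        backend="Prometheus",
    ),
    Collector(
        name="grafana-agent-logs",
        kind="daemonset",
        sources=("pods", "cluster-events"),  # hypothetical labels for log sources
        backend="Loki",
    ),
)


def backend_for(source: str) -> str:
    """Return the Grafana Cloud backend that stores data from `source`."""
    for collector in COLLECTORS:
        if source in collector.sources:
            return collector.backend
    raise KeyError(f"no collector reads from {source!r}")


def sources_by_backend() -> dict:
    """Group every source under the backend it is shipped to."""
    grouped = {}
    for collector in COLLECTORS:
        grouped.setdefault(collector.backend, []).extend(collector.sources)
    return grouped


if __name__ == "__main__":
    for backend, sources in sources_by_backend().items():
        print(f"{backend}: {', '.join(sources)}")
```

Reading the model this way makes the split explicit: metrics from every exporter go through grafana-agent to Prometheus, while logs and cluster events go through grafana-agent-logs to Loki.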
Grafana Agent is installed in Flow mode.
ToDo