I decided to deploy Prometheus and other stuff using Docker Compose. I'm just doing homework so Docker is affordable in my case. For Production environment i would rather stick to Podman or CRI-O or something like that. (Security issues and non-root privileges escalations and so on, you know)
Let's pretend you can ssh to a Linux machine with Docker Engine and Docker Compose installed and docker systemd service is configured and enabled and you don't have to use sudo to launch containters and the following command gives you the proper output:
docker-compose --version
git clone https://github.com/ashliakhov/prom_homework.git
cd prom_homework
docker-compose up -d
The content of the Docker Compose file is as follows:
# docker-compose.yml
version: '3'
services:
prometheus:
image: prom/prometheus:latest # We will NOT pull tag:latest in PROD before we check the changelog and TEST it before for sure
container_name: prometheus
volumes:
- ./config/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
- ./config/prometheus/alertrules.yml:/etc/prometheus/alertrules.yml
- ./config/prometheus/targets.json:/etc/prometheus/targets.json
command:
- '--config.file=/etc/prometheus/prometheus.yml'
ports:
- '9090:9090'
restart: always # To restart a container after reboot or on failure
node-exporter:
image: prom/node-exporter:latest
container_name: node-exporter
ports:
- '9100:9100'
restart: always
We can check if everything is up and running:
docker ps
curl localhost:9090
Prometheus scrapes Node exporter metrics which is defined in ./config/prometheus/prometheus.yml (File-based service discovery is used)
# ./config/prometheus/prometheus.yml
global:
scrape_interval: 15s
evaluation_interval: 15s
rule_files:
- '/etc/prometheus/alertrules.yml'
scrape_configs:
- job_name: 'node_exporter'
file_sd_configs: # File-based service discovery config
- files:
- '/etc/prometheus/targets.json'
Scraped targets are described in a JSON file:
# ./config/prometheus/targets.json
[
{
"labels": {
"job": "node_exporter"
},
"targets": [
"localhost:9100"
]
}
]
and available @ http://localhost:9090/targets
Alert rule is defined in ./config/prometheus/alertrules.yml
# ./config/prometheus/alertrules.yml
groups:
- name: Node Exporter metrics
rules:
- alert: Node Exporter is down
expr: up{job="node_exporter"} == 0 # Job called "node_exporter" is not running
for: 1m
labels:
severity: CRITICAL
annotations:
title: Node {{ $labels.instance }} is down
description: Failed to scrape {{ $labels.job }} on {{ $labels.instance }} for more than 1 minute.
Alerts info is available @ http://localhost:9090/alerts