Core services healthchecks
This page lists the checks that should be run to validate the deployment of the core services.
`curl` checks should be run from a container on the overlay network shared by the tested service.
`logs` checks can be run from a manager node.
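As a minimal sketch of running a `curl` check from inside the overlay network, the helper below starts a throwaway container attached to that network. The network name `ampnet`, the `curlimages/curl` image, and the `overlay_curl` helper name are assumptions for illustration, not taken from this page. It also assumes the overlay network was created as attachable, since a standalone container started with `docker run` can only join a swarm overlay network in that case.

```shell
#!/bin/sh
# Assumptions (not from this page): the overlay network name, the
# curlimages/curl image, and the overlay_curl helper name.
NETWORK="${NETWORK:-ampnet}"
CURL_IMAGE="${CURL_IMAGE:-curlimages/curl}"
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print the command instead of running it

# usage: overlay_curl URL
overlay_curl() {
  # -s hides progress output, -f turns HTTP errors into a non-zero exit code,
  # -o /dev/null discards the body: only the exit status matters here.
  $DOCKER run --rm --network "$NETWORK" "$CURL_IMAGE" -sf -o /dev/null "$1"
}
```

For example, `overlay_curl "nats:8222/subsz"` would exit with status 0 only if the NATS monitoring endpoint answers successfully from within the network.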
Each check should be retried until it succeeds or its timeout expires.
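The retry behavior can be sketched as a small shell function. This is only an illustration: the `retry_until` name and the one-second polling interval are assumptions, not something this page prescribes.

```shell
#!/bin/sh
# Hypothetical helper: retry a command once per second until it succeeds
# or the timeout (in seconds) expires.
# usage: retry_until TIMEOUT_SECONDS COMMAND [ARGS...]
retry_until() {
  timeout_sec=$1
  shift
  deadline=$(( $(date +%s) + timeout_sec ))
  while ! "$@"; do
    # Give up once the deadline has passed.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}
```

A check made of a single command can be passed directly, for instance `retry_until 10 curl -sf "nats:8222/subsz"`. A check that uses a pipe has to be wrapped in a shell, for instance `retry_until 5 sh -c 'docker service logs amp_ampbeat 2>&1 | grep -q "INFO ampbeat is running"'`.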
order | service | depends on | context | test | command | where (node label) | timeout (sec) |
---|---|---|---|---|---|---|---|
1 | elasticsearch | | single mode | cluster health should be yellow | `curl -sf "elasticsearch:9200/_cluster/health?wait_for_status=yellow&timeout=15s"` | core | 15 |
1 | elasticsearch | | cluster mode | cluster health should be green | `curl -sf "elasticsearch:9200/_cluster/health?wait_for_status=green&timeout=30s"` | core | 30 |
2 | etcd | | * | endpoint health | `etcdctl --endpoints "http://etcd:2379" endpoint health \| grep -qw healthy` | core | 10 |
3 | nats | | * | api availability | `curl -sf "nats:8222/subsz"` | core | 10 |
4 | ampbeat | nats, elasticsearch | | | `docker service logs amp_ampbeat 2>&1 \| grep -q "INFO ampbeat is running"` | manager | 5 |
5 | kibana | elasticsearch, ampbeat | | UI availability | `curl -sf "kibana:5601/app/kibana#/discover"` | core | 15 |
6 | node_exporter | | | metrics URL | `curl -sf "node-exporter:9100/metrics"` | metrics | 5 |
7 | nats_exporter | nats | | metrics URL | `curl -sf "nats-exporter:7777/metrics"` | metrics | 5 |
8 | haproxy_exporter | haproxy | | metrics URL | `curl -sf "haproxy-exporter:9101/metrics"` | metrics | 5 |
9 | prometheus | node_exporter, nats_exporter, haproxy_exporter | | status URL | `curl -sf "prometheus:9090/status"` | metrics | 10 |
10 | alertmanager | prometheus | | metrics URL | `curl -sf "alertmanager:9093/metrics"` | metrics | 10 |
11 | grafana | | | org info | `curl -sf "grafana:3000/api/org"` | metrics | 10 |
12 | proxy | | | stats | `curl -sf "http://stats:stats@proxy:1936/haproxy?stats;csv"` | route | 10 |
13 | amplifier | nats, elasticsearch, etcd | | metrics URL | `curl -sf "amplifier:5100/metrics"` | core | 10 |
14 | gateway | amplifier | | | `docker service logs amp_gateway 2>&1 \| grep -q "gateway successfully initialized"` | manager | 5 |
15 | agent | nats | | | `docker service logs amp_agent 2>&1 \| grep -q "Connected to nats streaming successfully"` | manager | 5 |
16 | portal | gateway | | home page | `curl -sf "http://portal/"` | core | 5 |