Core services healthchecks
Nicolas Degory edited this page Sep 8, 2017 · 11 revisions
This page lists the checks to run to validate the deployment of the core services.

`curl` checks should be run from a container attached to the overlay network shared with the service under test.

`logs` checks can be run from a manager node.

Each command should be run after a stabilization delay.

These checks are implemented in the `*.test.yml` files among the `cluster/agent/stacks` stack files (part of the agent image).

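As a rough illustration of the first rule, a `curl` check could be launched by hand from a throwaway container attached to the relevant overlay network. This is a sketch, not what the `*.test.yml` files do: the function name `curl_check`, the `DRY_RUN` switch, and the use of the public `curlimages/curl` image are assumptions, the network name must be supplied by the caller, and the network has to be attachable for `docker run` to join it.

```shell
#!/bin/sh
# Hypothetical wrapper: run a curl check from a throwaway container attached
# to the overlay network of the service under test.
# Assumptions: the caller passes the network name, the network was created
# as attachable, and the public curlimages/curl image is acceptable.
# Set DRY_RUN=1 to print the docker command instead of executing it.
curl_check() {
  network=$1
  url=$2
  # Build the docker invocation as the function's positional parameters.
  set -- docker run --rm --network "$network" curlimages/curl -sf "$url"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"      # print the command only
  else
    "$@"           # run it; the exit status is curl's exit status
  fi
}
```

For example, `curl_check <network> "nats:8222/subsz"` would run the nats check listed on this page, where `<network>` stands for the overlay network of the stack.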
order | service | depends on | context | test | command | where (node label) | delay (sec) | timeout (sec) |
---|---|---|---|---|---|---|---|---|
1 | elasticsearch | | single mode | cluster health should be yellow | `curl -sf "elasticsearch:9200/_cluster/health?wait_for_status=yellow&timeout=15s"` | core | 10 | 30 |
1 | elasticsearch | | cluster mode | cluster health should be green | `curl -sf "elasticsearch:9200/_cluster/health?wait_for_status=green&timeout=30s"` | core | 15 | 45 |
2 | etcd | | * | endpoint health | `etcdctl --endpoints "http://etcd:2379" endpoint health \| grep -qw healthy` | core | 10 | 30 |
3 | nats | | * | api availability | `curl -sf "nats:8222/subsz"` | core | 5 | 20 |
4 | ampbeat | nats, elasticsearch | | | `docker service logs amp_ampbeat 2>&1 \| grep -q "INFO ampbeat is running"` | manager | 5 | 20 |
5 | kibana | elasticsearch, ampbeat | | UI availability | `curl -sf "kibana:5601/app/kibana#/discover"` | core | 10 | 30 |
6 | node_exporter | | | metrics URL | `curl -sf "node-exporter:9100/metrics"` | metrics | 5 | 20 |
7 | nats_exporter | nats | | metrics URL | `curl -sf "nats-exporter:7777/metrics"` | metrics | 5 | 20 |
8 | haproxy_exporter | haproxy | | metrics URL | `curl -sf "haproxy-exporter:9101/metrics"` | metrics | 5 | 20 |
9 | prometheus | node_exporter, nats_exporter, haproxy_exporter | | status URL | `curl -sf "prometheus:9090/status"` | metrics | 8 | 30 |
10 | alertmanager | prometheus | | metrics URL | `curl -sf "alertmanager:9093/metrics"` | metrics | 5 | 20 |
11 | grafana | | | org info | `curl -sf "grafana:3000/api/org"` | metrics | 10 | 30 |
12 | proxy | | | stats | `curl -sf "http://stats:stats@proxy:1936/haproxy\?stats\;csv"` | route | 5 | 20 |
13 | amplifier | nats, elasticsearch, etcd | | metrics URL | `curl -sf "amplifier:5100/metrics"` | core | 6 | 30 |
14 | gateway | amplifier | | | `docker service logs amp_gateway 2>&1 \| grep -q "gateway successfully initialized"` | manager | 5 | 30 |
15 | agent | nats | | | `docker service logs amp_agent 2>&1 \| grep -q "Connected to nats streaming successfully"` | manager | 5 | 30 |
16 | portal | gateway | | home page | `curl -sf "http://portal/"` | core | 5 | 30 |
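
The delay and timeout columns can be read as parameters of a small runner. The sketch below is one possible interpretation, not the behavior of the stack files: `run_check` is a hypothetical name, and it assumes that the timeout is the number of one-second retries allowed after the stabilization delay.

```shell
#!/bin/sh
# Hypothetical helper: wait DELAY seconds, then run COMMAND once per second
# until it succeeds, giving up after TIMEOUT failed attempts.
# Usage: run_check DELAY TIMEOUT COMMAND [ARGS...]
run_check() {
  delay=$1
  timeout=$2
  shift 2
  sleep "$delay"                 # stabilization delay before the first attempt
  attempts=0
  while ! "$@"; do               # run the check command exactly as given
    attempts=$((attempts + 1))
    if [ "$attempts" -ge "$timeout" ]; then
      return 1                   # still failing once the budget is used up
    fi
    sleep 1                      # pause before the next attempt
  done
  return 0
}
```

With this helper, the nats row would be `run_check 5 20 curl -sf "nats:8222/subsz"`. A command containing a pipe has to be wrapped in a shell, for example `run_check 10 30 sh -c 'etcdctl --endpoints "http://etcd:2379" endpoint health | grep -qw healthy'`.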