Added Kube manifests for GCS-enabled setup + store. Added small tutorial. (thanos-io#61)

* Added Kube manifests for GCS-enabled setup + store.

Signed-off-by: Bartek Plotka <[email protected]>

* Addressed issues.

Signed-off-by: Bartek Plotka <[email protected]>
bwplotka authored and fabxc committed Nov 16, 2017
1 parent 30b6221 commit 18647d5
Showing 12 changed files with 235 additions and 60 deletions.
1 change: 0 additions & 1 deletion Dockerfile
Original file line number Diff line number Diff line change
@@ -3,5 +3,4 @@ LABEL maintainer="The Thanos Authors"

COPY thanos /bin/thanos

USER nobody
ENTRYPOINT [ "/bin/thanos" ]
60 changes: 41 additions & 19 deletions kube/README.md
@@ -17,29 +17,51 @@ To use cluster from your terminal do:
`source ./kube/envs.sh`

From now on you can use the `kubectl` and `minikube` commands, including `minikube stop` to stop the whole cluster.

## Start Thanos service for Thanos gossip peers

This allows query to discover thanos services.
## Example setup.

```bash
echo "Starting Thanos service for gathering all thanos gossip peers."
kubectl apply -f manifests/thanos
This directory contains all the k8s manifests required to start an example setup that includes:
- Thanos headless service for discovery purposes.
- Prometheus + Thanos sidecar.
- Thanos query node

```

## Start Prometheus with Thanos sidecar
This setup will have GCS upload disabled, but will show how we can proxy requests from Prometheus.

This example can be easily extended to show the HA Prometheus use case. (TODO)

To run example setup:
1. `bash kube/apply-example.sh`

You will now be able to reach Prometheus at http://prometheus.default.svc.cluster.local:9090/graph
and the Thanos Query UI at http://thanos-query.default.svc.cluster.local:19099/graph

Thanos Query UI should show exactly the same data as Prometheus.
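
The URLs above follow the standard in-cluster DNS pattern `<service>.<namespace>.svc.cluster.local`. A minimal sketch of how they are composed is shown here; the helper name `service_url` is hypothetical (it is not part of this repository), and `cluster.local` is assumed to be the cluster domain, which is the Kubernetes default:

```bash
#!/usr/bin/env bash

# service_url is a hypothetical helper (not part of this repository).
# It builds the in-cluster URL of a Kubernetes service, assuming the
# default cluster domain "cluster.local".
# Arguments: service name, port, optional path, optional namespace.
service_url() {
  local service="$1" port="$2" path="${3:-/}" namespace="${4:-default}"
  echo "http://${service}.${namespace}.svc.cluster.local:${port}${path}"
}

# The two endpoints of the example setup.
service_url prometheus 9090 /graph
service_url thanos-query 19099 /graph
```

The namespace defaults to `default` because the manifests in this directory do not set one.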

To tear down example setup:
1. `bash kube/delete-example.sh`

## Long term storage setup

This example runs a setup that uploads blocks to GCS for long-term storage. It includes:
- Thanos headless service for discovery purposes.
- Prometheus + Thanos sidecar with GCS shipper configured
- Thanos query node
- Thanos store node.

To run example setup:
1. Create a GCS bucket in your GCP project. Either name it "thanos-test" or put its name into the `--gcs.bucket` flag in:
* manifests/prometheus-gcs/deployment.yaml
* manifests/thanos-store/deployment.yaml
2. Create a service account that has permission to access this bucket.
3. Download the JSON credentials for the service account and run: `kubectl create secret generic gcs-credentials --from-file=<your-json-file>`. The file must be named `gcs-credentials.json`, because the manifests read the credentials from `/creds/gcs-credentials.json`.
4. Run `bash kube/apply-lts.sh`
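
Step 3 can be sketched as a small wrapper. The manifests set `GOOGLE_APPLICATION_CREDENTIALS=/creds/gcs-credentials.json` and mount the secret at `/creds/`, so the key inside the secret has to be `gcs-credentials.json`; the `--from-file=<key>=<path>` form of `kubectl create secret generic` sets that key explicitly. The helper name `create_gcs_secret` and the `KUBECTL` override variable are assumptions made for illustration, not part of this repository:

```bash
#!/usr/bin/env bash

# create_gcs_secret is a hypothetical helper (not part of this repository).
# KUBECTL is an assumed override hook, handy for dry runs (e.g. KUBECTL=echo).
create_gcs_secret() {
  local key_file="$1"
  if [ ! -f "${key_file}" ]; then
    echo "credentials file not found: ${key_file}" >&2
    return 1
  fi
  # The manifests read /creds/gcs-credentials.json, so the key stored in the
  # secret is named gcs-credentials.json regardless of the local file name.
  "${KUBECTL:-kubectl}" create secret generic gcs-credentials \
    --from-file=gcs-credentials.json="${key_file}"
}

# Example (the path is a placeholder for your downloaded key):
# create_gcs_secret "${HOME}/Downloads/my-service-account.json"
```

Running it with `KUBECTL=echo` prints the command that would be executed instead of contacting the cluster.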

```bash
echo "Starting Prometheus pod with sidecar."
kubectl apply -f kube/manifests/prometheus
```
You will now be able to reach Prometheus at http://prometheus-gcs.default.svc.cluster.local:9090/graph
and the Thanos Query UI at http://thanos-query.default.svc.cluster.local:19099/graph

## Start query node targeting Prometheus sidecar
Thanos Query UI should show exactly the same data as Prometheus, plus older data once the setup has been running for longer than 12h.

```bash
echo "Starting Thanos query pod targeting sidecar."
kubectl apply -f kube/manifests/thanos-query
```
After 3h the sidecar should upload the first block to GCS. You can speed this up by setting the Prometheus `storage.tsdb.{min,max}-block-duration` flags to a smaller value (e.g. 20m).
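
One way to try a shorter block duration without editing the manifest by hand is to rewrite both flags on the fly. This is only a sketch: `set_block_duration` is a hypothetical helper, not part of this repository, and it assumes the flags appear in the manifest as quoted `--storage.tsdb.min-block-duration=...` and `--storage.tsdb.max-block-duration=...` arguments, as they do in `kube/manifests/prometheus-gcs/deployment.yaml`:

```bash
#!/usr/bin/env bash

# set_block_duration is a hypothetical helper (not part of this repository).
# It prints the given manifest with both Prometheus block-duration flags
# set to the requested value, leaving the file itself untouched.
set_block_duration() {
  local manifest="$1" duration="$2"
  sed -E "s/(--storage\.tsdb\.(min|max)-block-duration=)[^\"]*/\1${duration}/" "${manifest}"
}

# Example: preview the GCS-enabled Prometheus manifest with 20m blocks,
# then pipe the output to `kubectl apply -f -` once it looks right.
# set_block_duration kube/manifests/prometheus-gcs/deployment.yaml 20m
```

Printing to stdout instead of editing in place keeps the checked-in manifest at its 2h defaults.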

You can invoke `bash kube/apply-example.sh` that will do all these steps.
To tear down example setup:
1. `bash kube/delete-lts.sh`
19 changes: 19 additions & 0 deletions kube/apply-lts.sh
@@ -0,0 +1,19 @@
#!/usr/bin/env bash

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

source "${DIR}/envs.sh"

cd "${DIR}"

echo "Starting Thanos service for gathering all thanos gossip peers."
kubectl apply -f manifests/thanos

echo "Starting Prometheus pod with sidecar."
kubectl apply -f manifests/prometheus-gcs

echo "Starting Thanos query pod targeting sidecar."
kubectl apply -f manifests/thanos-query

echo "Starting Thanos store pod."
kubectl apply -f manifests/thanos-store
11 changes: 11 additions & 0 deletions kube/delete-lts.sh
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

source "${DIR}/envs.sh"

cd "${DIR}"
kubectl delete -f manifests/thanos
kubectl delete -f manifests/prometheus-gcs
kubectl delete -f manifests/thanos-query
kubectl delete -f manifests/thanos-store
31 changes: 0 additions & 31 deletions kube/manifests/local/deployment.yaml

This file was deleted.

@@ -1,7 +1,7 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus
name: prometheus-gcs
data:
prometheus.yaml: |
global:
76 changes: 76 additions & 0 deletions kube/manifests/prometheus-gcs/deployment.yaml
@@ -0,0 +1,76 @@
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus-gcs
  labels:
    app: prometheus-gcs
    thanos-peer: "true"
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus-gcs
        thanos-peer: "true"
    spec:
      containers:
      - image: prom/prometheus:v2.0.0
        args: [
          "--config.file=/etc/prometheus/config/prometheus.yaml",
          "--storage.tsdb.path=/data",
          "--storage.tsdb.min-block-duration=2h",
          "--storage.tsdb.max-block-duration=2h",
          "--storage.tsdb.retention=12h"
        ]
        name: prometheus
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus/config
        - name: tsdb-data
          mountPath: /data
      # To use your own Thanos image, run `eval $(minikube docker-env) && make docker` and place thanos/thanos:latest here.
      # TODO(bplotka): With vm-driver=none, images are not recognized even though it uses the same docker. Investigate.
      - image: bplotka/thanos:latest
        imagePullPolicy: Always
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /creds/gcs-credentials.json
        args: [
          "sidecar",
          "--log.level=debug",
          "--debug.name=sidecar",
          "--api-address=0.0.0.0:19090", # This address will be properly deduced and propagated via cluster.PeerState.
          "--metrics-address=0.0.0.0:19190",
          "--prometheus.url=http://localhost:9090",
          "--tsdb.path=/data",
          "--cluster.address=0.0.0.0:19390",
          "--cluster.peers=thanos.default.svc.cluster.local:19390",
          # This bucket must be created in GCS before the sidecar starts.
          "--gcs.bucket=thanos-test"
        ]
        name: thanos
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        volumeMounts:
        - name: tsdb-data
          mountPath: /data
        - name: gcs-credentials
          mountPath: /creds/
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-gcs
      - name: tsdb-data
        emptyDir: {}
      - name: gcs-credentials
        secret:
          defaultMode: 420
          # gcs-credentials secret with single file gcs-credentials.json is required.
          secretName: gcs-credentials
      terminationGracePeriodSeconds: 300
23 changes: 23 additions & 0 deletions kube/manifests/prometheus-gcs/service.yaml
@@ -0,0 +1,23 @@
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-gcs
  name: prometheus-gcs
spec:
  externalTrafficPolicy: Cluster
  ports:
  - port: 9090
    protocol: TCP
    targetPort: 9090
    name: http-prometheus
  - port: 19190
    protocol: TCP
    targetPort: 19190
    name: http-sidecar-metrics
  selector:
    app: prometheus-gcs
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}
4 changes: 3 additions & 1 deletion kube/manifests/prometheus/deployment.yaml
@@ -18,6 +18,8 @@ spec:
args: [
"--config.file=/etc/prometheus/config/prometheus.yaml",
"--storage.tsdb.path=/data",
"--storage.tsdb.min-block-duration=2h",
"--storage.tsdb.max-block-duration=2h",
]
name: prometheus
resources:
@@ -43,7 +45,7 @@ spec:
"--tsdb.path=/data",
"--cluster.address=0.0.0.0:19390",
"--cluster.peers=thanos.default.svc.cluster.local:19390",
# "--gcs.bucket=<bucket>" Add bucket and service account to enable shipping blocks to GCS.
# This is an example of running the sidecar without shipping data to GCS.
]
name: thanos
resources:
54 changes: 54 additions & 0 deletions kube/manifests/thanos-store/deployment.yaml
@@ -0,0 +1,54 @@
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: thanos-store
  labels:
    app: thanos-store
    thanos-peer: "true"
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: thanos-store
        thanos-peer: "true"
    spec:
      containers:
      # To use your own Thanos image, run `eval $(minikube docker-env) && make docker` and place thanos/thanos:latest here.
      # TODO(bplotka): With vm-driver=none, images are not recognized even though it uses the same docker. Investigate.
      - image: bplotka/thanos:latest
        imagePullPolicy: Always
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /creds/gcs-credentials.json
        args: [
          "store",
          "--log.level=debug",
          "--debug.name=store",
          "--api-address=0.0.0.0:19090", # This address will be properly deduced and propagated via cluster.PeerState.
          "--metrics-address=0.0.0.0:19190",
          "--tsdb.path=/data",
          "--cluster.address=0.0.0.0:19390",
          "--cluster.peers=thanos.default.svc.cluster.local:19390",
          "--gcs.bucket=thanos-test"
        ]
        name: thanos
        resources:
          requests:
            cpu: 200m
            memory: 200Mi
        volumeMounts:
        - mountPath: /creds/
          name: gcs-credentials
          readOnly: true
        - name: tsdb-data
          mountPath: /data
      volumes:
      - name: gcs-credentials
        secret:
          defaultMode: 420
          # gcs-credentials secret with single file gcs-credentials.json is required.
          secretName: gcs-credentials
      - name: tsdb-data
        emptyDir: {}
      terminationGracePeriodSeconds: 300
@@ -2,18 +2,17 @@ apiVersion: v1
kind: Service
metadata:
labels:
app: prometheus
name: prometheus
app: thanos-store
name: thanos-store
spec:
clusterIP: 10.0.0.88
externalTrafficPolicy: Cluster
ports:
- nodePort: 30600
port: 9090
- port: 19190
protocol: TCP
targetPort: 9090
targetPort: 19190
name: http-store-metrics
selector:
app: prometheus
app: thanos-store
sessionAffinity: None
type: NodePort
status:
1 change: 1 addition & 0 deletions pkg/cluster/stores.go
Original file line number Diff line number Diff line change
@@ -52,6 +52,7 @@ func (s *StoreSet) Update(ctx context.Context) {
level.Warn(s.logger).Log("msg", "dialing connection failed; skipping", "store", addr, "err", err)
continue
}
level.Debug(s.logger).Log("msg", "successfully made grpc connection", "store", addr)
store := &storeInfo{conn: conn}
s.stores[addr] = store
}