Getting Counters into SD #144

Closed

bill-within opened this issue Aug 6, 2019 · 15 comments

Labels: question (Further information is requested)
@bill-within

Hi,
Is it possible to get Counter-type metrics into SD, high-cardinality or not? I see the Cumulative aggregator, and I could manually create one for each of my individual counters, but that's not exactly ideal.

Prometheus has a Counter:
https://prometheus.io/docs/concepts/metric_types/

Stackdriver has the same concept, only they call it Cumulative:
https://cloud.google.com/monitoring/api/v3/metrics-details#metric-kinds

& I'd like to just map them 1-to-1 without having to update the stackdriver-prometheus-sidecar config every time I add a new one. I've been using Gauges for everything instead, but that feels a little dirty.

Thanks!

@jkohen
Contributor

jkohen commented Aug 6, 2019

@bill-within counter metrics are supported out of the box. Are you having any issues with that?

The aggregator feature is only intended to reduce cardinality.

jkohen self-assigned this Aug 6, 2019
jkohen added the question (Further information is requested) label Aug 6, 2019
@bill-within
Author

@jkohen I am. When I define something as a counter, Prometheus picks it up OK (adding _total and _created), but it never shows up in Stackdriver. Any Gauges I create come through just fine, however.

@bill-within
Author

& no errors/warnings in the prom or sidecar logs

@jkohen
Contributor

jkohen commented Aug 6, 2019

Can you share your Prometheus and sidecar configurations? It would be useful to see the /metrics page for the metrics that aren't making it to Stackdriver.

Assigning to @bmoyles0117 after initial triage.

jkohen assigned bmoyles0117 and unassigned jkohen Aug 6, 2019
@bill-within
Author

You bet!

Relevant config snippet:

kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-configmap
  namespace: prometheus
data:
  prometheus.yml: |-
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
...
      - job_name: 'apps'
        kubernetes_sd_configs:
        - role: pod
          namespaces:
            names:
              - dev
              - test
              - sand
        relabel_configs:
        - action: keep
          source_labels: [__meta_kubernetes_pod_container_name]
          regex: (example-service1)
        - action: keep
          source_labels: [__meta_kubernetes_pod_container_port_name]
          regex: (metrics)
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - action: replace
          source_labels: [__meta_kubernetes_namespace]
          target_label: namespace
        - action: replace
          source_labels: [__meta_kubernetes_pod_name]
          target_label: pod_name

Deployment args:

      - name: prometheus
        image: gcr.io/xxx/prometheus:v2.11.1
        volumeMounts:
          - name: prometheus-config-volume
            mountPath: /etc/prometheus/
          - name: prometheus-storage-volume
            mountPath: /prometheus/
        args:
          - --config.file=/etc/prometheus/prometheus.yml
          - --storage.tsdb.path=/prometheus/
...
      - name: sidecar
        image: gcr.io/xxx/stackdriver-prometheus-sidecar:0.4.3
        imagePullPolicy: Always
        args:
          - "--stackdriver.project-id=xxx"
          - "--prometheus.wal-directory=/prometheus/wal"
          - "--stackdriver.kubernetes.location=us-west2"
          - "--stackdriver.kubernetes.cluster-name=k8s-dev-1"

/metrics page snippet:

....
# HELP time_service_calls_total Number of calls to the backend time service
# TYPE time_service_calls_total counter
time_service_calls_total 48.0
# TYPE time_service_calls_created gauge
time_service_calls_created 1.5651240328555365e+09

Metric created in app via:

from prometheus_client import start_http_server, Counter
...
BACKEND_CALLS = Counter('time_service_calls', 'Number of calls to the backend time service')

Visible in Prometheus if I query 'time_service_calls_total':

Element 
time_service_calls_total{app="example-service1",instance="10.52.0.57:9090",job="apps",namespace="dev",pod_name="example-service1-579b595567-scpvp",pod_template_hash="579b595567"} 

Value
48
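
One way to confirm on the Stackdriver side whether the series ever arrives is to query the Monitoring API for the metric type the sidecar writes (by default under the external.googleapis.com/prometheus/ prefix). A minimal sketch, assuming the google-cloud-monitoring Python client 2.x; the exact calls below are an assumption for illustration, not something from this thread:

import time
from google.cloud import monitoring_v3

project_id = "xxx"  # placeholder, same as --stackdriver.project-id above
client = monitoring_v3.MetricServiceClient()

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 3600}}
)

# Look for the counter under the sidecar's default metric-type prefix.
results = client.list_time_series(
    request={
        "name": f"projects/{project_id}",
        "filter": 'metric.type = "external.googleapis.com/prometheus/time_service_calls_total"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

# If the counter is reaching Stackdriver, this prints at least one series.
for series in results:
    print(series.resource.labels, series.points[0].value)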

@bmoyles0117
Contributor

Thanks for the configs; I'm going to start investigating now and will report back with what I find.

@bmoyles0117
Contributor

Before I dig much deeper, I'm noticing an inconsistency between the metric names you're defining and the ones you're searching for.

Can you help me understand how BACKEND_CALLS is being translated into two metrics, _total and _created? We might be using different versions of Python, as when I run this code I only get a single metric.

from prometheus_client import start_http_server, Counter
import random
import time

BACKEND_CALLS = Counter('time_service_calls', 'Number of calls to the backend time service')

start_http_server(8080)

while True:
  time.sleep(random.random())
  BACKEND_CALLS.inc()

When I curl the endpoint, I get the following; notice the metric names:

$ curl -qs http://127.0.0.1:8080/metrics | grep time_service                                            
# HELP time_service_calls Number of calls to the backend time service
# TYPE time_service_calls counter
time_service_calls 12.0
$ curl -qs http://127.0.0.1:8080/metrics | grep time_service                                            
# HELP time_service_calls Number of calls to the backend time service
# TYPE time_service_calls counter
time_service_calls 17.0

@bill-within
Author

I was surprised by that as well; however, this is the first application I've instrumented with Prometheus and it happens for any Counter I create, so I figured it was normal. I'm using the standard Python 3.7 image from Docker Hub (https://hub.docker.com/_/python) and the latest official Python Prometheus client off PyPI (https://pypi.org/project/prometheus_client/). What does your environment look like?
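
For reference, the _total/_created pair is reproducible with nothing but the client library; a minimal sketch, assuming a recent prometheus_client (the versions that append _total to Counter samples and, depending on the version, also emit a _created series):

# Standalone repro: recent prometheus_client versions expose a Counter as a
# *_total sample plus (in many versions) a *_created timestamp series.
from prometheus_client import Counter, generate_latest

BACKEND_CALLS = Counter('time_service_calls',
                        'Number of calls to the backend time service')
BACKEND_CALLS.inc()

print(generate_latest().decode())
# With the client version used here, the output includes lines like:
#   time_service_calls_total 1.0
#   time_service_calls_created 1.56...e+09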

@bill-within
Author

@bmoyles0117
Contributor

@bill-within thanks for the reference, it helped! It seems that with Python 3.7, the Prometheus client libraries are using a newly agreed, but still undocumented, standard for capturing counters along with their original creation time. We're not handling this case in the Prometheus sidecar, which results in these metrics being dropped. I'm working on a fix now and will hopefully have it out by the end of the week.

@bmoyles0117
Contributor

The patch has been merged into master; @StevenYCChou is the release shepherd for this week's release. I believe you should have an image that you can use by the end of the week. Once we confirm that you're receiving your counter metrics, we can close this issue.

@bill-within
Author

Excellent, thank you @bmoyles0117!

@StevenYCChou
Contributor

Hi @bill-within, I have released a new version; please take a look:
GitHub release: https://github.com/Stackdriver/stackdriver-prometheus-sidecar/releases/tag/0.5.0
Container image: gcr.io/stackdriver-prometheus/stackdriver-prometheus-sidecar:0.5.0
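
For the Deployment above, picking up the fix is then just a matter of pointing the sidecar container at the new tag; a minimal sketch, using the image path from the release note (or your own gcr.io/xxx mirror of it):

      - name: sidecar
        # 0.5.0 includes the fix for counters exposed as *_total/*_created
        image: gcr.io/stackdriver-prometheus/stackdriver-prometheus-sidecar:0.5.0
        imagePullPolicy: Always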

@bill-within
Author

Looks perfect 👍 thank you @StevenYCChou!

@bill-within
Author

@bmoyles0117 I didn't feel like opening a new ticket for this question was warranted, but please forgive me if this isn't the appropriate format: is it possible to run this stackdriver-prometheus-sidecar outside of k8s? Or outside of Docker entirely?

I'm going to be scraping metrics from some mixed environments (GCE, EC2, bare metal) and I'd like to continue using the :9090/metrics -> prom -> SD pattern I've grown accustomed to; however, these new Prometheus instances won't be running in k8s.
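
As a sketch of what that might look like, reusing only the flags from the Deployment args above: the sidecar is a standalone Go binary, so running it as a plain process next to a non-k8s Prometheus seems plausible, but the kubernetes.* flags would presumably need non-GKE equivalents or placeholder values; none of this is confirmed in this thread.

./stackdriver-prometheus-sidecar \
  --stackdriver.project-id=xxx \
  --prometheus.wal-directory=/prometheus/wal \
  --stackdriver.kubernetes.location=us-west2 \
  --stackdriver.kubernetes.cluster-name=k8s-dev-1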
