
Match metric type not working as documented #228

Closed
andy-paine opened this issue Jun 5, 2019 · 3 comments · Fixed by #229

@andy-paine (Contributor)

Expected Behavior

2 statsd metrics with the same name but different types (e.g. counter and timer) can be disambiguated with match_metric_type.

Current Behavior

Statsd exporter reports that the metric is already registered when sending a second metric with the same name but different type.

Steps to Reproduce

  1. Use the following mapping:

     ```yaml
     mappings:
     - match: test.foo.*
       name: "test_foo"
       match_metric_type: counter
       labels:
         provider: "$1"
     - match: test.foo.*
       name: "test_foo_timer"
       match_metric_type: timer
       labels:
         provider: "$1"
     ```

  2. Run statsd exporter (I used Docker) with the mapping and debug logging enabled.
  3. `echo "test.foo.example:0.003666|ms" | nc -w 1 -u 0.0.0.0 9125`
  4. `echo "test.foo.example:1|c" | nc -w 1 -u 0.0.0.0 9125`
  5. Step 4 produces the following log line, and no metric of type counter is visible in `/metrics`:

     ```
     time="2019-06-05T12:45:07Z" level=debug msg="Failed to update metric \"test_foo_timer\". Error: metric with name test_foo_timer is already registered" source="exporter.go:500"
     ```

Context (Environment)

We are currently emitting metrics for some tasks as both a timer and a counter, so we can count the number of tasks and time each one. To avoid having to change all of this metrics code to use unique names for each type, it would be useful if statsd exporter could work out the type from the statsd packet, as the docs for `match_metric_type` describe.

I've tried dumping the FSM dot file, but I'm not hugely familiar with Go or this codebase, so I'm not sure exactly what I'm looking at. The glob expression I am trying to match is repeated, but I thought I could disambiguate it using `match_metric_type`. The FSM dot diagram shows the glob pattern registered on both the counter and the timer branches, yet the actual behaviour suggests that the statsd metric name is matched against a glob pattern before the type is evaluated. Additionally, the FSM warns at startup with `time="2019-06-05T12:44:55Z" level=warning msg="match \"test.foo.*\" is a super set of match \"test.foo.*\" but in a lower order, the first will never be matched" source="fsm.go:294"`, even though the type should disambiguate the two mappings.

Please let me know if you need any more details.

@andy-paine (Contributor, Author) commented Jun 5, 2019

I've done some investigating and think I've narrowed the issue down to the cache.

When checking whether a metric is cached, `mapper.go` only checks the metric name against the cache, even though metrics can have the same name but differ in type.

A temporary workaround is to set `--statsd.cache-size=0`, if the performance hit isn't too bad. I've also managed to get it working locally by making the key that the cache looks up be `fmt.Sprintf("%s.%s", statsdMetricType, statsdMetric)`, which is a similar pattern to how Graphite lets you search for your different metrics (e.g. `timers.foo.bar`).
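To illustrate the idea (a minimal sketch, not the exporter's actual code — `metricKey` and the map-based cache here are hypothetical stand-ins): keying the cache on type plus name keeps the timer and the counter distinct even when their statsd names are identical.

```go
package main

import "fmt"

// metricKey builds a cache key that includes the statsd metric type, so
// "test.foo.example" sent as a counter and as a timer map to different
// cache entries instead of colliding on the bare name.
func metricKey(metricType, metricName string) string {
	return fmt.Sprintf("%s.%s", metricType, metricName)
}

func main() {
	cache := map[string]string{}
	// With a name-only key the second assignment would overwrite the
	// first; the composite key keeps both mappings.
	cache[metricKey("timer", "test.foo.example")] = "test_foo_timer"
	cache[metricKey("counter", "test.foo.example")] = "test_foo"
	fmt.Println(len(cache)) // 2 distinct entries
}
```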

How lookups in the cache are done feels worth a discussion, but I'm happy to raise a PR with my changes if the above approach seems reasonable.

@SpencerMalone any thoughts?

@matthiasr (Contributor)

Thank you for the thorough investigation! Yes, I agree that keying on the metric type is necessary. I would prefer passing the metric type into the cache `Get` method. I think it's okay to change the signature; as far as I know it isn't used externally. That way we don't risk different call sites formatting the cache key differently. Please do send a PR!
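As a rough sketch of that signature change (the type and method names below are illustrative, not the exporter's real API): the cache builds the composite key internally, so call sites never format it themselves.

```go
package main

import "fmt"

// MetricMapperCache is a toy stand-in for the exporter's cache type.
// Get and Put take the metric type and derive the composite key
// internally, so no call site can format the key differently.
type MetricMapperCache struct {
	entries map[string]string
}

func NewCache() *MetricMapperCache {
	return &MetricMapperCache{entries: map[string]string{}}
}

func (c *MetricMapperCache) Put(metricType, metricName, mapped string) {
	c.entries[metricType+"."+metricName] = mapped
}

func (c *MetricMapperCache) Get(metricType, metricName string) (string, bool) {
	v, ok := c.entries[metricType+"."+metricName]
	return v, ok
}

func main() {
	c := NewCache()
	c.Put("timer", "test.foo.example", "test_foo_timer")
	_, hit := c.Get("counter", "test.foo.example")
	fmt.Println(hit) // false: the timer entry does not shadow the counter lookup
}
```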

@SpencerMalone (Contributor)

I agree, this is a great catch! I hadn't been aware that this was a feature. Keying on the type definitely seems like it'll work.
