-
Notifications
You must be signed in to change notification settings - Fork 236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DRAFT] move telemetry and expiring registry to separate packages #291
Conversation
b0784f6
to
2efe64a
Compare
2efe64a
to
b528b4b
Compare
Signed-off-by: glightfoot <[email protected]>
b528b4b
to
871f6ad
Compare
Can you expand on where you see this might be useful? |
Sure! In this case, it was an internal exporter checking on external resources that can appear and disappear over longer timeframes. Without the expiring registry, stale metrics were served until the exporter was restarted. I chose to adapt this instead of reinventing the wheel since it did what I needed. |
I also want to use this in the graphite exporter. it has the same needs but
a completely separately evolved implementation
…On Wed, 29 Jan 2020, 01:43 glightfoot, ***@***.***> wrote:
Sure! In this case, it was an internal exporter checking on external
resources that can appear and disappear over longer timeframes. Without the
expiring registry, stale metrics were served until the exporter was
restarted. I chose to adapt this instead of reinventing the wheel since it
did what I needed.
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#291?email_source=notifications&email_token=AABAEBQ6UVJ5HU7VXZDLEB3RADGJTA5CNFSM4KMZN2B2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKFRT3I#issuecomment-579541485>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABAEBT7QXNRECEDZHTH2EDRADGJTANCNFSM4KMZN2BQ>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry for the long delay, work came down like a hammer over the last weeks and I could not find a long enough block of time to think this through.
It looks pretty good already! I have some thoughts on the entanglement with the mapper defaults, I put those inline. What do you think about keeping the expiring registry unaware of the fact that mapping with defaults is happening?
@@ -62,7 +63,7 @@ func (b *Exporter) Listen(e <-chan Events) { | |||
for { | |||
select { | |||
case <-removeStaleMetricsTicker.C: | |||
b.registry.removeStaleMetrics() | |||
b.registry.RemoveStaleMetrics() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this be part of a maintenance routine internal to the expiring registry? That way we would not have to rely on the library users to do it regularly.
@@ -159,23 +160,23 @@ func (b *Exporter) handleEvent(event Event) { | |||
|
|||
switch t { | |||
case mapper.TimerTypeHistogram: | |||
histogram, err := b.registry.getHistogram(metricName, prometheusLabels, help, mapping) | |||
histogram, err := b.registry.GetHistogram(metricName, prometheusLabels, help, mapping.Buckets, mapping.Ttl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conflict warning: this will change slightly with #290 since we are about to move the Buckets and Quantiles into a HistogramOptions/SummaryOptions sub-struct. It will be easy to adapt though.
buckets = mapping.Buckets | ||
r.metricsCount.WithLabelValues("histogram").Inc() | ||
if buckets == nil || len(buckets) == 0 { | ||
buckets = r.defaults.Buckets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmmm so this is where the mapping and the expiration got very intertwined. What would the mapper need to return to make it unnecessary to handle defaults here? Would it make sense if the mapper returns a complete prometheus.HistogramOpts
(and prometheus.SummaryOpts
if timer type is summary) every time?
Conceptually, I think expiring registry needs to know about Prometheus domain objects (metric names and types, labels and values, help, summary/histogram options) and time (so that it can expire the former). It should not need to care how they came to into being. Consequently, ideally the expiringregistry
package should not have a dependency on the mapper
package.
Unfortunately we could not finish this PR in time, and it was overtaken by #298. The registry is now available at I am still interested in cleaning up the interfaces and responsibilities between the packages. I am going to close this PR since it is hopelessly merge-conflicted. If you (or anyone) has ideas how to improve the packages, please do file issues and PRs! |
This moves the expiring registry to a separate package so that it can be reused in other projects where an expiring registry might be useful. In addition, it moves telemetry out of the main package and into telemetry since it was low hanging fruit (I can remove this part if it's too much for one PR).
I hacked this up pretty quickly from an internal adaptation of the registry that only used counters and gauges and removed a lot of the defaults, so I am not sure if I got the summary and histogram logic correct when changing from passing in the mapping to a more general API, so double-checking would be greatly appreciated! Additionally, I added a mutex for the expiring registry so it can be used concurrently. The internal adaptation it is based on seems to run fine and the race detector doesn't complain, but I'm not super familiar with the code so I might have missed a lock somewhere. As used in the statsd_exporter, the lock is not necessary, which might add a slight performance hit.
Just looking for some feedback on the general approach and the API for the expiring registry. If all looks good, I'll work on improving the documentation and adding more tests for the new package.