Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to mask counters to 53 bits? #350

Closed
candlerb opened this issue Oct 31, 2018 · 7 comments
Closed

Option to mask counters to 53 bits? #350

candlerb opened this issue Oct 31, 2018 · 7 comments

Comments

@candlerb
Copy link
Contributor

Prometheus stores float64 (double) values, which have a 53-bit mantissa.

For counters which exceed 253, this means there is a loss of precision when taking the difference between adjacent samples to calculate rates.

A link running at 100Gbps could reach 253 bytes in 8.3 days. After this, the accuracy will degrade, eventually to the point where counters are rounded to the nearest 211 bytes.

This is arguably a minor issue - but it could be avoided entirely if there were an option to truncate SNMP counter values to 53 bits by ANDing them with 0x001f_ffff_ffff_ffff. Such counters would then wrap more frequently, but maintain full accuracy.

Such masking must not be applied to gauges of course, so it would have to be a per-OID option.

@SuperQ
Copy link
Member

SuperQ commented Oct 31, 2018

Thanks for opening this issue. I've talked with people about doing this, I think it's a good idea. The simple method would be to do value % 2^53.

The question is, should we do this automatically for counters? Should it be a configuration option for the generator? Should it be per-OID or per module?

I could imagine something like this:

overrides:
  ifHCInOctets:
    wrap_large_values: true
  ifHCOutOctets:
    wrap_large_values: true

@SuperQ
Copy link
Member

SuperQ commented Oct 31, 2018

Another question, for users that care about the current value of the counter, how/if do we expose the number of wraps applied to the value? Do we have a metric like ifHCInOctets_wraps{...} 0

@candlerb
Copy link
Contributor Author

If the data is tagged on the wire as Counter64 (i.e. [APPLICATION 6]) then masking such values automatically would be good enough for me. Maybe have an option to disable masking (globally? per OID?) if someone wants otherwise.

I personally don't care about the number of wraps. I'd be happy to treat the counter as if it were natively 53 bits.

Maybe when we get to interfaces of 1Tbps+ then it will be an issue again.

Having the option to change the masking resolution to some intermediate value, say 56 bits or 60 bits, would allow a trade-off between resolution and wrapability.

@RichiH
Copy link
Member

RichiH commented Oct 31, 2018 via email

@SuperQ
Copy link
Member

SuperQ commented Oct 31, 2018

FYI, the simplest version of this patch is this:

diff --git a/collector.go b/collector.go
index 824931b..3bf94a0 100644
--- a/collector.go
+++ b/collector.go
@@ -252,7 +252,7 @@ PduLoop:
 func getPduValue(pdu *gosnmp.SnmpPDU) float64 {
        switch pdu.Type {
        case gosnmp.Counter64:
-               return float64(gosnmp.ToBigInt(pdu.Value).Uint64())
+               return float64(gosnmp.ToBigInt(pdu.Value).Uint64() % 2^53)
        case gosnmp.OpaqueFloat:
                return float64(pdu.Value.(float32))
        case gosnmp.OpaqueDouble:

@RichiH
Copy link
Member

RichiH commented Oct 31, 2018

LGTM.

@RichiH
Copy link
Member

RichiH commented Oct 31, 2018

Maybe with one or two sentences in the docs to make this obvious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants