-
Notifications
You must be signed in to change notification settings - Fork 31
[dual sampling] using both score and priority sampler for priority 0 traces #345
Conversation
…traces In the case of priority 0 traces, send traces to both samplers. This gives a second chance to priority 0 traces to be picked up by the score sampler. Also adds stats for -1 sampling priority traces.
26bb8b1
to
c07dd27
Compare
4f87c0f
to
48a5831
Compare
@@ -324,8 +326,11 @@ APM_CONF: | |||
if v, e := conf.GetFloat("trace.sampler", "max_traces_per_second"); e == nil { | |||
c.MaxTPS = v | |||
} | |||
if v := strings.ToLower(conf.GetDefault("trace.sampler", "priority_sampling", "")); v == "yes" || v == "true" { | |||
c.PrioritySampling = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably never worked, the default was true
, and we were overriding it to true
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Useless after all ;)
agent/agent.go
Outdated
} else { | ||
samplers = []*Sampler{a.ScoreEngine} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following looks cleaner IMO (everything related to priority grouped together and ok -> hasPriority):
samplers := make([]*Sampler, 0, 2)
priority, hasPriority := root.Metrics[samplingPriorityKey]
if hasPriority && a.PriorityEngine != nil {
samplers = append(samplers, a.PriorityEngine)
if a.conf.ScorePriority0Traces && priority == 0 {
samplers = append(samplers, a.ScoreEngine)
}
} else {
samplers = append(samplers, a.ScoreEngine)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized that to be a 100% match, the a.PriorityEngine should go to a separate if
samplers := make([]*Sampler, 0, 2)
priority, hasPriority := root.Metrics[samplingPriorityKey]
if hasPriority {
if a.PriorityEngine != nil {
samplers = append(samplers, a.PriorityEngine)
}
if a.conf.ScorePriority0Traces && priority == 0 {
samplers = append(samplers, a.ScoreEngine)
}
} else {
samplers = append(samplers, a.ScoreEngine)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm actually none of them are equivalent to yours now that I look closer :/ Maybe there's no way around it :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I also spent some time on it initially, the "problem" is that in one case we do an if ... if ...
and in the other one an if ... else ...
. Thanks for the thorough review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As suggested in the config.go comment, let's just have both mechanisms always on, it should greatly simplify the code here.
@@ -324,8 +326,11 @@ APM_CONF: | |||
if v, e := conf.GetFloat("trace.sampler", "max_traces_per_second"); e == nil { | |||
c.MaxTPS = v | |||
} | |||
if v := strings.ToLower(conf.GetDefault("trace.sampler", "priority_sampling", "")); v == "yes" || v == "true" { | |||
c.PrioritySampling = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
agent/agent.go
Outdated
if a.conf.ScorePriority0Traces { | ||
// Send traces to possibly several score engines. | ||
if ok && a.PriorityEngine != nil { | ||
// If Priority is defined, send to priority sampling, regardless of priority value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the count of "priority < 0" traces affect the applied sample rate in the client? Some apps will routinely generate traces which the user will want to ignore via the priority flag e.g. healthchecks, debug endpoints, whatever. Should the frequency of such traces still influence the sample rate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Indeed, since the client lib can only influence the 0/-1
choice, it totally makes sense it takes decisions only on those ones. But then, I'm thinking we could also exclude the +2
traces. The problem is quite symmetrical here. In any case, I think this is really sampling logic, so should probably go to sampler package, easier to unit-test and bullet-proof rather than in this "high-level" code.
Also, I'm worried about how we can explain exactly what that -1
does. Because, it can sound like it's "exclude this from tracing". When, it's really not. The stats are still computed for instance. That's the whole point of this -1
thing I think, make sure there's no trace sampled, but still trace it. I'm saying this because we had similar issues whit the "disabled tracer" mode, which did not really disabled the tracer, but only the sending of traces to the agent.
I'm going to exclude -1
and +2
flagged traces from the count anyway.
sampled := root.Metrics[samplingPriorityKey] > 0 | ||
sampled := samplingPriority > 0 | ||
|
||
if samplingPriority < 0 || samplingPriority > 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@talwai this is where I'd exclude the traces which should have no impact on sampling ratio. I had a second thought about it and really do think the priority 2
traces should be excluded too. Not doing so would a) break the ratio for other traces (keeping less of them, when this is really about "decide to have some more") and b) give a false illusion that having 1 and 2 in the same bag does provide some fail-safe rate limiting. It does not. On a service sending 100 traces per second, with a max TPS of 10, the default target ratio would be 0.1. If we select 50% of them with priority set to 2, then we have 50 traces per sec no matter what max TPS is set to. So if we actually include priority 2 traces in the count, it will result in having only the one we selected for that service, but also possibly impact other services, and in any case, there's no capping to 10 TPS. The big picture is: priority 2 traces are something on top of the regular sampling performed by the library, it's not included in it.
config/agent.go
Outdated
PreSampleRate: 1.0, | ||
MaxTPS: 10, | ||
PrioritySampling: true, | ||
ScorePriority0Traces: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the goal is to simplify and have both samplers, I'd suggest to just drop both PrioritySampling
and ScorePriority0Traces
while always having both logics working. I see no case where these configuration should be used nor exposed to the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
priorityPtr := &ts.TracesPriorityNone | ||
if a.PriorityEngine != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to check for it now, it's always defined since we removed the PrioritySampling
option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, the removal of the extra options is great and makes the code much simpler.
Combined with the clear explanation of each priority values, it was easy to review. Let's be sure now to update https://github.com/DataDog/datadog-trace-agent/wiki/Distributed-Priority-Sampling
Looking good to me.
In the case of priority 0 traces, send traces to both samplers. This gives
a second chance to priority 0 traces to be picked up by the score sampler.
Also adds stats for -1 sampling priority traces.