-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rate Limiting sampler for .NET #1996
Rate Limiting sampler for .NET #1996
Conversation
redo of #1967 |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1996 +/- ##
===========================================
+ Coverage 73.91% 88.57% +14.65%
===========================================
Files 267 10 -257
Lines 9615 175 -9440
===========================================
- Hits 7107 155 -6952
+ Misses 2508 20 -2488
Flags with carried forward coverage won't be shown. Click here to find out more.
|
/// </returns> | ||
public override SamplingResult ShouldSample(in SamplingParameters samplingParameters) | ||
{ | ||
return this.rateLimiter.TrySpend(1.0) ? this.onSamplingResult : this.offSamplingResult; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow the algorithm.. Why pass 1.0 always here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The underlying algorithm in the RateLimiter has an item cost for the relative cost of each item. We treat traces as all having the same cost, hence 1.0 (as the rate limiter takes doubles).
The algorithm is to essentially convert the cost into ticks (cost/max_samples_sec) and then see if there is a balance available to spend. If so the value is deducted and the decision is to keep that trace, if not it rejects it and the balance continues to grow.
The max_balance is to account for bursty traffic. If you have 3s of no traffic, the balance would build up, it gets capped at the allowance for 1s worth of items.
/// <param name="maxTracesPerSecond">The maximum number of traces that will be emitted each second.</param> | ||
public RateLimitingSampler(int maxTracesPerSecond) | ||
{ | ||
double maxBalance = maxTracesPerSecond < 1.0 ? 1.0 : maxTracesPerSecond; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maxTracesPerSecond < 1.0 -- when would this happen ever?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the input arg is an int, it's possible someone could pass in a value < 1. It's not gonna work though, so defaulting to 1 makes sense here.
In this scenario, I think we should log a warning to indicate the passed in value has been overriden.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see this is a direct copy of what's in Java and the rudimentary tests show it to work (on windows), but I don't understand the logic to know how this is meant to work.
Also, we'll need to figure out why the tests are failing on ubuntu.
/// <param name="maxTracesPerSecond">The maximum number of traces that will be emitted each second.</param> | ||
public RateLimitingSampler(int maxTracesPerSecond) | ||
{ | ||
double maxBalance = maxTracesPerSecond < 1.0 ? 1.0 : maxTracesPerSecond; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the input arg is an int, it's possible someone could pass in a value < 1. It's not gonna work though, so defaulting to 1 makes sense here.
In this scenario, I think we should log a warning to indicate the passed in value has been overriden.
first N will always be sampled in. Accounting for that makes the results exactly match what was expected, but keeping a small fudge factor just incase, because tests that rely on timing will always fail at the most inopportune moments.
I did some more experimentation, including running in WSL (Ubuntu). I had not accounted for the initial balance in the math in the test - with that, the first N asks will be sampled in as they chew through that balance. With this accounted for the expected value and actual are exact in my personal experiments, but I want to keep a small fudge factor as tests that rely on timing don't always work out the way you expect them to. |
@cijothomas the ubuntu run should pass tests now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks for fixing linux tests 👍🏻
@cijothomas - poke 😸 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - left some non-blocking comments.
Co-authored-by: Cijo Thomas <[email protected]>
Port of https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/jaeger-remote-sampler/src/main/java/io/opentelemetry/sdk/extension/trace/jaeger/sampler/RateLimitingSampler.java to .NET
The rate limiting sampler is a sampler that will limit the number of traces to
the specified rate per second. It is typically used in conjunction with the ParentBasedSampler
to ensure that the rate limiting sampler is only applied to the root spans. If
a request comes in without a sampling decision, the rate limiting sampler will
make a decision based on the rate limit. The sampling decision is stored in the
activity context, via the Recorded property, and is passed along with calls to
other services called with HttpClient so the same decision is used for the entire
trace.
Example of RateLimitingSampler usage:
Changes
Please provide a brief description of the changes here.
Merge requirement checklist
CHANGELOG.md
files updated for non-trivial changes