Client side aggregation #134
… of returning false.
…:DataDog/dogstatsd-csharp-client into olivierg/client-side-aggregation # Conflicts: # tests/StatsdClient.Tests/StatsdBuilderTests.cs
Looks great. I have a question, though: I believe this approach requires incoming metrics in order to flush (i.e., we decide whether to flush when we aggregate, by checking the elapsed time). If the inflow of metrics stops, we would not flush what had been aggregated up to that point. I believe that in both the Go and Java clients we have a timer task that flushes after a certain period of time. Otherwise, just some minor questions. Nice! 👌
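To make the concern concrete, here is a minimal sketch of a timer-driven flush, assuming the aggregator exposes some thread-safe flush action. `PeriodicFlusher` and its parameters are hypothetical names for illustration and are not taken from this PR or from the Go and Java clients.

```csharp
using System;
using System.Threading;

// Hypothetical helper: invokes a flush action on a fixed period,
// so aggregated values are sent even when no new metrics arrive.
public sealed class PeriodicFlusher : IDisposable
{
    private readonly Timer _timer;

    public PeriodicFlusher(Action flush, TimeSpan interval)
    {
        if (flush == null)
        {
            throw new ArgumentNullException(nameof(flush));
        }

        // The callback runs on a thread-pool thread, so `flush`
        // must be safe to call concurrently with metric submission.
        _timer = new Timer(_ => flush(), null, interval, interval);
    }

    // Stops scheduling further flushes.
    public void Dispose() => _timer.Dispose();
}
```

With something like this in place, the elapsed-time check during aggregation becomes an optimization rather than the only flush trigger.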
{
    if (force
        || _stopWatch.ElapsedMilliseconds > _flushIntervalMilliseconds
        || _values.Count >= _maxUniqueStatsBeforeFlush)
Is this necessary? I presume the goal here is to curb memory usage. We didn't implement anything like this in the java client, but maybe we should.
It is very easy to implement and it limits memory usage, so I think it is worth implementing.
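As a rough illustration of how the unique-stats bound caps memory, the sketch below applies the same three-way flush condition to a count aggregator. `BoundedCountAggregator`, the string key, and the flush callback are assumptions made for this example; the PR's actual classes may be structured differently, and this sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical aggregator: flushes when forced, when the interval has elapsed,
// or when the number of distinct keys reaches the configured bound.
public sealed class BoundedCountAggregator
{
    private readonly Dictionary<string, long> _values = new Dictionary<string, long>();
    private readonly Stopwatch _stopWatch = Stopwatch.StartNew();
    private readonly long _flushIntervalMilliseconds;
    private readonly int _maxUniqueStatsBeforeFlush;
    private readonly Action<IReadOnlyDictionary<string, long>> _onFlush;

    public BoundedCountAggregator(
        long flushIntervalMilliseconds,
        int maxUniqueStatsBeforeFlush,
        Action<IReadOnlyDictionary<string, long>> onFlush)
    {
        _flushIntervalMilliseconds = flushIntervalMilliseconds;
        _maxUniqueStatsBeforeFlush = maxUniqueStatsBeforeFlush;
        _onFlush = onFlush ?? throw new ArgumentNullException(nameof(onFlush));
    }

    public void Add(string key, long value)
    {
        // A missing key leaves `current` at 0, so the first value is stored as-is.
        _values.TryGetValue(key, out long current);
        _values[key] = current + value;
        TryFlush(force: false);
    }

    public void TryFlush(bool force)
    {
        if (force
            || _stopWatch.ElapsedMilliseconds > _flushIntervalMilliseconds
            || _values.Count >= _maxUniqueStatsBeforeFlush)
        {
            if (_values.Count > 0)
            {
                // Hand over a copy so the internal dictionary can be reused.
                _onFlush(new Dictionary<string, long>(_values));
            }

            _values.Clear();
            _stopWatch.Restart();
        }
    }
}
```

Because the dictionary is cleared as soon as it holds the maximum number of distinct keys, its size never exceeds that bound, regardless of how many distinct metrics the application emits between timed flushes.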
// This code was auto generated
public override int GetHashCode()
{
    int hashCode = -335110880;
I'm not sure if it makes sense here, as I still have to review the rest of the PR (😊), but I'm wondering whether we should cache the hashCode, the way we've done in the Java dogstatsd client, in case this is called multiple times, to avoid repeated hashing operations.
This function should be called only once per object. I ran a profiler and this function appeared in the hot path, so I am keeping an eye on it!
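For reference, caching the hash code could look roughly like the sketch below, which computes it once in the constructor of an immutable key. `MetricKey` and its two string fields are hypothetical and only stand in for whatever the PR's key type holds; whether caching pays off depends on how often `GetHashCode` is really called per object.

```csharp
using System;

// Hypothetical immutable key whose hash code is computed once, at construction.
public sealed class MetricKey : IEquatable<MetricKey>
{
    private readonly string _name;
    private readonly string _tags;
    private readonly int _hashCode;

    public MetricKey(string name, string tags)
    {
        _name = name ?? throw new ArgumentNullException(nameof(name));
        _tags = tags ?? string.Empty;

        // Integer overflow is expected and harmless when combining hashes.
        unchecked
        {
            int h = 17;
            h = (h * 31) + _name.GetHashCode();
            h = (h * 31) + _tags.GetHashCode();
            _hashCode = h;
        }
    }

    // Returns the cached value; no strings are re-hashed here.
    public override int GetHashCode() => _hashCode;

    public bool Equals(MetricKey other)
    {
        return other != null
            && _hashCode == other._hashCode
            && _name == other._name
            && _tags == other._tags;
    }

    public override bool Equals(object obj) => Equals(obj as MetricKey);
}
```

The trade-off is one extra `int` per key in exchange for constant-time repeated lookups, which only helps if the same key instance is hashed more than once.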
public SetAggregator(MetricAggregatorParameters parameters, Telemetry optionalTelemetry)
{
    _aggregator = new AggregatorFlusher<StatsMetricSet>(parameters, MetricType.Set);
    _pool = new Pool<StatsMetricSet>(pool => new StatsMetricSet(pool), 2 * parameters.MaxUniqueStatsBeforeFlush);
Why pre-allocate sets and not the other metric types (this is a rarer metric when compared to counters and gauges)? That said, it's a good performance decision, but I'd like to understand a bit more.
CountAggregator does not allocate memory, as it sums integers. SetAggregator allocates a HashSet<string> per metric. I tried to avoid memory allocations. As I already have a Pool, it does not introduce too much complexity. What do you think?
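The pooling idea can be sketched as follows: instead of allocating a new `HashSet<string>` for every set metric after each flush, instances are cleared and reused. `SimplePool<T>` is a hypothetical, simplified stand-in and does not mirror the `Pool<T>` class used in this PR (which, judging by the constructor call above, passes the pool itself to the factory).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical bounded object pool: reuses returned instances instead of allocating new ones.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> _items = new Stack<T>();
    private readonly object _lock = new object();
    private readonly Func<T> _factory;
    private readonly Action<T> _reset;
    private readonly int _maxRetained;

    public SimplePool(Func<T> factory, Action<T> reset, int maxRetained)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
        _reset = reset ?? throw new ArgumentNullException(nameof(reset));
        _maxRetained = maxRetained;
    }

    // Returns a pooled instance when one is available, otherwise allocates.
    public T Rent()
    {
        lock (_lock)
        {
            if (_items.Count > 0)
            {
                return _items.Pop();
            }
        }

        return _factory();
    }

    // Resets the instance and keeps it for reuse unless the pool is full.
    public void Return(T item)
    {
        _reset(item);
        lock (_lock)
        {
            if (_items.Count < _maxRetained)
            {
                _items.Push(item);
            }
        }
    }
}
```

A set aggregator could then rent a `HashSet<string>` the first time it sees a metric and return it after the flush, for example with `new SimplePool<HashSet<string>>(() => new HashSet<string>(), s => s.Clear(), 2 * maxUniqueStats)`, where `maxUniqueStats` is a placeholder for the configured bound.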
Add client-side aggregation to improve performance:
- Count: sum the values for the same metric name and tags
- Gauge: keep the last value for the same metric name and tags
- Set: keep the unique values for the same metric name and tags
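The three rules above can be sketched in a few lines. This is an illustrative model only: `AggregationSketch`, its dictionaries, and the string key (assumed to encode the metric name and tags together) are hypothetical and are not the types introduced by this PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the three aggregation rules, keyed by "name + tags".
public sealed class AggregationSketch
{
    public Dictionary<string, long> Counts { get; } = new Dictionary<string, long>();

    public Dictionary<string, double> Gauges { get; } = new Dictionary<string, double>();

    public Dictionary<string, HashSet<string>> Sets { get; } = new Dictionary<string, HashSet<string>>();

    // Count: sum the values submitted for the same key.
    public void Count(string key, long value)
    {
        Counts.TryGetValue(key, out long current);
        Counts[key] = current + value;
    }

    // Gauge: the most recent value overwrites earlier ones.
    public void Gauge(string key, double value)
    {
        Gauges[key] = value;
    }

    // Set: duplicates are dropped, only distinct values are kept.
    public void Set(string key, string value)
    {
        if (!Sets.TryGetValue(key, out HashSet<string> values))
        {
            values = new HashSet<string>();
            Sets[key] = values;
        }

        values.Add(value);
    }
}
```

For example, submitting a count of 1 three times for the same key yields a single aggregated count of 3 at flush time, which is where the reduction in packets sent comes from.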