Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add moving percentiles pipeline aggregation #55441

Merged
merged 9 commits into from
May 12, 2020

Conversation

iverase
Copy link
Contributor

@iverase iverase commented Apr 20, 2020

Similar to what the moving function aggregation does, except merging windows of percentiles sketches together instead of cumulatively merging final metrics.

closes #49452

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments!

@nik9000, wanna give this some eyeballs too since you've been elbow-deep in pipelines lately?

if (bucketsPaths.length != 1) {
context.addBucketPathValidationError("must contain a single entry for aggregation [" + name + "]");
}
context.validateParentAggSequentiallyOrdered(NAME, name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if @nik9000 has ideas how we could validate that the pipeline targets a percentiles agg up front in the builder, instead of down below during runtime? Might not be possible yet...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be worth adding something near bucketCardinality that returns the type of buckets. That'd let us do this ordered assertion without instanceofs. I'd be happy to take a stab at it in a follow up because I have ideas. No need to delay this PR for it though.

}
}

private void reduceHDR(List<? extends InternalMultiBucketAggregation.InternalBucket> buckets,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shame there isn't a way to share this code with tdigest since it's mainly the same. Not sure it's worth trying to figure out though, unless someone has a simple/clever solution :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must say I started that way, trying to wrap the different sketches in a generic object but I gave up as it was actually much harder and at some point you need to know which method you are using.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha 👍 Yeah the more I looked at it, I came to a similar conclusion. Oh well :)

currentAggName = aggPathsList.get(0);
}

throw new AggregationExecutionException(AbstractPipelineAggregationBuilder.BUCKETS_PATH_FIELD.getPreferredName()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change this to an IllegalArgumentException, since technically it's the client's fault for pointing us at the wrong thing :)

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments about the pipeline stuff - I think this should spawn two follow up PRs, one for plumbing the parent builder to the ctor and one for better validation. I can't really speak to the percentiles stuff without doing a bunch of background reading.

@@ -96,10 +96,18 @@ public long getEstimatedMemoryFootprint() {
return state.getEstimatedFootprintInBytes();
}

DoubleHistogram getState() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you kindly add javadoc to these while you are here?

if (bucketsPaths.length != 1) {
context.addBucketPathValidationError("must contain a single entry for aggregation [" + name + "]");
}
context.validateParentAggSequentiallyOrdered(NAME, name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be worth adding something near bucketCardinality that returns the type of buckets. That'd let us do this ordered assertion without instanceofs. I'd be happy to take a stab at it in a follow up because I have ideas. No need to delay this PR for it though.

return index;
}

// TODO: replace this with the PercentilesConfig that's used by the percentiles builder.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be a fairly small change to plumb the parent builder into the ctor of the pipeline aggregator. A fine thing to do in a follow up though because it'd touch a bunch of file mechanically.

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, thought I had already finished reviewing this! LGTM, thanks for the changes :)

@iverase iverase merged commit 4e39184 into elastic:master May 12, 2020
iverase added a commit that referenced this pull request May 12, 2020
Similar to what the moving function aggregation does, except merging windows of percentiles
sketches together instead of cumulatively merging final metrics
@iverase iverase deleted the movingPercentiles branch May 12, 2020 09:35
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Jul 28, 2020
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
github-actions bot pushed a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
github-actions bot pushed a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v7.9.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow moving_function to access percentile sketches
7 participants