Add bytes from geogrid aggregation to request circuit-breaker #50720
Conversation
Since high precision values for geogrid aggregations can result in single documents tripping the `search.max_buckets` limit, it is important to catch these as soon as possible. This PR pushes a new check to the MultiBucketConsumerService to allow stateless checking of new bucket counts against the limit.
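For illustration, a minimal sketch of what a stateless limit check of this shape could look like. The class and method names here are assumptions made for the example, not the actual MultiBucketConsumerService API; the real Elasticsearch code also throws its own exception type rather than IllegalArgumentException.

```java
// Hypothetical sketch of a stateless bucket-count check; names are illustrative,
// the real check lives in MultiBucketConsumerService.
public final class BucketLimits {
    private BucketLimits() {}

    /**
     * Validates a prospective bucket count against the search.max_buckets limit
     * without touching the running per-request tally, so callers can fail fast
     * the moment a single document would blow past the limit.
     */
    public static void checkWithinLimit(long bucketCount, int maxBuckets) {
        if (maxBuckets > 0 && bucketCount > maxBuckets) {
            throw new IllegalArgumentException("Trying to create too many buckets: ["
                + bucketCount + "] is over the limit of [" + maxBuckets + "]");
        }
    }
}
```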
Pinging @elastic/es-analytics-geo (:Analytics/Geo)
run elasticsearch-ci/1
hey @polyfractal. I'd like to get some more explicit tests to verify the pre- and post-collection hooks
This turned out pretty clean! Nice work!
++ if we could get some kind of test for the pre-segment and post-collection hooks.
Do you know if there are any integration tests that use the geotile/geogrid agg? Not generally a fan of integration tests, but since our breaker testing is spotty right now (e.g. all the None breakers that AggTestCase uses), it might make sense to ensure there is at least one integration test flexing the breaker. Not necessarily trying to trip it, just using it for routine tests.
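A rough sketch of the kind of test being suggested here: run a geogrid aggregation against a real (non-noop) breaker and check that whatever it reserved is released again. The helper methods (newRealMemoryBreaker, buildAggregatorWithBreaker, collectAll, randomGeoDocs) are hypothetical stand-ins for whatever the test framework provides; getUsed() is the real CircuitBreaker accessor.

```java
// Hypothetical test shape: routine use of a real breaker, not trying to trip it.
public void testGeoGridExercisesRealBreaker() throws IOException {
    // Hypothetical helper: a real, non-noop breaker with a generous limit.
    CircuitBreaker breaker = newRealMemoryBreaker(10 * 1024 * 1024);
    GeoGridAggregator agg = buildAggregatorWithBreaker(breaker); // hypothetical helper
    collectAll(agg, randomGeoDocs(100));                         // hypothetical helper
    agg.close();
    // After post-collection and close, all reserved bytes should be given back.
    assertThat(breaker.getUsed(), equalTo(0L));
}
```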
```java
public void testGeoShapeTrippedCircuitBreaker() throws IOException {
    GeoGridTiler tiler = geoGridTiler();
    int precision = randomIntBetween(1, 9); // does not go until MAX_ZOOM for performance reasons
```
Just verifying this won't nuke CI, since it has to run at least once to determine the number of tiles? I'm assuming that's why it's limited to precision: 9, but thought I'd double check :)
yeah, it is just that I am not interested in tiling a shape into hundreds of thousands of tiles. I've run this test over 1000 iterations and things seem to run snappily
👍
```java
        int count = GEOHASH.setValues(values, value, 7);
        assertThat(count, equalTo(1024));
    }
}
```
```java
public void testGeoHashMaxBuckets() throws IOException {
```
Should we rename these tests to reference the breaker instead of max buckets?
yes, good catch!
Closed in #37206
High precision values for geogrid aggregations can result in single documents tripping the bytes limit for a request: large shapes at high precisions can produce hundreds of thousands of buckets, kept track of in a `long[]`. To protect against OOMs, it is important to account for this memory usage. Since this array is re-used across documents within each segment, the accounting is managed by the GeoGridAggregator.

relates #37206
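To make that accounting pattern concrete, a minimal sketch, assuming the aggregator holds a reference to the request breaker. The class and field names are hypothetical and this is not the actual GeoGridAggregator code, but `addEstimateBytesAndMaybeBreak` and `addWithoutBreaking` are the real `org.elasticsearch.common.breaker.CircuitBreaker` API.

```java
import org.elasticsearch.common.breaker.CircuitBreaker;
import org.elasticsearch.common.lease.Releasable;

// Illustrative sketch of breaker accounting for a growable long[] of tile values.
class TileValues implements Releasable {
    private final CircuitBreaker breaker; // the "request" circuit breaker
    private long[] values = new long[0];

    TileValues(CircuitBreaker breaker) {
        this.breaker = breaker;
    }

    void resize(int neededSize) {
        long delta = ((long) neededSize - values.length) * Long.BYTES;
        // Reserve the extra bytes before allocating, so a pathological document
        // throws CircuitBreakingException instead of OOMing the node.
        breaker.addEstimateBytesAndMaybeBreak(delta, "geogrid tile values");
        values = new long[neededSize];
    }

    @Override
    public void close() {
        // The array is re-used across documents within a segment, so the bytes
        // are released once, when the aggregator is closed after collection.
        breaker.addWithoutBreaking(-(long) values.length * Long.BYTES);
    }
}
```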