Rasterize shape during geotilegrid computation #49065

Merged
talevy merged 14 commits into elastic:geoshape-doc-values from gdv-rasterize-geotile on Nov 22, 2019

Contributor

@talevy talevy commented Nov 14, 2019

This PR mainly modifies the existing GeoTileGridTiler to rasterize
the GeometryTree instead of iterating through all the tiles found in the
bounding box of the shape.

This PR also fixes a bug where containsFully was not being calculated correctly.

relates #37206.
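
For readers skimming the thread, here is a minimal sketch of the rasterization idea described above, not the actual GeoTileGridTiler code. It assumes a toy `Box` shape in the unit square instead of a real GeometryTree, a hypothetical `Relation` enum, and that WITHIN means "the tile lies fully inside the shape". All names in it are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

final class RasterizeSketch {
    /** Hypothetical outcomes of comparing one tile with the shape. */
    enum Relation { DISJOINT, CROSSES, WITHIN }

    /** Axis-aligned box in the unit square; a toy stand-in for both the shape and a tile extent. */
    static final class Box {
        final double minX, minY, maxX, maxY;

        Box(double minX, double minY, double maxX, double maxY) {
            this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
        }

        /** Relation of a tile to this shape; tiles that only touch the border count as disjoint. */
        Relation relate(Box tile) {
            if (tile.maxX <= minX || tile.minX >= maxX || tile.maxY <= minY || tile.minY >= maxY) {
                return Relation.DISJOINT;
            }
            if (tile.minX >= minX && tile.maxX <= maxX && tile.minY >= minY && tile.maxY <= maxY) {
                return Relation.WITHIN;
            }
            return Relation.CROSSES;
        }
    }

    /** Extent of tile (x, y) at zoom z, where zoom 0 is the whole unit square. */
    static Box tile(int x, int y, int z) {
        double size = 1.0 / (1 << z);
        return new Box(x * size, y * size, (x + 1) * size, (y + 1) * size);
    }

    /** Collects the "z/x/y" keys of all tiles at targetZ that overlap the shape. */
    static List<String> rasterize(Box shape, int targetZ) {
        List<String> out = new ArrayList<>();
        rasterize(shape, 0, 0, 0, targetZ, out);
        return out;
    }

    private static void rasterize(Box shape, int x, int y, int z, int targetZ, List<String> out) {
        Relation relation = shape.relate(tile(x, y, z));
        if (relation == Relation.DISJOINT) {
            return; // prune: no descendant of this tile can overlap the shape
        }
        if (z == targetZ) {
            out.add(z + "/" + x + "/" + y);
            return;
        }
        if (relation == Relation.WITHIN) {
            addAllDescendants(x, y, z, targetZ, out); // no further shape checks needed
            return;
        }
        for (int dx = 0; dx < 2; dx++) {
            for (int dy = 0; dy < 2; dy++) {
                rasterize(shape, 2 * x + dx, 2 * y + dy, z + 1, targetZ, out);
            }
        }
    }

    /** Adds every tile at targetZ that lies under tile (x, y, z). */
    private static void addAllDescendants(int x, int y, int z, int targetZ, List<String> out) {
        int span = 1 << (targetZ - z);
        for (int i = 0; i < span; i++) {
            for (int j = 0; j < span; j++) {
                out.add(targetZ + "/" + (x * span + i) + "/" + (y * span + j));
            }
        }
    }
}
```

In this sketch, a shape covering one quadrant at zoom 2 triggers four `relate` calls at zoom 1, three of which prune their subtree immediately, whereas a bounding-box scan would test candidate tiles one by one at the target zoom.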

@talevy talevy added the :Analytics/Geo Indexing, search aggregations of geo points and shapes label Nov 14, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Geo)

@iverase
Contributor

iverase commented Nov 14, 2019

Thanks @talevy! One thing I was expecting is just one method that returns the relationship between the docValue and the Extent, something like:

Relation relate(Extent extent);

Instead you are using two methods, one for within and one for intersects, is there any technical reason for doing that?

@talevy
Contributor Author

talevy commented Nov 14, 2019

I started with this route just to make sure the logic was right since conflating both intersects and within in the same method would have made things rather complicated.

Two technical reasons for splitting:

  • simpler code
  • intersects can exit early while within requires a full tree traversal.

Main reason to join the two:

  • The GeoTileGridTiler#setValuesForCell code first calls within and then falls back to intersects. Combining both into a relate would make this go faster since right now there is unnecessary duplicate de-serialization and traversal.

Since the only callers of these methods are the tilers, then maybe combining makes sense. I will try it out and see how it affects performance 👍
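
To make the duplicate-traversal point concrete, here is a rough sketch under stated assumptions: `CountingShape` is a hypothetical toy (a plain rectangle) that counts a "traversal" on every call, and WITHIN is taken to mean "the cell lies fully inside the shape". It does not model the early exit that a real intersects can take; it only compares the two call patterns.

```java
final class RelateCallSketch {
    /** Hypothetical three-way result of comparing a cell with a shape. */
    enum Relation { DISJOINT, CROSSES, WITHIN }

    /** Hypothetical closed axis-aligned box with integer coordinates. */
    static final class Extent {
        final int minX, minY, maxX, maxY;

        Extent(int minX, int minY, int maxX, int maxY) {
            this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
        }
    }

    /** Toy shape that counts how often it is "deserialized and traversed". */
    static final class CountingShape {
        private final Extent shape;
        int traversals = 0;

        CountingShape(Extent shape) {
            this.shape = shape;
        }

        private Relation traverse(Extent cell) {
            traversals++; // stands in for one full de-serialization plus tree walk
            if (cell.maxX < shape.minX || cell.minX > shape.maxX
                    || cell.maxY < shape.minY || cell.minY > shape.maxY) {
                return Relation.DISJOINT;
            }
            if (cell.minX >= shape.minX && cell.maxX <= shape.maxX
                    && cell.minY >= shape.minY && cell.maxY <= shape.maxY) {
                return Relation.WITHIN;
            }
            return Relation.CROSSES;
        }

        boolean within(Extent cell) { return traverse(cell) == Relation.WITHIN; }

        boolean intersects(Extent cell) { return traverse(cell) != Relation.DISJOINT; }

        Relation relate(Extent cell) { return traverse(cell); }
    }

    /** Call pattern described above: try within first, then fall back to intersects. */
    static Relation classifyWithTwoCalls(CountingShape shape, Extent cell) {
        if (shape.within(cell)) {
            return Relation.WITHIN;
        }
        return shape.intersects(cell) ? Relation.CROSSES : Relation.DISJOINT;
    }

    /** Combined pattern: a single relate call answers both questions. */
    static Relation classifyWithRelate(CountingShape shape, Extent cell) {
        return shape.relate(cell);
    }
}
```

For any cell that is not fully inside the shape, the two-call pattern walks the shape twice while the combined call walks it once; whether that outweighs a cheaper early-exit intersects is what the performance check would need to show.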

@talevy
Contributor Author

talevy commented Nov 14, 2019

One example of where it may be nice to have separate within/intersects is with the planned hexgrid aggregation where within is likely never to be asked:

example photo demonstrating how the sub-tiles are not fully within the parent tile: http://1fykyq3mdn5r21tpna3wkdyi-wpengine.netdna-ssl.com/wp-content/uploads/2018/06/image6-1.png

@talevy talevy force-pushed the gdv-rasterize-geotile branch from 79a64c7 to e3844f3 Compare November 14, 2019 19:41
@iverase
Contributor

iverase commented Nov 14, 2019

This approach will not work for a hex grid, because it requires a parent cell to contain all of its child cells. With hexagons, you can have a shape that is disjoint from the parent cell but intersects a child cell.

@talevy
Copy link
Contributor Author

talevy commented Nov 14, 2019

Right, there is no concept of "within" in that case. I am not bringing it up in the context of rasterization, just in the context of keeping intersects and within separate.

@talevy talevy closed this Nov 14, 2019
@talevy talevy reopened this Nov 14, 2019
@talevy talevy force-pushed the gdv-rasterize-geotile branch from c1ef005 to 1328c11 Compare November 22, 2019 01:42
@talevy talevy requested a review from iverase November 22, 2019 01:42
Contributor

@iverase iverase left a comment

I added a note on where we can improve things slightly, but it is not needed for this PR. I think this is a massive improvement, so LGTM.

} else {
return crosses(extent);
@Override
public GeoRelation relate(Extent extent) throws IOException {
Contributor

I think we can move the extent check here. It is not needed for this PR, but I think it is worth noting.

Contributor Author

I agree. Consolidating to one relate method reduces the number of places where there is an extent check. I will follow up.
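
A rough illustration of what moving the extent check into `relate` could look like, not the code from this PR. It assumes a hypothetical `Disk` shape whose exact test stands in for the expensive tree traversal, a hypothetical `Extent` box, and that WITHIN means "the cell lies fully inside the shape".

```java
final class ExtentCheckSketch {
    /** Hypothetical three-way result of comparing a cell with a shape. */
    enum Relation { DISJOINT, CROSSES, WITHIN }

    /** Hypothetical closed axis-aligned box with integer coordinates. */
    static final class Extent {
        final int minX, minY, maxX, maxY;

        Extent(int minX, int minY, int maxX, int maxY) {
            this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
        }

        boolean disjointFrom(Extent other) {
            return maxX < other.minX || minX > other.maxX || maxY < other.minY || minY > other.maxY;
        }
    }

    /** Toy shape: a disk. Its exact test stands in for the expensive tree traversal. */
    static final class Disk {
        final int cx, cy, r;
        final Extent bounds;
        int exactTests = 0;

        Disk(int cx, int cy, int r) {
            this.cx = cx; this.cy = cy; this.r = r;
            this.bounds = new Extent(cx - r, cy - r, cx + r, cy + r);
        }

        Relation relate(Extent cell) {
            // Cheap extent check, done once here instead of at every call site.
            if (bounds.disjointFrom(cell)) {
                return Relation.DISJOINT;
            }
            exactTests++;
            long r2 = (long) r * r;
            // Offset from the centre to the closest point of the cell.
            long nx = Math.max(cell.minX, Math.min(cx, cell.maxX)) - cx;
            long ny = Math.max(cell.minY, Math.min(cy, cell.maxY)) - cy;
            if (nx * nx + ny * ny > r2) {
                return Relation.DISJOINT; // inside the bounding box but outside the disk
            }
            // Offset from the centre to the farthest corner of the cell.
            long fx = Math.max(Math.abs((long) cell.minX - cx), Math.abs((long) cell.maxX - cx));
            long fy = Math.max(Math.abs((long) cell.minY - cy), Math.abs((long) cell.maxY - cy));
            return fx * fx + fy * fy <= r2 ? Relation.WITHIN : Relation.CROSSES;
        }
    }
}
```

With the check inside `relate`, a caller such as a tiler never needs its own bounding-box guard, and cells far from the shape return DISJOINT without touching the exact test.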

@talevy talevy merged commit 5fed23d into elastic:geoshape-doc-values Nov 22, 2019
@talevy talevy deleted the gdv-rasterize-geotile branch November 22, 2019 15:27
talevy added a commit that referenced this pull request Nov 22, 2019
This PR mainly modifies the existing GeoTileGridTiler to rasterize
the GeometryTree instead of iterating through all the tiles found in the 
bounding box of the shape.

This PR also fixes a bug where containsFully was not being calculated correctly and simplifies all the relating logic to one `relate` method.

relates #37206.