-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
geo/geomfn: implement ST_MaxDistance / ST_DFullyWithin #49094
geo/geomfn: implement ST_MaxDistance / ST_DFullyWithin #49094
Conversation
41499eb
to
b51da62
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 7 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @otan and @sumeerbhola)
pkg/geo/geomfn/distance.go, line 55 at r2 (raw file):
// DFullyWithin determines if any part of geometry A is fully within D units of // geometry B.
Is this just the maximum of the pair-wise min distance?
The PostGIS doc comment is quite useless "Returns true if the geometries is fully within the specified distance of one another."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)
pkg/geo/geomfn/distance.go, line 55 at r2 (raw file):
Previously, sumeerbhola wrote…
Is this just the maximum of the pair-wise min distance?
The PostGIS doc comment is quite useless "Returns true if the geometries is fully within the specified distance of one another."
pair-wise distance. not quite "min distance", because intersection doesn't really count.
the better way to describe is "all points in geometry A is within D units of geometry B", but it can be thought of as MaxDistance(a,b) <= d.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @otan)
pkg/geo/geomfn/distance.go, line 55 at r2 (raw file):
Can you add a code comment with a clear definition.
pair-wise distance. not quite "min distance", because intersection doesn't really count.
what does "intersection doesn't really count" mean?
the better way to describe is "all points in geometry A is within D units of geometry B", but it can be thought of as MaxDistance(a,b) <= d.
There is something about the geometric structure that allows almost the same algorithm to be used in that when doing max distance we still don't need to compare the interiors. Do you have thoughts on this and is it worth writing a code comment that justifies the algorithmic correctness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)
pkg/geo/geomfn/distance.go, line 55 at r2 (raw file):
what does "intersection doesn't really count" mean?
never mind.
There is something about the geometric structure that allows almost the same algorithm to be used in that when doing max distance we still don't need to compare the interiors. Do you have thoughts on this and is it worth writing a code comment that justifies the algorithmic correctness?
I've updated the comment to maxDistanceInternal
since its algorithm related and applies to both these commands.
b51da62
to
c41b28a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 16 files at r1, 4 of 8 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @otan and @sumeerbhola)
pkg/geo/geodist/geodist.go, line 164 at r3 (raw file):
return true } if !c.DistanceUpdater().IsMaxDistance() {
I don't see why this is correct -- I thought we wanted the minimum distance even in this case, and later do the max of the min distance between shapes. If this is being used where the point is not a shape in the original slices then it may be a different calculation.
pkg/geo/geodist/geodist.go, line 192 at r3 (raw file):
// If the exterior ring does not contain the point, we just need to calculate the distance to // the exterior ring. // Also, if we are just calculating max distance, we only want the distance to the exterior.
I don't understand this. If the point is contained in the polygon, isn't it incorrect to measure its distance from the exterior. The max distance is 0, yes?
Also, it may not be contained in the polygon, but contained in a hole, in which case the max distance will be the distance to the hole.
In short, the distance calculation where one shape is a point should be the same for max and min
pkg/geo/geodist/geodist.go, line 226 at r3 (raw file):
// Compare each vertex against the edge of the other. for _, toCheck := range []struct {
I don't see why this decomposition is correct when being called from ShapeDistance()
.
I think we want
max over all i (min over all j (distance(vertex a[i], edge b[j]))
pkg/geo/geodist/geodist.go, line 265 at r3 (raw file):
// We use the first point of the linestring for this check. // If we are looking for max distance, we only need to check against the outer ring anyway. if c.DistanceUpdater().IsMaxDistance() || !c.PointInLinearRing(a.Vertex(0), b.LinearRing(0)) {
same question as with the point and polygon comparison.
pkg/geo/geomfn/distance.go, line 55 at r2 (raw file):
Previously, otan (Oliver Tan) wrote…
what does "intersection doesn't really count" mean?
never mind.
There is something about the geometric structure that allows almost the same algorithm to be used in that when doing max distance we still don't need to compare the interiors. Do you have thoughts on this and is it worth writing a code comment that justifies the algorithmic correctness?
I've updated the comment to
maxDistanceInternal
since its algorithm related and applies to both these commands.
I still don't see why this decomposition is correct:
If A is composed of shapes A1, A2 and B is composed of shapes B1, B2
Say every point in B1 is within 1 unit of A1 and every point in B2 is within 1 unit of A2.
But every point in B1 is within 10 units of A2 and every point in B2 is within 10 units of A1.
I am assuming the max distance between A and B is 1.
But if we calculate every pair-wise distance and take the max we will get 10.
Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)
pkg/geo/geodist/geodist.go, line 164 at r3 (raw file):
Previously, sumeerbhola wrote…
I don't see why this is correct -- I thought we wanted the minimum distance even in this case, and later do the max of the min distance between shapes. If this is being used where the point is not a shape in the original slices then it may be a different calculation.
Ok, I think that comment is wrong / the understanding of max distance is different between us.
We don't actually want the minimum distance - not sure how to describe it with eloquent math, but basically you want to find the distance between every point of everything in A and everything in B, and take the maximum of that.
Let's take for example max distance of the same polygon in postgis:
otan=# select ST_MaxDistance('POLYGON((0.0 0.0, 1.0 0.0, 1.0 1.0, 0.0 1.0, 0.0 0.0))', 'POLYGON((0.0 0.0, 1.0 0.0, 1.0 1.0, 0.0 1.0, 0.0 0.0))');
st_maxdistance
--------------------
1.4142135623730951
(1 row)
the maximum distance is from (0.0 0.0) to (1.0 1.0), which is sqrt(2). even if things are the same polygon, to be "fully within" they must be within sqrt(2) with each other.
even in the case of multiple polygons, this calculation is still correct. note that it is not 0, that's the minimum distance.
perhaps the test cases will help give an idea of the calculations -- note that i generated those cases by plugging the output of PostGIS into the "expected".
pkg/geo/geodist/geodist.go, line 192 at r3 (raw file):
Previously, sumeerbhola wrote…
I don't understand this. If the point is contained in the polygon, isn't it incorrect to measure its distance from the exterior. The max distance is 0, yes?
Also, it may not be contained in the polygon, but contained in a hole, in which case the max distance will be the distance to the hole.
In short, the distance calculation where one shape is a point should be the same for max and min
No, max distance is never 0 unless its just two points.
This is correct because the exterior rings are always guaranteed to be the things furthest away from each other.
pkg/geo/geodist/geodist.go, line 226 at r3 (raw file):
Previously, sumeerbhola wrote…
I don't see why this decomposition is correct when being called from
ShapeDistance()
.
I think we want
max over all i (min over all j (distance(vertex a[i], edge b[j]))
See first answer.
pkg/geo/geodist/geodist.go, line 265 at r3 (raw file):
Previously, sumeerbhola wrote…
same question as with the point and polygon comparison.
See answer above.
pkg/geo/geomfn/distance.go, line 55 at r2 (raw file):
Previously, sumeerbhola wrote…
I still don't see why this decomposition is correct:
If A is composed of shapes A1, A2 and B is composed of shapes B1, B2
Say every point in B1 is within 1 unit of A1 and every point in B2 is within 1 unit of A2.
But every point in B1 is within 10 units of A2 and every point in B2 is within 10 units of A1.
I am assuming the max distance between A and B is 1.
But if we calculate every pair-wise distance and take the max we will get 10.
Am I missing something?
See above comment.
c41b28a
to
0833277
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 16 files at r1, 1 of 7 files at r2, 1 of 8 files at r3, 4 of 4 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @otan and @sumeerbhola)
docs/generated/sql/functions.md, line 753 at r4 (raw file):
<p>This function will automatically use any available index.</p> </span></td></tr> <tr><td><a name="st_dfullywithin"></a><code>st_dfullywithin(geometry_a: geometry, geometry_b: geometry, distance: <a href="float.html">float</a>) → <a href="bool.html">bool</a></code></td><td><span class="funcdesc"><p>Returns true if all of geometry_a is in distance units of geometry_b. In other words, the max distance between geometry_a and geometry_b is less than distance units.</p>
This is open to interpretation. How about:
Returns true if every pair of points comprising geometry_a and geometry_b respectively, are within distance units.
docs/generated/sql/functions.md, line 870 at r4 (raw file):
<tr><td><a name="st_makepoint"></a><code>st_makepoint(x: <a href="float.html">float</a>, y: <a href="float.html">float</a>) → geometry</code></td><td><span class="funcdesc"><p>Returns a new Point with the given X and Y coordinates.</p> </span></td></tr> <tr><td><a name="st_maxdistance"></a><code>st_maxdistance(geometry_a: geometry, geometry_b: geometry) → <a href="float.html">float</a></code></td><td><span class="funcdesc"><p>Returns the maximum distance between the given geometries. Note if the geometries are the same, it will return the maximum distance between the geometry’s vertexes.</p>
Can we make this more specific. Something like:
Returns the maximum distance across every pair of points comprising the given geometries. Note if the ...
pkg/geo/geodist/geodist.go, line 192 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
No, max distance is never 0 unless its just two points.
This is correct because the exterior rings are always guaranteed to be the things furthest away from each other.
This is somewhat nuanced and deserves a comment. I managed to convince myself based on breaking down the cases as follows, but if you can find a reference that would be better.
// When computing the maximum distance, the cases are:
// - The point P is not contained in the exterior of the polygon G. Say vertex V is the vertex of the exterior of the polygon that is furthest away from point P (among all the exterior vertices). One can prove that any vertex of the holes will be closer to point P than vertex V. Similarly we can prove that any point in the interior of the polygin is closer to P than vertex V. Therefore we only need to compare with the exterior.
// - The point P is contained in the exterior and inside a hole of polygon G. One can again prove that the furthest point in the polygon from P is one of the vertices of the exterior.
// - The point P is contained in the polygon. One can again prove the same property.
// So we only need to compare with the exterior.
pkg/geo/geodist/geodist.go, line 156 at r4 (raw file):
// onPointToEdgesExceptFirstEdgeStart updates the distance using the edges between a point and a shape. // It will only check the ends of the edges, and assumes the check against .Edge(0).V0 is not required.
The "It will only check the ends of the edges" doesn't reflect the changed function name from the previous PR.
pkg/geo/geodist/geodist.go, line 164 at r4 (raw file):
return true } if !c.DistanceUpdater().IsMaxDistance() {
Can you add a comment before the if-block
// Max distance between a point and the set of points representing an edge is the maximum distance from the point and the pair of end-points of the edge, so we don't need update the distance using the projected point.
pkg/geo/geodist/geodist.go, line 217 at r4 (raw file):
crosser := c.NewEdgeCrosser(aEdge, b.Edge(0).V0) for bEdgeIdx := 0; bEdgeIdx < b.NumEdges(); bEdgeIdx++ { bEdge := b.Edge(bEdgeIdx)
// Max distance between 2 edges is the maximum of the distance across pairs of vertices chosen from each edge.
// It does not matter whether the edges cross, so we skip that check.
pkg/geo/geodist/geodist.go, line 264 at r4 (raw file):
// and the exterior ring. // We use the first point of the linestring for this check. // If we are looking for max distance, we only need to check against the outer ring anyway.
Let's update this comment as
// Min Distance:
// ....
// Max Distance:
// The max distance is the distance between the line string and the exterior ring of the polygon (using the same reasoning as described in onPointToPolygon()
pkg/geo/geodist/geodist.go, line 318 at r4 (raw file):
// As such, we only need to compare the exterior rings if we detect this. // // If we are only looking at the max distance, we only want to compare exteriors.
Like my comment on onPointToPolygon
this is non-obvious. Do you happen to have a reference?
pkg/geo/geomfn/distance.go, line 55 at r2 (raw file):
Previously, otan (Oliver Tan) wrote…
See above comment.
How about:
// DFullyWithin determines whether the maximum distance across every pair of points comprising geometries A and B is within D units.
pkg/geo/geomfn/distance.go, line 31 at r4 (raw file):
} // MaxDistance returns the maximum distance between geometries A and B.
How about:
// MaxDistance returns the maximum distance across every pair of points comprising geometries A and B.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)
docs/generated/sql/functions.md, line 753 at r4 (raw file):
Previously, sumeerbhola wrote…
This is open to interpretation. How about:
Returns true if every pair of points comprising geometry_a and geometry_b respectively, are within distance units.
Done.
docs/generated/sql/functions.md, line 870 at r4 (raw file):
Previously, sumeerbhola wrote…
Can we make this more specific. Something like:
Returns the maximum distance across every pair of points comprising the given geometries. Note if the ...
Done.
pkg/geo/geodist/geodist.go, line 192 at r3 (raw file):
Previously, sumeerbhola wrote…
This is somewhat nuanced and deserves a comment. I managed to convince myself based on breaking down the cases as follows, but if you can find a reference that would be better.
// When computing the maximum distance, the cases are:
// - The point P is not contained in the exterior of the polygon G. Say vertex V is the vertex of the exterior of the polygon that is furthest away from point P (among all the exterior vertices). One can prove that any vertex of the holes will be closer to point P than vertex V. Similarly we can prove that any point in the interior of the polygin is closer to P than vertex V. Therefore we only need to compare with the exterior.
// - The point P is contained in the exterior and inside a hole of polygon G. One can again prove that the furthest point in the polygon from P is one of the vertices of the exterior.
// - The point P is contained in the polygon. One can again prove the same property.
// So we only need to compare with the exterior.
I don't think it's a reference and it seems somewhat intuitive to me (I've been starting at it for a long time HAH!) -- I'll add this comment.
pkg/geo/geodist/geodist.go, line 226 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
See first answer.
Done.
pkg/geo/geodist/geodist.go, line 265 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
See answer above.
Done.
pkg/geo/geodist/geodist.go, line 156 at r4 (raw file):
Previously, sumeerbhola wrote…
The "It will only check the ends of the edges" doesn't reflect the changed function name from the previous PR.
How so? I've updated it slightly.
pkg/geo/geodist/geodist.go, line 164 at r4 (raw file):
Previously, sumeerbhola wrote…
Can you add a comment before the if-block
// Max distance between a point and the set of points representing an edge is the maximum distance from the point and the pair of end-points of the edge, so we don't need update the distance using the projected point.
Done.
pkg/geo/geodist/geodist.go, line 217 at r4 (raw file):
Previously, sumeerbhola wrote…
// Max distance between 2 edges is the maximum of the distance across pairs of vertices chosen from each edge.
// It does not matter whether the edges cross, so we skip that check.
Done.
pkg/geo/geodist/geodist.go, line 264 at r4 (raw file):
Previously, sumeerbhola wrote…
Let's update this comment as
// Min Distance:
// ....
// Max Distance:
// The max distance is the distance between the line string and the exterior ring of the polygon (using the same reasoning as described in onPointToPolygon()
Done.
pkg/geo/geodist/geodist.go, line 318 at r4 (raw file):
Previously, sumeerbhola wrote…
Like my comment on
onPointToPolygon
this is non-obvious. Do you happen to have a reference?
Nope. But your analysis is correct - I've updated the comment
pkg/geo/geomfn/distance.go, line 55 at r2 (raw file):
Previously, sumeerbhola wrote…
How about:
// DFullyWithin determines whether the maximum distance across every pair of points comprising geometries A and B is within D units.
Done.
pkg/geo/geomfn/distance.go, line 31 at r4 (raw file):
Previously, sumeerbhola wrote…
How about:
// MaxDistance returns the maximum distance across every pair of points comprising geometries A and B.
Done.
1e74871
to
7eb6238
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @otan and @sumeerbhola)
pkg/geo/geodist/geodist.go, line 192 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
I don't think it's a reference and it seems somewhat intuitive to me (I've been starting at it for a long time HAH!) -- I'll add this comment.
I've found intuition (both mine and others) to be incorrect enough times. Given the complexity of holes and concave polygons, intuition can be tripped up here. But this is fine for now.
pkg/geo/geodist/geodist.go, line 167 at r5 (raw file):
// The max distance between a point and the set of points representing an edge is the // maximum distance from the point and the pair of end-points of the edge, so we don't // need update the distance using the projected point.
... need to update ...
MaxDistance/DFullyWithin is a specialisation of distance, where it takes the maximum of distances found instead and disregards intersections and closest points. The `stopAfterLE` condition becomes `stopAfterGT`. Release note (sql change): Implemented the ST_MaxDistance and ST_DFullyWithin function for geometries.
7eb6238
to
3cf8be3
Compare
tftr! bors r=sumeerbhola |
Build succeeded |
Based on top of #49085.
MaxDistance/DFullyWithin is a specialisation of distance, where it takes
the maximum of distances found instead and disregards intersections and
closest points. The
stopAfterLE
condition becomesstopAfterGT
.Resolves #48983, resolves #48916
Release note (sql change): Implemented the ST_MaxDistance and
ST_DFullyWithin function for geometries.