Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

geo/geomfn: implement DWithin for geometry types #49085

Merged
merged 1 commit into from
May 18, 2020

Conversation

otan
Copy link
Contributor

@otan otan commented May 14, 2020

Moved the core of the logic for distance calculation operations to the
geodist package, with required methods interface-d into the package
for geometry/geography calculations. The code is mostly the same as it
was in geogfn/distance.go, just the spheroid/sphere calculations mvoed
out.

The geometry based operations are contained in geomfn/distance.go. All
math is annotated with links as well.

Changed MinDistance to use the self-made implementation. The tests still
call GEOS to make sure the return values are similar.

Needed to bring some vendored packages in from go-geom for this to work.

Resolves #48923.

Release note (sql change): Implemented DWithin for Geometry types.

@otan otan requested review from sumeerbhola and a team May 14, 2020 21:21
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@otan otan force-pushed the dwithin_geom branch 4 times, most recently from f820d5c to 02c168c Compare May 15, 2020 00:23
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 12 of 16 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan and @sumeerbhola)


docs/generated/sql/functions.md, line 748 at r1 (raw file):

</span></td></tr>
<tr><td><a name="st_distance"></a><code>st_distance(geometry_a: geometry, geometry_b: geometry) &rarr; <a href="float.html">float</a></code></td><td><span class="funcdesc"><p>Returns the distance between the given geometries.</p>
<p>This function utilizes the GEOS module.</p>

not using GEOS for distance?
I was expecting we would use geos.MinDistance to implement DWithin too -- why the change?


pkg/geo/geodist/geodist.go, line 75 at r1 (raw file):

type DistanceUpdater interface {
	// Update updates the distance based on two provided points,
	// returning if the function should return early.

I understand that hooks are needed for distance calculation, but why for keeping track of the current distance?
And Update() looks like a distance calculation hook in an interface that is keeping track of the current distance.
Are we tracking what was done for the Geography code too closely?


pkg/geo/geodist/geodist.go, line 83 at r1 (raw file):

}

// Crosser is a provided hook that has a series of functions that can help

there aren't a "series of functions". And maybe this can be renamed EdgeCrosser.


pkg/geo/geodist/geodist.go, line 87 at r1 (raw file):

type Crosser interface {
	// ChainCrosser compares the previously evaluated point and the current point
	// and checks whether it crosses another given line.

This comment is not very clear.
Is the EdgeCrosser handling crossing for an edge it is initialized with?
Is it crossing between that edge and the edge defined by the (p1, p2) where the calls were ChainCrossing(p1), ChainCrossing(p2)?
Does the first call to ChainCrossing() always return false?


pkg/geo/geodist/geodist.go, line 94 at r1 (raw file):

// the distance between two shapes.
type DistanceCalculator interface {
	// DistanceUpdater returns the DistanceUpdater for the given point.

why does the calculator have an updater? Is is possible to keep the geo math separate?
What "given point"?


pkg/geo/geodist/geodist.go, line 96 at r1 (raw file):

	// DistanceUpdater returns the DistanceUpdater for the given point.
	DistanceUpdater() DistanceUpdater
	// NewCrosser returns a new Crosser.

this needs a longer comment.
If we call NewCrosser(e, p1) and then ChainCrossing(p2), is it comparing e with (p1, p2)?


pkg/geo/geodist/geodist.go, line 104 at r1 (raw file):

	ClosestPointToEdge(edge Edge, point Point) (Point, bool)
}

I haven't looked at the rest of this file too closely, since it seems to follow the pattern for geography, before this refactoring.


pkg/geo/geogfn/distance.go, line 259 at r1 (raw file):

	b := bInterface.(*s2GeodistPoint).Point

	// Calculate the sphere distance first, as it is much cheaper.

This "Calculate the sphere distance first" seems stale -- it looks like this is meant to only be the sphere distance and that it is Distance() function that does the spheroid calculation.

As mentioned in an earlier comment, calculating the distance between points seems out of place in an updater interface. Also, s1.ChordAngle is a float64, so it does seem that we can have a general updater that keeps the min distance as a float64 and the final call to speroidDistance() can happen at the caller? Would that be too awkward?


pkg/geo/geomfn/distance.go, line 47 at r1 (raw file):

// minDistanceInternal finds the minimum distance between two geometries.
// This implementation is done in-house, as compared to using GEOS.

is this a speed issue? Do we have a benchmark?
Aren't we opening ourselves to more compatibility issues with PostGIS by not using GEOS.


pkg/geo/geomfn/distance.go, line 303 at r1 (raw file):

// This includes the boundary.
// It only checks whether X _or_ Y is within the range of values covered by the edge.
func (c *geomDistanceCalculator) inBoundingCross(

this is a semantically confusing predicate. What does it really mean in geometric terms?


pkg/geo/geomfn/distance.go, line 306 at r1 (raw file):

	p geom.Coord, eV0 geom.Coord, eV1 geom.Coord,
) bool {
	return (eV0.X() <= p.X() && p.X() < eV1.X()) || (eV1.X() <= p.X() && p.X() < eV0.X()) ||

The X coordinates are considered [start, end), and same for Y?

These 4 functions in this file seem to be the new computations introduced in this PR -- I made a first pass but will take another close look.

Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


docs/generated/sql/functions.md, line 748 at r1 (raw file):

Previously, sumeerbhola wrote…

not using GEOS for distance?
I was expecting we would use geos.MinDistance to implement DWithin too -- why the change?

a) PostGIS doesn't use GEOS
b) I wanted the distance / dwithin functions to match each other
c) avoid CGO
d) GEOS gives answers that are a little off due to precision compared to doing it ourselves (and doing it ourselves matches PostGIS results more closely. one result was 0.9999996 in GEOS but 0.9999998 ourselves/in PostGIS, for example).


pkg/geo/geodist/geodist.go, line 75 at r1 (raw file):

Previously, sumeerbhola wrote…

I understand that hooks are needed for distance calculation, but why for keeping track of the current distance?
And Update() looks like a distance calculation hook in an interface that is keeping track of the current distance.
Are we tracking what was done for the Geography code too closely?

Well, this is really useful for doing MaxDistance as well. I'd rather keep this in, unless you had another suggestion.


pkg/geo/geodist/geodist.go, line 83 at r1 (raw file):

Previously, sumeerbhola wrote…

there aren't a "series of functions". And maybe this can be renamed EdgeCrosser.

Done.


pkg/geo/geodist/geodist.go, line 87 at r1 (raw file):

Previously, sumeerbhola wrote…

This comment is not very clear.
Is the EdgeCrosser handling crossing for an edge it is initialized with?
Is it crossing between that edge and the edge defined by the (p1, p2) where the calls were ChainCrossing(p1), ChainCrossing(p2)?
Does the first call to ChainCrossing() always return false?

Done. Clarified the comments are bit.


pkg/geo/geodist/geodist.go, line 94 at r1 (raw file):

why does the calculator have an updater?
less interfaces to pass around...

Is is possible to keep the geo math separate?
How do you mean?

What "given point"?
Clarified?


pkg/geo/geodist/geodist.go, line 96 at r1 (raw file):

Previously, sumeerbhola wrote…

this needs a longer comment.
If we call NewCrosser(e, p1) and then ChainCrossing(p2), is it comparing e with (p1, p2)?

Clarified.


pkg/geo/geodist/geodist.go, line 104 at r1 (raw file):

Previously, sumeerbhola wrote…

I haven't looked at the rest of this file too closely, since it seems to follow the pattern for geography, before this refactoring.

Yep. Crosser is the same as s2.NewChainEdgeCrosser, if that helps


pkg/geo/geogfn/distance.go, line 259 at r1 (raw file):

This "Calculate the sphere distance first" seems stale -- it looks like this is meant to only be the sphere distance and that it is Distance() function that does the spheroid calculation.

yep, relic statement.

As mentioned in an earlier comment, calculating the distance between points seems out of place in an updater interface. Also, s1.ChordAngle is a float64, so it does seem that we can have a general updater that keeps the min distance as a float64 and the final call to speroidDistance() can happen at the caller? Would that be too awkward?

can you give an example of what you want it to look like? i wanted to avoid having to pass two interfaces everywhere, and update technically does update the value by computing the distance between the two points. i can merge the updater and calculator interfaces if that makes it clearer.


pkg/geo/geomfn/distance.go, line 47 at r1 (raw file):

is this a speed issue? Do we have a benchmark?
we don't know (i'd rather get working code as a base), and no.

Aren't we opening ourselves to more compatibility issues with PostGIS by not using GEOS.
see comment above.


pkg/geo/geomfn/distance.go, line 303 at r1 (raw file):

Previously, sumeerbhola wrote…

this is a semantically confusing predicate. What does it really mean in geometric terms?

i've merged it with the place it is used to be clearer.


pkg/geo/geomfn/distance.go, line 306 at r1 (raw file):

Previously, sumeerbhola wrote…

The X coordinates are considered [start, end), and same for Y?

These 4 functions in this file seem to be the new computations introduced in this PR -- I made a first pass but will take another close look.

Clarified why we do it this way.

Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/geo/geogfn/distance.go, line 259 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

This "Calculate the sphere distance first" seems stale -- it looks like this is meant to only be the sphere distance and that it is Distance() function that does the spheroid calculation.

yep, relic statement.

As mentioned in an earlier comment, calculating the distance between points seems out of place in an updater interface. Also, s1.ChordAngle is a float64, so it does seem that we can have a general updater that keeps the min distance as a float64 and the final call to speroidDistance() can happen at the caller? Would that be too awkward?

can you give an example of what you want it to look like? i wanted to avoid having to pass two interfaces everywhere, and update technically does update the value by computing the distance between the two points. i can merge the updater and calculator interfaces if that makes it clearer.

(tried merging the updater and calculator interfaces and it's a bit messier than desired. the spheroid calculator stuff and the max distance makes it hard to make it separate as you may want it)

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 16 files at r1, 3 of 3 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan and @sumeerbhola)


pkg/geo/geodist/geodist.go, line 75 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

Well, this is really useful for doing MaxDistance as well. I'd rather keep this in, unless you had another suggestion.

Ack -- I noticed the MaxDistance in the other PR after writing the comment.


pkg/geo/geodist/geodist.go, line 94 at r1 (raw file):

why does the calculator have an updater?

less interfaces to pass around...

That is reasonable


pkg/geo/geodist/geodist.go, line 83 at r2 (raw file):

}

// EdgeCrosser is a provided hook that calculates whether edges inters.

intersect.


pkg/geo/geodist/geodist.go, line 154 at r2 (raw file):

// onPointToEdgeEnds updates the distance using the edges between a point and a shape.
// It will only check the ends of the edges, and assumes the check against .Edge(0).V0 is not required.

This is more like onPointToEdgesExceptFirstEdgeStart(...)

Since it doesn't only check against the ends -- it checks against the interior of the edge.


pkg/geo/geodist/geodist.go, line 187 at r2 (raw file):

func onPointToPolygon(c DistanceCalculator, a Point, b Polygon) bool {
	// If the exterior ring does not cover the outer ring, we just need to calculate the distance
	// to the outer ring.

did you mean
If the exterior ring does not contain the point, we just need to calculate the distance to the exterior ring.


pkg/geo/geodist/geodist.go, line 269 at r2 (raw file):

	// * If polyline A is within the given distance, we can immediately return.
	// * If polyline A does not intersect the hole but there is at least one point inside
	//   the hole, we can immediately return since all points must be inside that hole

... must be inside that hole and so the distance of this polyline to this hole is the minimum distance of this polyline to this polygon (since ...


pkg/geo/geogfn/distance.go, line 259 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

(tried merging the updater and calculator interfaces and it's a bit messier than desired. the spheroid calculator stuff and the max distance makes it hard to make it separate as you may want it)

I reread and this looks fine now.


pkg/geo/geomfn/distance.go, line 339 at r2 (raw file):

	// See also: https://en.wikipedia.org/wiki/Winding_number
	// See also: https://en.wikipedia.org/wiki/Nonzero-rule
	windingNumber := 0

The steps to go from the description in those links to the following algorithm are non-obvious.
I found http://geomalgorithms.com/a03-_inclusion.html to be useful
Any thoughts on how the two relate?

Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


docs/generated/sql/functions.md, line 748 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

a) PostGIS doesn't use GEOS
b) I wanted the distance / dwithin functions to match each other
c) avoid CGO
d) GEOS gives answers that are a little off due to precision compared to doing it ourselves (and doing it ourselves matches PostGIS results more closely. one result was 0.9999996 in GEOS but 0.9999998 ourselves/in PostGIS, for example).

Done.


pkg/geo/geodist/geodist.go, line 75 at r1 (raw file):

Previously, sumeerbhola wrote…

Ack -- I noticed the MaxDistance in the other PR after writing the comment.

Done.


pkg/geo/geodist/geodist.go, line 94 at r1 (raw file):

Previously, sumeerbhola wrote…

why does the calculator have an updater?

less interfaces to pass around...

That is reasonable

Done.


pkg/geo/geodist/geodist.go, line 96 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

Clarified.

Done.


pkg/geo/geodist/geodist.go, line 104 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

Yep. Crosser is the same as s2.NewChainEdgeCrosser, if that helps

Done.


pkg/geo/geodist/geodist.go, line 83 at r2 (raw file):

Previously, sumeerbhola wrote…

intersect.

Done.


pkg/geo/geodist/geodist.go, line 154 at r2 (raw file):

Previously, sumeerbhola wrote…

This is more like onPointToEdgesExceptFirstEdgeStart(...)

Since it doesn't only check against the ends -- it checks against the interior of the edge.

Done.


pkg/geo/geodist/geodist.go, line 187 at r2 (raw file):

Previously, sumeerbhola wrote…

did you mean
If the exterior ring does not contain the point, we just need to calculate the distance to the exterior ring.

Yep. fixed.


pkg/geo/geodist/geodist.go, line 269 at r2 (raw file):

Previously, sumeerbhola wrote…

... must be inside that hole and so the distance of this polyline to this hole is the minimum distance of this polyline to this polygon (since ...

Done.


pkg/geo/geogfn/distance.go, line 259 at r1 (raw file):

Previously, sumeerbhola wrote…

I reread and this looks fine now.

Done.


pkg/geo/geomfn/distance.go, line 339 at r2 (raw file):

Previously, sumeerbhola wrote…

The steps to go from the description in those links to the following algorithm are non-obvious.
I found http://geomalgorithms.com/a03-_inclusion.html to be useful
Any thoughts on how the two relate?

Added!

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)

Moved the core of the logic for distance calculation operations to the
`geodist` package, with required methods interface-d into the package
for geometry/geography calculations. The code is mostly the same as it
was in geogfn/distance.go, just the spheroid/sphere calculations mvoed
out.

The geometry based operations are contained in geomfn/distance.go. All
math is annotated with links as well.

Changed MinDistance to use the self-made implementation. The tests still
call GEOS to make sure the return values are similar.

Needed to bring some vendored packages in from go-geom for this to work.

Release note (sql change): Implemented DWithin for Geometry types.
@otan
Copy link
Contributor Author

otan commented May 18, 2020

bors r=sumeerbhola

@craig
Copy link
Contributor

craig bot commented May 18, 2020

Build succeeded

@craig craig bot merged commit 17fffd4 into cockroachdb:master May 18, 2020
craig bot pushed a commit that referenced this pull request May 21, 2020
49094:  geo/geomfn: implement ST_MaxDistance / ST_DFullyWithin r=sumeerbhola a=otan

Based on top of #49085.

MaxDistance/DFullyWithin is a specialisation of distance, where it takes
the maximum of distances found instead and disregards intersections and
closest points. The `stopAfterLE` condition becomes `stopAfterGT`.

Resolves #48983, resolves #48916

Release note (sql change): Implemented the ST_MaxDistance and
ST_DFullyWithin function for geometries.



Co-authored-by: Oliver Tan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

geo/geomfn: implement ST_DWithin({geometry,geometry,float8})
3 participants