Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Construction of XYZ points with NaN coordinates #881

Merged
merged 3 commits into from
Apr 29, 2023

Conversation

caspervdw
Copy link
Contributor

While making a more strict approach to NaN coordinate handling in shapely, I noticed that the NaN handling (and coordseq dimensionality) has improved a great deal in GEOS (which is great!)

However I bumped into an inconsistency in the interpretation of XYZ points with NaN coordinates. When constructing a point from NaN, NaN, you get an empty point. I personally think that is good because it is consistent with the limitations of WKB. I guess the rule is "a point with an all-NaN coordinate is an empty point".

However, the logic in GEOS considers only X and Y at two points yielding surprising behaviour on 3D points:

>>> Point(nan, nan nan)
<POINT Z EMPTY>
>>> Point(nan, 1, nan)
<POINT Z (NaN 1 NaN)
>>> Point(nan, nan, 1)
<POINT Z EMPTY>

This PR addresses that issue by constructing an empty point if its first coordinate is all-NaN.
This allowed me to cleanup some logic in the Point::isEmpty and in the WKTWriter.

Ref. shapely/shapely#1811

@pramsey
Copy link
Member

pramsey commented Apr 28, 2023

Not that I approve of this kind of thing, but doesn't this leave open for me the possibility of hand-constructing a point, setting its X and Y to nan, and then writing it out to WKB, and then reading it in and finding my carefully constructed nan coordinates are gone and I have an empty coordinate list instead?

Also, it seems a little odd that the signal for an empty point is a ... non empty coordinate sequence? This passes all the extant tests, which I think means I'm fine with it. What happens if I construct a point with a zero-length coordinate sequence?

@caspervdw
Copy link
Contributor Author

Not that I approve of this kind of thing, but doesn't this leave open for me the possibility of hand-constructing a point, setting its X and Y to nan, and then writing it out to WKB, and then reading it in and finding my carefully constructed nan coordinates are gone and I have an empty coordinate list instead?

This is currently already the case. This PR does improve this situation slightly because the carefully constructed (invalid) point will be coerced immediately to an empty point. Point(NaN NaN) simply can't exist because it is encoding for POINT EMPTY in WKB.

What happens if I construct a point with a zero-length coordinate sequence?

Precisely the same as constructing a point from a sequence with an all-nan coordinate.

This is as far as I know the only exception for nonfinite values in geometries; for the rest, any geometry could potentially have a NaN/Infinity/-Infinity value and still be represented as WKB/WKT.

Besides, I'd rather have all the nonfinite values in coordinate sequences forbidden right away. I personally think using them is always a bad idea. In shapely we now are making 3 "handle_nan" modes: 1) allow, 2) skip, 3) error. The second and the third will prevent NaNs (and +-Infinity) in all cases. See shapely/shapely#1573.

@caspervdw
Copy link
Contributor Author

Oh @pramsey , did you mean that Point is mutable so it can be changed to NaN NaN? Didn't think of that...

@pramsey
Copy link
Member

pramsey commented Apr 29, 2023

Just wanted to make sure you'd poked this a number of ways. There is a huge combinatorial explosion of ways to be empty or invalid as the nonfinite numbers start being added into the mix.

@pramsey pramsey merged commit d054731 into libgeos:main Apr 29, 2023
@dbaston
Copy link
Member

dbaston commented Apr 30, 2023

It's not clear to me why we would want change the internal representation of an empty point more like WKB. Storing a point as an empty coordinate sequence seems simple enough, and if there are errors with WKX input/output, we can handle that with readers/writers rather than using this internal representation?

@caspervdw
Copy link
Contributor Author

@dbaston to be clear, the situation after this PR is that an empty point is stored as an empty coordinate sequence. The thing that I added is an automatic coercion of all-Nan points to an empty coordinate sequence. This allowed me to remove some patches in the Point::isEmpty and WKT writer.

However, I think it isn’t complete yet as @pramsey pointed out that an all-Nan point could still be constructed by adjusting the coord seq after constructing the point? Or is that not the case?

And on a side note, has there been a discussion about blocking nonfinite values altogether?

@dr-jts
Copy link
Contributor

dr-jts commented Apr 30, 2023

And on a side note, has there been a discussion about blocking nonfinite values altogether?

Yes, several times (probably on some issue, but don't have a reference to hand).

The conclusion is always to follow Postel's Law: Be lenient in what you accept, and strict in what you emit.

The rationale is that the library should be able to read everything, so that it can be validated and subsequently fixed as and how determined.

Geometries containing non-finite ordinates are always considered invalid, via this IsvalidOp logic. (Incidentally, it would be better for this logic to be in Coordinate, to centralize it and make it more visible. Also, shouldn't that method take a reference, since the pointer should never be null? -> @dbaston )

As Paul, says, there is a combinatorial explosion of possibilities if the lib tries to do clever things with handling non-finite values. It would be best to take a rigorous and simple approach to determining how to decide if things are EMPTY internally. It might be simplest to specify that any non-finite X or Y ordinate in the first coordinate indicates an EMPTY CoordinateSequence. That way most geometry operations will Just Work, since they generally always check for EMPTY and take some default action.

@@ -281,7 +281,7 @@ WKTWriter::appendPointTaggedText(const Point& point, OrdinateSet outputOrdinates
appendOrdinateText(outputOrdinates, writer);

const CoordinateXY* coord = point.getCoordinate();
if (coord == nullptr || (std::isnan(coord->x) && std::isnan(coord->y))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not leave this logic in place? Belt-and-suspenders...

But it should be changed to use CoordinateXY.isNull, to centralize this behaviour. (And maybe that should be changed to use OR rather than AND, to be more inclusive).

@dr-jts dr-jts added the Enhancement New feature or feature improvement. label May 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or feature improvement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants