Construction of XYZ points with NaN coordinates #881

caspervdw · 2023-04-28T18:28:59Z

While making a more strict approach to NaN coordinate handling in shapely, I noticed that the NaN handling (and coordseq dimensionality) has improved a great deal in GEOS (which is great!)

However I bumped into an inconsistency in the interpretation of XYZ points with NaN coordinates. When constructing a point from NaN, NaN, you get an empty point. I personally think that is good because it is consistent with the limitations of WKB. I guess the rule is "a point with an all-NaN coordinate is an empty point".

However, the logic in GEOS considers only X and Y at two points yielding surprising behaviour on 3D points:

>>> Point(nan, nan nan)
<POINT Z EMPTY>
>>> Point(nan, 1, nan)
<POINT Z (NaN 1 NaN)
>>> Point(nan, nan, 1)
<POINT Z EMPTY>

This PR addresses that issue by constructing an empty point if its first coordinate is all-NaN.
This allowed me to cleanup some logic in the Point::isEmpty and in the WKTWriter.

Ref. shapely/shapely#1811

pramsey · 2023-04-28T19:09:27Z

Not that I approve of this kind of thing, but doesn't this leave open for me the possibility of hand-constructing a point, setting its X and Y to nan, and then writing it out to WKB, and then reading it in and finding my carefully constructed nan coordinates are gone and I have an empty coordinate list instead?

Also, it seems a little odd that the signal for an empty point is a ... non empty coordinate sequence? This passes all the extant tests, which I think means I'm fine with it. What happens if I construct a point with a zero-length coordinate sequence?

caspervdw · 2023-04-29T14:56:23Z

Not that I approve of this kind of thing, but doesn't this leave open for me the possibility of hand-constructing a point, setting its X and Y to nan, and then writing it out to WKB, and then reading it in and finding my carefully constructed nan coordinates are gone and I have an empty coordinate list instead?

This is currently already the case. This PR does improve this situation slightly because the carefully constructed (invalid) point will be coerced immediately to an empty point. Point(NaN NaN) simply can't exist because it is encoding for POINT EMPTY in WKB.

What happens if I construct a point with a zero-length coordinate sequence?

Precisely the same as constructing a point from a sequence with an all-nan coordinate.

This is as far as I know the only exception for nonfinite values in geometries; for the rest, any geometry could potentially have a NaN/Infinity/-Infinity value and still be represented as WKB/WKT.

Besides, I'd rather have all the nonfinite values in coordinate sequences forbidden right away. I personally think using them is always a bad idea. In shapely we now are making 3 "handle_nan" modes: 1) allow, 2) skip, 3) error. The second and the third will prevent NaNs (and +-Infinity) in all cases. See shapely/shapely#1573.

caspervdw · 2023-04-29T14:58:48Z

Oh @pramsey , did you mean that Point is mutable so it can be changed to NaN NaN? Didn't think of that...

pramsey · 2023-04-29T22:50:27Z

Just wanted to make sure you'd poked this a number of ways. There is a huge combinatorial explosion of ways to be empty or invalid as the nonfinite numbers start being added into the mix.

dbaston · 2023-04-30T00:03:08Z

It's not clear to me why we would want change the internal representation of an empty point more like WKB. Storing a point as an empty coordinate sequence seems simple enough, and if there are errors with WKX input/output, we can handle that with readers/writers rather than using this internal representation?

caspervdw · 2023-04-30T10:47:11Z

@dbaston to be clear, the situation after this PR is that an empty point is stored as an empty coordinate sequence. The thing that I added is an automatic coercion of all-Nan points to an empty coordinate sequence. This allowed me to remove some patches in the Point::isEmpty and WKT writer.

However, I think it isn’t complete yet as @pramsey pointed out that an all-Nan point could still be constructed by adjusting the coord seq after constructing the point? Or is that not the case?

And on a side note, has there been a discussion about blocking nonfinite values altogether?

dr-jts · 2023-04-30T16:46:15Z

And on a side note, has there been a discussion about blocking nonfinite values altogether?

Yes, several times (probably on some issue, but don't have a reference to hand).

The conclusion is always to follow Postel's Law: Be lenient in what you accept, and strict in what you emit.

The rationale is that the library should be able to read everything, so that it can be validated and subsequently fixed as and how determined.

Geometries containing non-finite ordinates are always considered invalid, via this IsvalidOp logic. (Incidentally, it would be better for this logic to be in Coordinate, to centralize it and make it more visible. Also, shouldn't that method take a reference, since the pointer should never be null? -> @dbaston )

As Paul, says, there is a combinatorial explosion of possibilities if the lib tries to do clever things with handling non-finite values. It would be best to take a rigorous and simple approach to determining how to decide if things are EMPTY internally. It might be simplest to specify that any non-finite X or Y ordinate in the first coordinate indicates an EMPTY CoordinateSequence. That way most geometry operations will Just Work, since they generally always check for EMPTY and take some default action.

dr-jts · 2023-04-30T17:20:31Z

src/io/WKTWriter.cpp

@@ -281,7 +281,7 @@ WKTWriter::appendPointTaggedText(const Point& point, OrdinateSet outputOrdinates
    appendOrdinateText(outputOrdinates, writer);

    const CoordinateXY* coord = point.getCoordinate();
-    if (coord == nullptr || (std::isnan(coord->x) && std::isnan(coord->y))) {


Why not leave this logic in place? Belt-and-suspenders...

But it should be changed to use CoordinateXY.isNull, to centralize this behaviour. (And maybe that should be changed to use OR rather than AND, to be more inclusive).

caspervdw added 3 commits April 28, 2023 14:45

Coerce all-NaN points to empty at construction

064d132

use clear()

a3dca5c

Refactor

dd021d1

caspervdw mentioned this pull request Apr 28, 2023

Add handle_nan to point creation + fix tests with GEOS main shapely/shapely#1811

Merged

pramsey merged commit d054731 into libgeos:main Apr 29, 2023

dr-jts reviewed Apr 30, 2023

View reviewed changes

dr-jts added the Enhancement New feature or feature improvement. label May 1, 2023

caspervdw mentioned this pull request May 1, 2023

Handling of nonfinite ordinate values #885

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Construction of XYZ points with NaN coordinates #881

Construction of XYZ points with NaN coordinates #881

caspervdw commented Apr 28, 2023

pramsey commented Apr 28, 2023

caspervdw commented Apr 29, 2023

caspervdw commented Apr 29, 2023

pramsey commented Apr 29, 2023

dbaston commented Apr 30, 2023

caspervdw commented Apr 30, 2023

dr-jts commented Apr 30, 2023

dr-jts Apr 30, 2023

Construction of XYZ points with NaN coordinates #881

Construction of XYZ points with NaN coordinates #881

Conversation

caspervdw commented Apr 28, 2023

pramsey commented Apr 28, 2023

caspervdw commented Apr 29, 2023

caspervdw commented Apr 29, 2023

pramsey commented Apr 29, 2023

dbaston commented Apr 30, 2023

caspervdw commented Apr 30, 2023

dr-jts commented Apr 30, 2023

dr-jts Apr 30, 2023

Choose a reason for hiding this comment