-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support GeoJSON for geo_point #85120
Support GeoJSON for geo_point #85120
Conversation
b21e5ce
to
43f446a
Compare
43f446a
to
bbc61bd
Compare
subParser.nextToken(); | ||
if (subParser.currentToken() == Token.START_ARRAY) { | ||
coordinates = new ArrayList<>(); | ||
subParser.nextToken(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should stricter here. We expect a two / three dimensional double array. WE should only accept that else throw an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do that on line 527. I noticed that the code here from before does a little validation during the parsing and then more validation on the results further down. I think we could do even more validation, like enforcing that you do not mix and match multiple types of data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a lot more validation now
@@ -492,6 +519,14 @@ public static GeoPoint parseGeoPoint(XContentParser parser, GeoPoint point, fina | |||
} else { | |||
return point.parseGeoHash(geohash, effectivePoint); | |||
} | |||
} else if (coordinates != null) { | |||
if (geojsonType == null || geojsonType.toLowerCase(Locale.ROOT).equals("point") == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check that we have no other elements in GeoJson (lat, lon or geohash)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've generalized the validation now to cover all types.
} | ||
} | ||
|
||
public void testQueryPointFromMultiPoint() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is a duplicate of the test testQueryPointFromMultiPointFormats
but with the number of supported formats reduced to those supported by the GeoJSON parser (so removing point as double[] and lat,lon string). This is because the current implementation added point geojson to the point parser, instead of adding point parsing to the geojson parser. The question is whether we should do it the other way round. The consequence of adding point support to the GeoJSON parser is that geo_shape
mappers will start understanding the alternative versions of point (double[] and lat,lon string). Is that a good or a bad consequence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've generalized the test, keeping only test data separate (so geo_point tests all four formats, while geo_shape tests only WKT and GeoJSON).
NumberFormatException numberFormatException = null; | ||
String geojsonType = null; | ||
ArrayList<Double> coordinates = null; | ||
class NumberFormatExceptionHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class exists only to keep the old behaviour that NumberFormatException is only handled at the end of parsing. However, there is no obvious reason why we would need that. It seems to me perfectly sufficient to throw the exception at the point that NumberFormatException arises, right? We have a number of other exceptions being thrown immediately during parsing, so why should this particular exception be treated differently. If we allow it to be thrown during parsing, this inner class can get replaced by a simple utility method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On theory I have is that it could be that when throwing exceptions during parsing, it is impossible to continue parsing (because Objects are not closed, or fully parsed), so the entire import (or entire query) is failed. This can be fixed by catching exceptions, finishing parsing the inner object and then re-throw the exception (which is what we see happening with NumberFormatException) and then the outer code can decide whether to catch and continue, or re-throw. I remember seeing some comment somewhere about this case. So perhaps NumberFormatException is considered a case we want to just log and continue, while other exceptions are considered serious enough to break the entire import (or query)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pinging @elastic/es-analytics-geo (Team:Analytics) |
Hi @craigtaverner, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just left a small nit, otherwise LGTM
@elasticmachine run elasticsearch-ci/part-2 |
This work was enhanced further in #85442 |
Support GeoJSON for points when the mapper specifies geo_point.
Closes #47815
Closes #84599