-
Notifications
You must be signed in to change notification settings - Fork 448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TWKB read and write implementation #452
Conversation
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Jim Hughes <[email protected]>
Signed-off-by: Gabriel Roldan <[email protected]>
Signed-off-by: Gabriel Roldan <[email protected]>
This class, coming from [1], reads and writes variable length, zig-zag encoded integers, has already been accepted by LocationTech's IP review on the Geogig project, and lets us get rid of the protocol buffers dependency which is only being used to read and write varints. [1] <https://svn.apache.org/repos/asf/mahout/branches/mahout-0.8/core/src/main/java/org/apache/mahout/math/Varint.java> Signed-off-by: Gabriel Roldan <[email protected]>
Fixes #246 BTW |
Pre-computed data from PostGIS. Check the generate_test_data.sql file, which generates .csv files with expected inputs and output for all combinations of geometry type, xy/z/m precision, and other encoding parameters. Signed-off-by: Gabriel Roldan <[email protected]>
Remove the dependency on Google protocol buffers and finish the implementation of TWKB reader and writer. Signed-off-by: Gabriel Roldan <[email protected]>
@aaime I just found there's a TWKB reader in gt-postgis, didn't know that. Made some tests, seems it has some issues though. You might be interested in this implementation that seems more robust. The test case bellow parses a geometry with the JTS and the GeoTools TWKB readers, with this output:
public @Test void checkJtsVsGtReader() throws Exception {
long largeOrdinate = Integer.MAX_VALUE;
String wkt = String.format("POINT ZM(%s.12345678 %s.12345678 1 2)", largeOrdinate,
largeOrdinate + 1);
WKTReader wktReader = new WKTReader();
wktReader.setIsOldJtsCoordinateSyntaxAllowed(false);
Geometry geom = wktReader.read(wkt);
System.err.println("input: " + new WKTWriter(4).write(geom));
twkbWriter.setXYPrecision(7).setZPrecision(7).setMPrecision(7);
byte[] twkb = twkbWriter.write(geom);
Geometry twkbRead = twkbReader.read(twkb);
System.err.println("JTS TWKB reader: " + new WKTWriter(4).write(twkbRead));
Geometry gtTwkbRead = geoToolsTWKBReader.read(twkb);
System.err.println("GT TWKB reader: " + new WKTWriter(4).write(gtTwkbRead));
assertEquals(geom, twkbRead);
assertEquals(geom, gtTwkbRead);
} The GT parser will also choke on TWKB byte strings that contain an "idlist", like one generated with the following SQL:
Ok, it won't com like that from the queries made by gt-postgis, but still. The following test case: public @Test void testParseWithIdList() throws Exception {
byte[] twkb = WKBReader.hexToBytes("040402020400000202");
Geometry expected = new WKTReader().read("MULTIPOINT(0 0, 1 1)");
Geometry parsedJTS = twkbReader.read(twkb);
assertEquals(expected, parsedJTS);
Geometry parsedGT = geoToolsTWKBReader.read(twkb);
assertEquals(expected, parsedGT);
} Fails at the last line with |
It may be that the CoordinateSequenceFactory creates a CoordinateSequence with no measures set even if it was created for 4 dimensions. Explicitly setting the measure fixes it. Test the reader respects the provided GeometryFactory, if any. Signed-off-by: Gabriel Roldan <[email protected]>
Indeed not interested in either cases for the moment. We could switch regardless to the JTS version if it has the same or better performance characteristics (calculations, allocations), one less bit to maintain! :-D |
By the way, both @jodygarnett and @jnh5y were made aware of the GeoTools implementation, you did not talk to the JTS folks before starting this work? |
Oh, the first one is a case of correctness, though it should be an easy fix. Basically where the spec says "varint" it means "varlong", and the gt reader is discarding the longer values, chopping ordinates at 32-bit ints. See how it parsed to
Being the first iteration, been more focused on correctness, so we'll see. Didn't have time to compare parsing performance nor optimize yet. |
I did not! started as a weekend/bored time playground and as I've seen the branch was incomplete I just kept going. In any case the GT one is only a parser, no encoder, and it'd be good to have both in JTS just like there exist for WKT and WKB. |
Nice to see a PR from you @groldan ! Will be a while until I can give this a thorough review. But one thing I think is that this should be moved down into a |
Signed-off-by: Gabriel Roldan <[email protected]>
Signed-off-by: Gabriel Roldan <[email protected]>
Hi @dr-jts, thanks, long time no see.
Agreed and done. |
Hi! Are there chance this PR being reviewed and merged? Having this feature in JTS would be nice 😃 |
I think this can be closed? considering #854 |
I have picked up where James and Jody left for
TWKBReader
andTWKBWriter
and finished it with full compatibility with the implementation available in PostGIS.I've also added a couple optimizations that are present in the spec but that the PostGIS version seems not to follow, for which I would still like to write some more unit tests, and hence I'm submitting this PR as a draft for the time being, for consideration or any discussion needed.
Most test cases use pre-computed data from PostGIS. Check the generate_test_data.sql file, which generates .csv files with expected inputs and output for all combinations of geometry type, xy/z/m precision, and other encoding parameters.
I'll be working on adding more documentation and test cases for the optimizations and then move the PR out of the draft state.
Being my first contribution to JTS I'm looking forward to comments on any aspect of it.
Cheers,
Gabriel.