Speed improvements in WKB geometry import, and ogr2ogr #8560

Merged
merged 4 commits into from
Oct 19, 2023

Conversation

@rouault rouault commented Oct 13, 2023

  • OGRLineString/Polygon/MultiPolygon/MultiLineString: make it possible to run importFromWkb() repeatedly on the same object while limiting the number of dynamic memory (re)allocations

    • For lines, we keep track of the maximum capacity of the point array,
      with a new member variable.
      So doing setNumPoints(10), then setNumPoints(3), then setNumPoints(9)
      just allocates an array of size 10.
    • For polygons, if importing a WKB for a single ring polygon on top of
      an existing single ring polygon, reuse the existing ring (and benefit
      from the previous optimization)
    • For multipolygons, if importing a WKB for a single part multipolygon
      on top of an existing single part multipolygon, reuse the existing
      polygon (and benefit from the previous optimization)
    • Similar for multilinestring
  • WriteArrowBatch() generic implementation: reuse existing geometry object to save memory allocations

  • ogr2ogr: use Arrow interface as soon as the input driver declares OLCFastGetArrowStream, even if the output driver doesn't declare OLCFastWriteArrowBatch

    This helps notably for Parquet -> GPKG or GPKG -> GPKG conversions when
    the creation of the spatial index, which is now the major time consumer,
    is disabled.

    Now (using Arrow API):

    $ time  ogr2ogr  out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress
    0...10...20...30...40...50...60...70...80...90...100 - done.
    
    real    0m12,868s
    user    0m13,338s
    sys     0m1,843s
    

    Without use of Arrow API:

    $ time  ogr2ogr  out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress --config OGR2OGR_USE_ARROW_API NO
    0...10...20...30...40...50...60...70...80...90...100 - done.
    
    real    0m17,625s
    user    0m15,917s
    sys     0m1,704s
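
The allocation-reuse idea from the first item can be sketched outside GDAL. This is a toy model in Python, not the C++ code of this PR: `PointArray`, `Polygon`, `set_num_points()`, `import_rings()` and the `allocations` counter are hypothetical names chosen for illustration, and the real classes certainly differ in detail.

```python
class PointArray:
    """Toy stand-in for a line string's point storage (not GDAL's class)."""

    def __init__(self):
        self._points = []      # backing storage; its length is the capacity
        self.num_points = 0    # logical size seen by callers
        self.allocations = 0   # how many times the backing storage grew

    @property
    def capacity(self):
        return len(self._points)

    def set_num_points(self, n):
        # Grow only when the request exceeds the largest size seen so far;
        # shrinking just lowers the logical size and keeps the storage.
        if n > self.capacity:
            self._points.extend([(0.0, 0.0)] * (n - self.capacity))
            self.allocations += 1
        self.num_points = n

    def load(self, coords):
        self.set_num_points(len(coords))
        self._points[:len(coords)] = coords  # overwrite in place

    def points(self):
        return self._points[:self.num_points]


class Polygon:
    """Toy polygon that reuses its ring for single-ring imports."""

    def __init__(self):
        self.rings = []

    def import_rings(self, rings_coords):
        if len(rings_coords) == 1 and len(self.rings) == 1:
            # Single ring on top of a single ring: reuse the existing object
            # (and therefore its already allocated point storage).
            self.rings[0].load(rings_coords[0])
            return
        # General case: rebuild the ring list from scratch.
        self.rings = []
        for coords in rings_coords:
            ring = PointArray()
            ring.load(coords)
            self.rings.append(ring)


line = PointArray()
for n in (10, 3, 9):
    line.set_num_points(n)
print(line.capacity, line.allocations)  # capacity stays at 10, one growth
```

In this model, importing many single-ring polygons into the same `Polygon` only grows the storage when a ring is larger than any ring seen before, which is the effect the description aims for.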
    

@rouault rouault added this to the 3.8.0 milestone Oct 13, 2023
@tbonfort

nice speedup! could you give the before/after outputs of /usr/bin/time instead of the builtin time in order to also show memory consumption evolution?

@tbonfort

My apologies Even, rewording my comment as it was not my intention to be pushy:
FYI, I believe that using /usr/bin/time instead of the builtin time would also print out memory usage information
regards 😀

@rouault

rouault commented Oct 13, 2023

@tbonfort Very relevant question, which made me realize that the fix in fcbed4e was needed.

Without the optimization, using the GetNextFeature() + CreateFeature() looping strategy:
$ /usr/bin/time -v ogr2ogr out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress --config OGR2OGR_USE_ARROW_API NO
0...10...20...30...40...50...60...70...80...90...100 - done.
Command being timed: "ogr2ogr out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress --config OGR2OGR_USE_ARROW_API NO"
User time (seconds): 18.56
System time (seconds): 1.78
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:20.38
Maximum resident set size (kbytes): 170408

With optimization, using the default 4-thread strategy in the GeoPackage driver to prefetch the input data (i.e. the GeoPackage driver does not just acquire a single batch of features when requested, but also spawns threads to fetch the next ones):
$ /usr/bin/time -v ogr2ogr out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress
0...10...20...30...40...50...60...70...80...90...100 - done.
Command being timed: "ogr2ogr out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress"
User time (seconds): 14.38
System time (seconds): 1.92
Percent of CPU this job got: 117%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.90
Maximum resident set size (kbytes): 656096

With optimization, prefetching limited to 1 thread:
$ /usr/bin/time -v ogr2ogr out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress --config GDAL_NUM_THREADS 1
0...10...20...30...40...50...60...70...80...90...100 - done.
Command being timed: "ogr2ogr out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress --config GDAL_NUM_THREADS 1"
User time (seconds): 14.29
System time (seconds): 1.50
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:15.81
Maximum resident set size (kbytes): 332876

The amount of memory is indeed larger with the use of the Arrow API, which is expected. By default we acquire batches of a number of features up to the value of the -gt parameter, which defaults to 100,000.

When limiting to 10,000:

$ /usr/bin/time -v ogr2ogr out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress --config GDAL_NUM_THREADS 1 -gt 10000
0...10...20...30...40...50...60...70...80...90...100 - done.
Command being timed: "ogr2ogr out.gpkg nz-building-outlines.gpkg -lco spatial_index=no -progress --config GDAL_NUM_THREADS 1 -gt 10000"
User time (seconds): 14.39
System time (seconds): 1.56
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:15.96
Maximum resident set size (kbytes): 497184

I cannot really make sense of that last result. Why would decreasing the size of the batch increase the RAM usage?
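
To put the four /usr/bin/time runs side by side, here is a small Python sketch that only restates the elapsed times and maximum resident set sizes quoted above, relative to the non-Arrow run. The labels and the `compare()` helper are made up for this illustration; nothing here is measured anew.

```python
# (elapsed wall-clock seconds, maximum resident set size in kB),
# copied from the /usr/bin/time -v outputs above.
RUNS = {
    "no Arrow": (20.38, 170408),
    "Arrow, default threads": (13.90, 656096),
    "Arrow, 1 thread": (15.81, 332876),
    "Arrow, 1 thread, -gt 10000": (15.96, 497184),
}
BASELINE = "no Arrow"


def compare(runs, baseline):
    """Return (label, speedup, memory ratio, RSS in MiB) for each run."""
    base_elapsed, base_rss = runs[baseline]
    rows = []
    for label, (elapsed, rss_kb) in runs.items():
        rows.append((
            label,
            base_elapsed / elapsed,   # > 1 means faster than the baseline
            rss_kb / base_rss,        # > 1 means more memory than baseline
            rss_kb / 1024.0,          # kB -> MiB
        ))
    return rows


for label, speedup, mem_ratio, rss_mib in compare(RUNS, BASELINE):
    print(f"{label:28s} speedup x{speedup:.2f}  "
          f"RSS x{mem_ratio:.2f} ({rss_mib:.0f} MiB)")
```

Read this way, the Arrow runs trade roughly two to four times the peak memory for a wall-clock gain, and the -gt 10000 run stands out as using more memory than the 1-thread run with the default batch size, which is the unexplained result mentioned above.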

@rouault rouault force-pushed the arrow_write_perf_improvements branch 2 times, most recently from d76590e to 314feb3 Compare October 13, 2023 18:09
@rouault rouault force-pushed the arrow_write_perf_improvements branch from 314feb3 to bfab3f6 Compare October 13, 2023 22:34
@rouault rouault merged commit e706912 into OSGeo:master Oct 19, 2023