Add GDAL Raster Tile Index (GTI) driver, and associated gdaltindex improvements #8983
@rouault this looks super interesting. I wonder if using |
GeoServer folks use the name "granule". For those of us who use MapServer and gdaltindex, "tiles" is not confusing at all. But maybe it is a matter of selecting who to confuse. Items sounds like OGC API and assets sound like STAC, much newer inventions than gdaltindex. But maybe they will last longer. Do you suggest renaming gdaltindex? |
this is awesome ... 🚀 (I've had several questions and ideas and found everything I need until this) could gdalbuildvrt /tmp/tiles.vrt gcore/data/vrtmisc16_tile1.tif gcore/data/vrtmisc16_tile2.tif
gdaltindex /tmp/tiles.vrt.gpkg /tmp/tiles.vrt -recursive I don't think that's possible yet, without fleshing out the file list for gdaltindex or modifying the layer |
That could be surprising to do that by default. We'd likely need a -consider_vrt_as_file_list or something like that. But even with that, one should not forget that you can do very advanced tricks with the sources in a VRT. You can move things around, like inverting hemispheres, incorporate sources without proper geolocation, have different sources for each band, etc. So an arbitrary VRT cannot be converted to a VRTTI tile index. |
ah indeed, thanks - perhaps |
--optfile is evaluated in a generic part of the code (and only for the binary itself, not the utility-as-a-function), and just adds the arguments of the files as if they had been put in the regular command line. I do believe a special flag would be required to mean "use GetFileList() on the VRT instead of the VRT itself", since I can imagine people could create tile indices with VRTs that would have special behaviour (scaling, etc), and you don't want to use just the source of those VRTs, but the VRT with its specific behavior |
the situation I'm thinking of is list of sources where it's not required to open them to get extent (or footprint perhaps), stac and simple vrt mosaics being obvious ones. the opentopo vrts are good examples. I can't see how other mods to sources could be encapsulated, unless this becomes a feature-table way of expressing VRT generally (which I think is interesting). |
Besides @mdsumner, has any observer here given it a try? |
@rouault at the risk of bike shedding... how about not using "VRT" in the name of this driver so that this new one and the old VRT can be isolated in searches? |
Just TI then ? Or maybe TILEIDX ? |
"TI" won't really be google-able. Just throwing out ideas... "VRI" for "Virtual Raster Index" although it's a bit of a pointless name since it's an actual index, not a virtual index. "GRI" for "GDAL Raster Index"? Though there's already a "ORI" for "Open Raster Index" or "OTI" for "Open Tile Index" could work? |
Though I guess it references another datasource, so maybe a virtual index isn't such a terrible description. |
GTI could work for me (although one could argue that it is a MapServer tile index :-)) |
Driver renamed to GTI |
Works for me. Back when I started in open source I used to dream of owning a Golf GTI one day 😆 |
the one thing I still have is there's no way to easily create a very large index without opening every file - which is unusably slow, the way I do it is instantiate the file with gdaltindex from one of the sources, then update that vector layer with the bbox footprint geometry and I do that with R here FWIW (still using the original driver name): https://github.com/mdsumner/cog-example/blob/main/data-raw/opentopo_vrt.R (I could do that with python, or C++, or write more UI to work from a table but it's extra work for anyone wanting to try this out, with a slightly odd mix of vector and raster). Ideally I'd like to create the vector source and then translate that to raster GTI, so the geoms could be trivial bbox polygons, or actual footprints in a source-foreign crs of the vector layer. (I had hoped to contribute to at least the fast derivation from a VRT file by now but it's not been possible). |
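The approach described above, building the index rows from footprints that are already known instead of opening every raster, can be sketched in plain Python. This is only an illustration under stated assumptions: the `bbox_to_wkt` and `index_rows` helpers and the example tile URLs are hypothetical, and `location` is assumed as the name of the field holding each tile path (the gdaltindex default). Writing the rows into a GeoPackage layer would still be done with OGR or another vector tool.

```python
def bbox_to_wkt(xmin, ymin, xmax, ymax):
    """Return a closed rectangular POLYGON in WKT for the given bounds."""
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)]
    ring = ", ".join(f"{x:g} {y:g}" for x, y in corners)
    return f"POLYGON (({ring}))"


def index_rows(tiles):
    """Build index rows from a {path: (xmin, ymin, xmax, ymax)} mapping."""
    return [
        {"location": path, "wkt": bbox_to_wkt(*bounds)}
        for path, bounds in tiles.items()
    ]


# Hypothetical one-degree DEM tiles whose extents are known without opening them.
tiles = {
    "/vsicurl/https://example.com/dem/N00E010.tif": (10, 0, 11, 1),
    "/vsicurl/https://example.com/dem/N01E010.tif": (10, 1, 11, 2),
}
rows = index_rows(tiles)
```

Each row pairs a tile path with a trivial bbox polygon, which is the minimum a vector layer would need before being used as a tile index.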
Amazing work!
I have just left some minor comments.
There seems to be an issue with how the resolution is handled. My test case is to mosaic a dem, each tile being a one-degree-square tif file. The tif files have varying resolutions depending on the latitude, e.g. at the equator When I run: Extracting a square from the resulting GTI results in the incorrect resolution to be used:
you should use a positive value for -tr values |
gdaltindex modified to take the absolute value. |
FlatGeoBuf format with a ``.gti.fgb`` extension, meeting the minimum requirements
for a GTI compatible tile index, detailed later.

For example: ``tileindex.gti.gpkg``
@rouault I'm 👎 on this. A new kind of format hint is more confusing than helpful.
A new kind of format hint is more confusing than helpful.
Why ? How would the driver be able to quickly identify (that is without opening the dataset) that a GeoPackage file is actually meant to be used with the GTI driver ?
Using the "GTI:" prefix might work well for command line usages, but the aim is also that users can for example drag&drop such files in QGIS if needed, and thus it needs to look like a regular file, hence the need for a hint in the filename
@sgillies would GTI:tileindex.gpkg be more acceptable? I kind of agree that it could be misleading that a file could be opened with an unexpected driver based solely on its name (one could argue that the .gti part of .gti.gpkg is not the extension)
@rouault GeoPackage can already be vector or raster. How does QGIS currently determine which driver to use if you drag and drop? Is peeking in the file really not allowed? It's pretty normal behavior for applications like QGIS, yes?
Is "gti" in the filename a signal to QGIS? And then it uses that info to determine whether to flag GDAL_OF_RASTER or GDAL_OF_VECTOR to GDALOpenEx()? GDAL doesn't have a generic opener that dispatches to one or the other yet, does it? I thought you had to know the data model before you call.
Generally, I believe that more and more special cases cause a degradation of user experience. And increase the complication for projects like Rasterio, that's for sure. From my perspective it would be great if we could avoid new special cases.
How does QGIS currently determine which driver to use if you drag and drop?
It largely delegates to GDAL. One point I didn't investigate on the QGIS side is whether it would prefer the raster driver (GTI) or the vector driver (GPKG), but if using the QGIS browser, it might propose both. So maybe a bit of tuning is needed on the QGIS side to make it prefer the raster, unless the user explicitly does "open vector layer"
Is peeking in the file really not allowed?
generally we avoid doing complex queries as much as possible for the GDALDriver::Identify() method. We might look at a signature in the first bytes when there's one, but here that's not possible. For a SQLite DB, that would require sqlite3_open() (which must be avoided as much as possible because of SQLite locking issues) and some non trivial queries
Is "gti" in the filename a signal to QGIS?
no, it is a signal for the OGR GPKG driver not to recognize such file as a raster when GDALOpen() in raster mode is attempted on that file.
Cf the following snippet in the Identify() method of the GPKG driver:
if ((poOpenInfo->nOpenFlags & GDAL_OF_RASTER) != 0 &&
    ENDS_WITH_CI(poOpenInfo->pszFilename, ".gti.gpkg"))
{
    // Handled by GTI driver
    return FALSE;
}
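To make the logic of that check easier to follow, here is a minimal Python sketch of the same decision. It is not GDAL code: `gpkg_declines` is a hypothetical helper name, and the flag constants are copied from gdal.h so the example runs without GDAL installed. It only mirrors the quoted snippet, a case-insensitive test on the filename ending combined with the raster open flag, with no peeking into the file.

```python
# Open flag values as defined in gdal.h.
GDAL_OF_RASTER = 0x02
GDAL_OF_VECTOR = 0x04


def gpkg_declines(filename, open_flags):
    """Return True when the GPKG driver should leave the file to the GTI driver.

    Mirrors the quoted Identify() check: raster mode was requested and the
    filename ends with ".gti.gpkg", compared case-insensitively.
    """
    raster_requested = (open_flags & GDAL_OF_RASTER) != 0
    return raster_requested and filename.lower().endswith(".gti.gpkg")


# Raster open of a hinted file: GPKG steps aside.
declined_raster = gpkg_declines("tileindex.GTI.gpkg", GDAL_OF_RASTER)
# Vector open of the same file: GPKG still handles it as a regular GeoPackage.
declined_vector = gpkg_declines("tileindex.gti.gpkg", GDAL_OF_VECTOR)
```

The point of the design is that the decision costs a string comparison only, which is why the hint lives in the filename.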
And increase the complication for projects like Rasterio, that's for sure.
Rasterio shouldn't have to care about that at all. The end user provides some opaque string "foo.gti.gpkg" to the Rasterio open() method, which just delegates that to GDALOpen() without having to know more about it. Or am I missing something?
I've added an extra commit so that the regular GPKG driver is able to open a GeoPackage tiled raster even if its filename ends with .gti.gpkg, so there should not be any functional regression due to the introduction of the GTI driver
New installed file: data/gdalvrtti.xsd
…t, -nodata, -colorinterp, -mask, -mo options
…ixel_size, -fetch_md
…:VRTTI metadata domain of a vector layer
Closes #8861
Cf https://github.com/rouault/gdal/blob/vrttileindex/doc/source/drivers/raster/vrtti.rst for the whole description of the new driver. Summary is :
The VRTTI driver makes it possible to handle catalogs with a large number of raster files (called "tiles" in the rest of this document, even if a regular tiling is not required by the driver), and to build a virtual mosaic from them. Each tile may be in any GDAL supported raster format, and be a file stored on a regular filesystem, or any GDAL supported virtual filesystem (for raster drivers that support such files).
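As a purely conceptual illustration of why a catalog of footprints helps with a large number of tiles, the sketch below picks the tiles whose bounding boxes overlap a requested window. It does not reflect the driver's actual implementation: `intersects`, `tiles_for_window` and the tile names are hypothetical, and a real index would rely on the vector layer's spatial filtering.

```python
def intersects(a, b):
    """True when two (xmin, ymin, xmax, ymax) boxes overlap with nonzero area."""
    return a[0] < b[2] and a[2] > b[0] and a[1] < b[3] and a[3] > b[1]


def tiles_for_window(index, window):
    """Return the locations of the tiles whose footprint overlaps the window."""
    return [location for location, bbox in index if intersects(bbox, window)]


# Hypothetical index: (location, bounding box) pairs, not a regular tiling requirement.
index = [
    ("a.tif", (0, 0, 1, 1)),
    ("b.tif", (1, 0, 2, 1)),
    ("c.tif", (0, 1, 1, 2)),
]
selected = tiles_for_window(index, (0.5, 0.25, 1.5, 0.75))
```

Only the selected tiles would need to be opened to serve the request, whatever the total size of the catalog.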
This driver offers similar functionality as the VRT driver with the following main differences:
gdaltindex is made available as a C callable function, GDALTileIndex(), and from Python as gdal.TileIndex()
Cf https://github.com/rouault/gdal/blob/vrttileindex/doc/source/programs/gdaltindex.rst for the gdaltindex enhancements. New options are -overwrite, -vrtti_filename, -tr, -te, -ot, -bandcount, -nodata, -colorinterp, -mask, -mo, -recursive, -filename_filter, -min_pixel_size, -max_pixel_size, -fetch_md
gdaladdo: make --partial-refresh-from-source-timestamp work on VRTTI datasets