FID filtering on formats like .shp is slow #8590
Comments
Improving that would require non-trivial effort, at least for the SetAttributeFilter() API, since it would mean adding specific behavior in all drivers (or at least the ones where it makes sense, that is, the ones that declare the OLCRandomRead capability).
It is in the context of implementing ArrowStream support... using the ArrowStream interface it is not possible to use GetFeature(fid), as far as I know.
…ter in generic GetNextArrowArray(), and use it for FlatGeoBuf one too (when it has a spatial index) (fixes OSGeo#8590)
Filtering on FID in an SQL statement is quite slow on larger files for binary formats like .shp.
For text-based files it is expected to be slow, because the entire file needs to be parsed to find the right FIDs.
For database-like file types like .gpkg it is fast (using "SQLite" dialect) as the fid is the primary key of the table.
For binary files like shapefile it is relatively slow, but the fid is essentially an offset in the file, so I imagine that in theory this could be fast?
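To make the "FID is essentially an offset" idea concrete, here is a minimal sketch of a constant-time lookup through a .shx-style index. It relies on the shapefile layout as I understand it: a 100-byte header followed by one 8-byte big-endian (offset, content length) pair per record, both counted in 16-bit words. The helper names `build_fake_shx` and `locate_record` are made up for illustration, the index is built in memory rather than read from a real file, and this is not how the GDAL driver is implemented. It also assumes 0-based FIDs.

```python
import io
import struct

SHX_HEADER_SIZE = 100  # fixed-size header at the start of a .shx index
SHX_RECORD_SIZE = 8    # one big-endian (offset, content length) pair per feature


def build_fake_shx(records):
    """Build an in-memory .shx-like index from (offset_words, length_words) pairs."""
    buf = io.BytesIO()
    buf.write(b"\x00" * SHX_HEADER_SIZE)  # header content is irrelevant for this sketch
    for offset_words, length_words in records:
        buf.write(struct.pack(">ii", offset_words, length_words))
    buf.seek(0)
    return buf


def locate_record(shx, fid):
    """Return (byte_offset, byte_length) in the .shp file for a 0-based FID.

    One seek and one 8-byte read, regardless of how many features exist.
    """
    if fid < 0:
        raise IndexError(f"FID {fid} is out of range")
    shx.seek(SHX_HEADER_SIZE + fid * SHX_RECORD_SIZE)
    raw = shx.read(SHX_RECORD_SIZE)
    if len(raw) != SHX_RECORD_SIZE:
        raise IndexError(f"FID {fid} is out of range")
    offset_words, length_words = struct.unpack(">ii", raw)
    # The index stores both values in 16-bit words, so convert to bytes.
    return offset_words * 2, length_words * 2


# Three hypothetical records; the first starts right after the 100-byte .shp header.
shx = build_fake_shx([(50, 10), (64, 10), (78, 24)])
print(locate_record(shx, 2))  # (156, 48)
```

A filter such as `FID = 2` that is evaluated by scanning every feature does this work for all records and discards all but one, which is where the gap with a direct lookup would come from.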
I used some files downloaded from here to test it, but any slightly larger files can obviously be used.
Test script:
Output: