Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use shx for indexing into file if available #52

Merged
merged 4 commits into from
Apr 23, 2021
Merged

use shx for indexing into file if available #52

merged 4 commits into from
Apr 23, 2021

Conversation

meggart
Copy link
Member

@meggart meggart commented Apr 23, 2021

Currently, .shx files are ignored when opening a Shapefile and it is assumed that all information is stored in the .shp ion a consecutive way. This led to an error at least in one file in the wild (#51), which would be fixed with this PR.

@visr
Copy link
Member

visr commented Apr 23, 2021

Aha, great find! I noticed that the .shp file from #51 was a bit larger than version that went through GDAL. So there is just some junk in between there somewhere? (Edit, re-reading #51 (comment) I see the answer is yes.)

This is not in https://www.esri.com/content/dam/esrisites/sitecore-archive/Files/Pdfs/library/whitepapers/pdfs/shapefile.pdf, but I guess the common parsing practice is that the .shx offsets overrule the .shp offsets? Does the example file from #51 fail this test:

# Test all .shx files; the values in .shx must match the .shp offsets

It is a good sign that this passes on the existing tests, which have .shx files as well, but it would be good to add unit tests at the level of the functions that you add as well.

@meggart
Copy link
Member Author

meggart commented Apr 23, 2021

Yes, if you think this should be included, even if it is not 100% spec then I will add tests for the added functions. I was hoping to see some coverage results, but this does not seem to be activated for this repo?

@visr
Copy link
Member

visr commented Apr 23, 2021

Yeah I think it would be good to add this. Hard to say how common this is though.

Regarding coverage, on Codecov (https://app.codecov.io/gh/JuliaGeo/Shapefile.jl) it says: "No repository activation required. Simply upload a report and the project activates automatically."

It seems we need to add a few lines to CI.yml, see GDAL vs Shapefile:

- uses: julia-actions/julia-processcoverage@v1

https://github.com/JuliaGeo/GDAL.jl/blob/d44dc737ec16a60d1edb81bb71e0cb5f0666f165/.github/workflows/CI.yml#L43-L46

@meggart
Copy link
Member Author

meggart commented Apr 23, 2021

I have just added these lines to CI.yml, let's see what happens, but it looks like we would need that change on master to see reports. An additional benefit of having this indexing working would be that it is very simple to lazily read features now. We could (in a later PR) add an option to the Handle constructor, so the users can choose if shapes are read all at once when opening the file or lazily when they are accessed.

@juliohm
Copy link
Member

juliohm commented Apr 23, 2021

That would be a tremendous feature to have! 💯

@visr visr requested review from visr and removed request for visr April 23, 2021 16:12
@visr
Copy link
Member

visr commented Apr 23, 2021

Looks good to me! Will tag when CI comes through.

@visr visr merged commit d40f6f0 into master Apr 23, 2021
@visr visr deleted the use_shx branch April 23, 2021 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants