-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide Interactive Datasets for REPL usage #76
Conversation
This PR triggered the need to think very carefully about the interactions between (i) "scoped" versus (ii) "floating" objects, so I just want to write that up first. I'll provide an update on the changelog later. MotivationThe problem with using do-blocks to manage context (see example with ArchGDAL.read(filename) do dataset1
# dataset1 exists within this scope
end
# we do not have access to dataset1 from here on In the example above, we do not get to see the state of dataset2 = ArchGDAL.read(filename)
# we have access to dataset2 from here on IssuesThe issue with having "floating" GDAL objects is that the ownership of GDAL objects is not properly reflected in the wrappers that we implement in this package (which only has a pointer/handle to the object itself, and does not reference any of the other objects it might have a relationship with). Therefore, if we are not careful, we might encounter "gotchas" such as the following example dataset2 = ArchGDAL.read(filename)
featurelayer2 = ArchGDAL.getlayer(dataset2, 0)
# ...
# dataset2 is no longer referenced by anything from here on
# Julia's GC triggers, and closes dataset2
# now featurelayer2 is left "dangling" and cause segfaults The issue arose because References and OwnershipThe use of do-block sets up a call stack that corresponds to the dependencies [2,3] between "scoped" objects (which is managed inside ArchGDAL.read(filename) do dataset1
# dataset1 exists within this scope
featurelayer1 = ArchGDAL.getlayer(dataset1, 0)
# dataset1 will continue to exist within this scope, so
# any object (e.g. featurelayer1) created within this scope
# will not be left dangling.
end For "floating" objects, we need to think about (a) the objects that depend on them, and (b) the objects that they depend on. For (a), we need to ensure that they are not destroyed while there are still objects that depends on them. For (b), we need to ensure that the objects they depend on are not destroyed within their scope. The analysis is complicated when there are interactions between "scoped" and "floating" objects: to ensure that "floating" objects are not destroyed by the GC prematurely, there should be at least one reference to them. On the other hand, to avoid memory leaks, we do not wish to remove (a) For objects that depend on "floating objects"This might happen when they are needed in the construction of an object, or an object has ones of its attributes being set to the floating object as a value.
(b) For objects that "floating objects" depend onRather than building up a web of dependencies, the "floating" objects maintain only the handle to their own GDAL object. Therefore, if we have methods that retrieves an internal reference, we make a clone/copy of it.
Some Counter MeasuresBy the nature of its approach, ArchGDAL.jl has already been overly permissive with unsafe behavior (see [3,4]), and the introduction of "floating" objects introduces an additional source of risk, so here are some counter-measures I took to manage them.
Footnotes[1]: A similar issue has been encountered in JuliaGeo/LibGEOS.jl#58. [2]: By a dependency, we mean whether or not the "lifetime"/validity/existence of the objects depend on each other. Whenever we have code that looks like [3]: The code in | | |_| | | | (_| | | Version 1.1.1 (2019-05-16)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
julia> using ArchGDAL
julia> geom = ArchGDAL.createpoint(1,2)
Geometry: POINT (1 2)
julia> ArchGDAL.importEPSG(4326) do spatialref
ArchGDAL.setspatialref!(geom, spatialref)
print(ArchGDAL.getspatialref(geom))
end
Spatial Reference System: +proj=longlat +datum=WGS84 +no_defs
julia> ArchGDAL.getspatialref(geom)
Segmentation fault: 11 [4]: I'd like to have an API that guarantees safety (through type-checks/etc). Nonetheless, I do not have enough time (within the scope of this project) to perform a comprehensive analysis, so I'm going to let practicality overrule for now. Moreover, there is a large amount of GDAL code out in the wild, and I'd like to keep it easy to "port code" from another repository if we have
|
@visr this is now ready for review |
What's needed to merge this? I'm glad to help if someone can point me in the right direction. |
The code here is good to merge. But it switches to the GDAL.jl 1.0 release which doesn't have a working GEOS linked on Linux. That's why I didn't merge it yet, because it would break geometry operations that work now. Right now there is a PR on Yggdrasil to add GDAL. I think we should finish that, use that build for GDAL.jl (which means support only julia 1.3+). If that is working, we can merge and tag this fairly quickly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great that this is finally coming to completion. We should probably take the changelog written in the first post and add that to the release notes, to make it easier for people to update to the new ArchGDAL.
Looks good to me |
test/test_array.jl
Outdated
count = sum(buffer[:, :, 1:1] .> 0) | ||
@test total / count ≈ 76.33891347095299 | ||
@test total / (AG.height(ds) * AG.width(ds)) ≈ 47.55674749653172 | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be in a different test set as it's not about setindex
so unsafe_getgeomfield is deprecated
so unsafe_getspatialref is deprecated
push! -> addfeature!, addgeom! pushfeature -> addfeature writefeature -> createfeature write! -> setfeature!
…mum/minimum(band)
cherry-picked from #79
* GDAL.C.GDALDummyProgress -> GDAL.gdaldummyprogress * OGRDatumType not supported for now * replace all occurrences of GDAL.C. with their new counterparts * update to new GDAL.jl API * update to new GDAL.jl API * OGRDatumType is removed in GDAL3.0 * Update .travis.yml * Update appveyor.yml * do not wrap GDAL pointers in an additional layer of Ptr * add return keywords * whitespacing * tighten type signatures for constructors * Ptr{GDAL.GDALDatasetH}(C_NULL) -> C_NULL
AppVeyor cancelled all other builds that I want to check
Co-authored-by: Rafael Schouten <[email protected]>
Co-authored-by: Rafael Schouten <[email protected]>
closes #21 #56 #66 #79
This package is currently not very conducive for interactive REPL usage. This PR
createlayer()
andcopy(featurelayer)
), independently of the choice of a dataset. For floating featurelayers, a corresponding dataset will be created in memory.CHANGELOG
new
IDataset
(short for "interactive datasets") which has a finalizer that closes the corresponding file when it goes out of scope. This is in contrast toDataset
which exists only within the scope of a do-block. They both fall under the type hierarchyDataset <: AbstractDataset
andIDataset <: AbstractDataset
.FeatureDefn, IFeatureDefnView <: AbstractFeatureDefn
GeomFieldDefn, IGeomFieldDefnView <: AbstractGeomFieldDefn
FieldDefn, IFieldDefnView <: AbstractGeomFieldDefn
IFeatureDefnView
,IFieldDefnView
andIGeomFieldDefnView
are introduced to model internal references to existing objects. They support only a restricted set of operations.unsafe_
methods will return objects that needs to be cleaned up, and are not meant to be used by users./vsimem/
instead of creating tmp folders and files, and update for new dataset APIsnfield(featurelayer)
andngeom(featurelayer)
different
Now
getspatialref(...)
,getgeom(...)
, andgetspatialfilter(...)
always makes a copy of the object.breaking
fromWKT(data)
andfromWKB(data)
no longer takes in aspatialref
argument.call signatures
createlayer(dataset, name)
->createlayer(name = name, dataset = dataset)
create(filename, driver)
->create(filename, driver = driver)
orcreate(driver, filename = filename)
colortable = getcolortable(band)
->getcolortable(band) do colortable
copywholeraster(source, dest, options)
->copywholeraster(source, dest, options = options)
transform!(coordtransform, xs, ys, zs)
->transform!(xs, ys, zs, coordtransform)
(for consistency withtransform!(geom, coordtransform)
)renamed
registerdrivers(...)
->environment(...)
. It is no longer required for file io, since we provide a driver manager that registers all the drivers during module initialization.setattr!(spatialref, path)
->setattrvalue!(spatialref, path)
(consistent withsetattrvalue!(spatialref, path, value)
)copylayer(dataset, layer, name)
->copy(layer, name = name, dataset = dataset)
(layer
is implied)createcopy(dataset, filename, driver)
->copy(dataset, filename = filename, driver = driver)
(create
is implied)The name
geomfield
sounded too long and confusing (is it a geom? or a field?), so I've decided to rename it to justgeom
:deletegeomfielddefn!(featuredefn, i)
->deletegeomdefn!(featuredefn, i)
gfd = creategeomfielddefn(...)
->creategeomdefn(...) do gfd
getgeomfielddefn(...)
->getgeomdefn(...)
getgeomfield(...)
->getgeom(...)
setgeomfield!(...)
->setgeom!(...)
ngeomfield(...)
->ngeom(...)
Standardize on
get/add|fielddefn/geomdefn/feature
:addgeomfielddefn!(featuredefn, geomdefn)
->addgeomdefn!(featuredefn, geomdefn)
creategeomfield!(layer, geomdefn)
->addgeomdefn!(layer, geomdefn)
createfield!(layer, fielddefn)
->addfielddefn!(layer, fielddefn)
createfeature!(layer, feature)
->addfeature!(layer, feature)
Some method names are too generic-sounding, so I've lengthened them to be clearer.
find(...)
->findstylestring(...)
Too many methods were being called
get<methodname>
without any unifying reason. I've decided to try and rename methods such that all remaining methods that start withget
suggests a method corresponding to a modifier (such asadd
,set
, etc).getblocksize(rasterband)
->blocksize(rasterband)
getdatatype(rasterband)
->pixeltype(rasterband)
getaccess(rasterband)
->accessflag(rasterband)
getnumber(rasterband)
->indexof(rasterband)
getmaskflags(rasterband)
->maskflags(rasterband)
getmaximum(rasterband)
->maximum(rasterband)
getminimum(rasterband)
->minimum(rasterband)
getfidcolname(layer)
->fidcolumnname(layer)
getgeomcolname(layer)
->geomcolumnname(layer)
getlayerdefn(layer)
->layerdefn(layer)
getcurvegeom(geom)
->curvegeom(geom)
getlineargeom(geom)
->lineargeom(geom)
getdim(geom)
->geomdim(geom)
getgeomname(geom)
->geomname(geom)
getpaletteinterp(colortable)
->paletteinterp(colortable)
getsampleoverview(...)
->sampleoverview(...)
getrgba(...)
->toRGBA(...)
getextent(...)
->envelope(...)
getenvelope(geom)
->envelope(geom)
getenvelope3d(geom)
->envelope3d(geom)
getcolumnname(...)
->columnname(...)
getcolumnusage(...)
->columnusage(...)
getcolumntype(...)
->columntype(...)
Some of the index methods are now prefixed with
find
to be suggestive of what is actually happening.getrowindex(rat, pxvalue)
->findrowindex(rat, pxvalue)
getcolumnindex(...)
->findcolumnindex(...)
getfieldindex(...)
->findfieldindex(...)
getgeomfieldindex(...)
->findgeomindex(...)
At the end of it all, we have
which feels a lot more reasonable.
dropped
getfeaturesread(layer)
: not all drivers seem to update this count properly, and advanced users can callGDAL.getfeaturesread(layer.ptr)
starttransaction(layer)
: advanced users can callGDAL.starttransaction(layer.ptr)
rollbacktransaction(layer)
: advanced users can callGDAL.rollbacktransaction(layer.ptr)
committransaction(layer)
: advanced users can callGDAL.committransaction(layer.ptr)
synctodisk!(layer)
: advanced users can callGDAL.synctodisk(layer.ptr)
flushcache!(rasterband)
: advanced users can callGDAL.flushrastercache(rasterband.ptr)
setstyletabledirectly!(feature, styletable)
: usesetstyletable!(feature, styletable)
instead. advanced users can callGDAL.setstyletabledirectly(feature.ptr, styletable.ptr)
.setgeomdirectly!(feature[, i], geom)
: usesetgeom!(feature[, i], geom)
instead. advanced users can callGDAL.setgeometrydirectly(feature.ptr[, i], geom.ptr)
.addgeomdirectly!(geom1, geom1)
: usepush!(geom1, geom2)
instead. advanced users can callGDAL.addgeometrydirectly(geom1.ptr, geom2.ptr)
.setspatialref!(geom, spatialref)
andtransform!(geom, spatialref)
: rather than doingsetspatialref!(geom, spatialref1); transform!(geom, spatialref2)
, docreatecoordtrans(spatialref1, spatialref2) do coordtransform; transform!(geom, coordtransform) end
instead. advanced users can callGDAL.assignspatialreference(geom.ptr, spatialref.ptr)
andGDAL.transformto(geom.ptr, spatialref.ptr)
instead.