-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Usability of this package and the JuliaGeo ecosystem #1
Comments
Then @yeesian: |
@yeesian: julia> using ArchGDAL
ArchGDAL
julia> dataset = ArchGDAL.read("temperature/bio1.bil")
GDAL Dataset (Driver: MEM/In Memory Raster)
File(s):
Dataset (width x height): 2160 x 900 (pixels)
Number of raster bands: 1
[GA_Update] Band 1 (Undefined): 2160 x 900 (Int16)
julia> ArchGDAL.read(dataset, 1) # reads the first (and only band)
2160×900 Array{Int16,2}:
-9999 -9999 -9999 -9999 -9999 -9999 … -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 … -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 … -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
⋮ ⋮ ⋱ ⋮
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 … -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 … -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 so the main thing it saves you is not having to write the driver code for parsing the file format using ArchGDAL
raster = ArchGDAL.read("temperature/bio1.bil") do dataset
ArchGDAL.read(dataset, 1)
end but I don’t think it’ll be as fast as mmap in VerySimpleRasters.jl/src/grd_file.jl Lines 66 to 82 in 3fc20f4
|
Me: |
@visr : It's the reason we need GDAL as an abstraction layer if we want to support so many formats |
Great! And yes agreed, I’d be perfectly content to support the mmappable ones natively and provide a conversion for the others (via GDAL most likely) |
GDAL allows you to bring-your-own format-specific drivers, but we (JuliaGeo and/or OSGeo) are not at the point where we can allow you to easily swap in a native julian implementation of a driver into GDAL. Nonetheless, if you're interested, here is their tutorial. Their code is X/MIT-licensed, so you can see how they currently do it. E.g. here's Is it really so slow that you need to be implementing your own driver though? You can always file an issue upstream in the GDAL repository if so: it might benefit GDAL users in R/Python/Rust/etc too. |
No it's not that it's slow - it's that I want to avoid the clunky interface that GDAL gives you. Julia's AbstractMatrix stack is so strong - I think much can be said for having a direct julia representation of rasters, with the option of calling GDAL on them once you need something advanced. I would never in my practical work need all the capabilities of GDAL; in fact not any, other than opening different raster formats and possibly reprojection, so for me it's overkill to have to |
Just to be clear: that's just a limitation of an existing julia wrapper, which I'm actively working to address in yeesian/ArchGDAL.jl#76, and not a fundamental limitation of working with GDAL though. You can still easily read in the raster data (see the examples in #1 (comment)) and have access to all the other metadata and functionality for free via GDAL. |
The only reason to be parsing the grd file yourself, like in VerySimpleRasters.jl/src/grd_file.jl Lines 66 to 82 in 3fc20f4
|
Nice. So just to understand, do you not think there isn't scope for the pure-julia mmap approach once you're further with ArchGDAL? By the way,
another reason for doing this is then this package does not need a dependency on GDAL. The GDAL integration could live somewhere else. |
I thought you wrote that GDAL always loads it into memory? The thing is I want a handle on the raster contents, on-disk, that I can access directly (e.g. via getindex) - I don't care about which driver. |
That's just my own implementation of it in https://github.com/yeesian/ArchGDAL.jl/blob/d5ee1fd0c7b6b3932178d1696ac3a6284fab48dd/src/dataset.jl#L259-L271. GDAL doesn't load data into memory by default. But you might run into nasty memory bugs later, e.g. like in JuliaGeo/LibGEOS.jl#58. |
So just to understand what you suggest exactly - scope for separate packages? Integration? Abandon this as soon ArchGDAL is finished with developments in the process of being merged? |
There is currently GeoJSON.jl and Shapefile.jl which natively reads into geojson and shapefile formats too. They are vector data formats, so we borrowed an "informal" interface in python for https://github.com/JuliaGeo/GeoInterface.jl for some degree of unification across those packages. There is a discussion (which you're involved in) on a similar kind of interface for rasters. JuliaGeo has NetCDF.jl too, so it could help e.g. unify plotting for rasters across NetCDF and "R Raster and EHdr" (which this package seems to be oriented around. See https://www.gdal.org/frmt_various.html and the references therein for details.) and Arch/GDAL. But I don't work enough with rasters, or know enough about them to have an informed opinion about the direction it (the raster interface) might take. |
I found this while thinking about writing something similar myself. I agree entirely that better raster handling is pretty fundamental to doing biogeography in Julia! I feel that we need a
I've read some of the other threads talking about something like this, is there a plan for something similar already in progress, or should I just go ahead and implement something? |
I'd happily depend on and adhere to an Abstract interface. I think it's just a question of doing it. In addition to @yeesian it'd probably be useful to speak to @evetion who has GeoRasters. An alternative (possibly better) would be to put the abstract interface here, with the concrete implementation here as a fallback for the lightest case? This package has essentially 0 dependencies (well except RecipesBase but that's essentially a stdlib), to make it possible for more advanced implementations to depend on it at 0 cost. Happy to change this package to accomodate. btw there's also some discussion here JuliaGeo/GeoInterface.jl#16 as any abstract interface should be compatible with the geointerface imho. |
Yeah I was also thinking the interface should include a concrete component for the most basic case. I'll read through those threads too. Unfortunately I'm not that experienced working with rasters and spatial data, I've mostly written models and frameworks that handle matrices, and let others do the actual raster manipulation in R! But that's a short term solution that needs attention soon. I'll put a small base package together in a few weeks (after thesis hand in!) but feedback will be really useful. Isn't there a HDF5 and Mmap dep in this package? I was thinking it should be much more lightweight than that. |
Mmap is part of the Base Julia distribution, so not a dep, just like Dates. HDF5 isn't used - it's just in the project.toml because I was experimenting with it. |
Sure! if you're open to that I can work here. I noticed you mentioned the problem of time as well in the other thread. It's a common problem for me too. I also have to deal with vertical dimension of layered environmental data. So it would be good to define common ways of accessing additional temporal and spatial dimensions. I was going to write a MicroclimateBase package specifically for this, but it would be good if that could depend on a more general base package and just add methods for the specific environmental variables instead of the whole framework for data access. |
Great, you should have push rights now. And yes goot point on the time dimension. |
Hi Rafael,
Most of GeoStats.jl is about generalizing spatial models to work on any
spatial domain type. We currently have a great amount of abstraction and
plot recipes for plotting solutions over domains, spatial data, etc.
At some point I will look again into the JuliaGeo packages to see if some
integration is missing. The ClimateTools.jl package is already trying some
integration with GeoStats.jl to let users run models on NetCDF seamlessly.
Let me know what kinds of models you have in mind and I can tell you if it
fits the current implementation.
…On Mon, May 27, 2019, 04:04 Rafael Schouten ***@***.***> wrote:
Sure! if you're open to that I can work here.
I noticed you mentioned the problem of time as well in the other thread.
It's a common problem for me too. I also have to deal with vertical
dimension of layered environmental data. So it would be good to define
common ways of accessing additional temporal and spatial dimensions. I was
going to write a MicroclimateBase package specifically for this, but it
would be good if that could depend on a more general base package and just
add methods for the specific environmental variables instead of the whole
framework for data access.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1?email_source=notifications&email_token=AAZQW3J4WW45YHH2VP45DLTPXOBY5A5CNFSM4GS55GS2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWI65QQ#issuecomment-496103106>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAZQW3MS27ZEJQTRLIJDOIDPXOBY5ANCNFSM4GS55GSQ>
.
|
Great! NetCDF integration is something I need too. The packages I'm developing that are specifically related are Dispersal.jl and Cellular.jl for general Cellular automata style models, and dispersal modelling with heterogeneous spatial data and various dispersal and growth processes. GrowthRates.jl provides simple growth rate grids for dispersal modelling from raster climate data (from HDF5), and Microclimate.jl for providing common microclimate data for more complex species distribution modelling, from NetCDF files. They are all in various stages of development, and all need more generalised interfaces, and further abstraction of the NetCDF and HDF5 data sources. Some kind of Honesty, I don't really understand what GeoStats and ClimateTools do! there's a bit of a mismatch in the applications and language we use in ecology/biogeography. I see ClimateTools.jl deals with 3d and 4d data in which something I specifically need for working with microclimates and I'm doing in ad-hoc ways currently. The ClimGrid type also seems to have a lot of what we need for biogeography, but also a lot more that we don't need! But inheriting from some shared abstract type would be good. |
@rafaqz if you're working on Species Distribution Models you may also want to check out @tpoisot 's work here: https://github.com/EcoJulia/SimpleSDMLayers.jl (just started last week) and here: https://gitlab.com/tpoisot/BioClim |
After looking at your repos, in general you might actually be interested in the EcoJulia organisation - the mission is to create a framework for various aspects of community ecology and geographical ecology in Julia. |
I've been thinking about EcoJulia for a while! Was waiting until I had a few things to contribute :) I mostly develop mechanistic SDMs, dispersal models and biophysical modelling tools (also see my other works in progress Photosynthesis.jl and DynamicEnergyBudgets.jl) But this means my data requirements are often a lot more fine grained spatially and temporally than worldclim style data ins SDMLayers. But both of those packages seem like they would good candidates for incorporation into a more abstract approach to raster data? |
The I did put the |
Ok great, that would probably be a good type to inherit from then, although I'll have to get my head around your package more. (I like to add units wherever load NetCDF with known units to memory, so that everything is automatically correct after that) |
IMHO the priority should be to be compatible with / inherit from the JuliaGEO types. Your types are just compatible among your own packages @juliohm , right? |
One issue for me is the "Stats" part of GeoStatsBase! I rarely do statistical modelling, and tools for stats do complicate the package considerably in terms of contributing to it, and orienting the ecosystem of (non-statistical) tools I would be building around it. I was imagining something more like |
Ok I still don't have my head around how JuliaGEO and all these packages relate! Should we be aiming to dproduce some common basis for all (or as many as possible) of these use cases? |
I agree on leaving Missing up to implementations, sometimes I specifically disallow missing in modelling applications as it can be much slower. (Edit: and you specifically can't use missing in dispersal simulations unless you want to output whole grid of missings. But I still want spatial metadata to pass through to the output array...) |
Yeah sounds good, and also essentially many of the functions I define here are part of a minimal interface I'd think, eg. |
It might be good to add some kind of abstract type hierarchy to deal with additional spatial and temporal dimensions, clarifying what N (in ClimateTools.jl deals with 3 or 4 dimensions like this, so N is 3 or 4. Time is always the last dimension, with the first two dimensions being long lat, and the third dimension is elevation for 4d: if ndims(data) == 3
# Convert data to AxisArray
dataOut = AxisArray(data, Axis{Symbol(lonname)}(lon_raw), Axis{Symbol(latname)}(lat_raw), Axis{:time}(timeV))
elseif ndims(data) == 4 # this imply a 3D field (height component)
# Get level vector
plev = ds["plev"][:]#NetCDF.ncread(file, "plev")
# Convert data to AxisArray
dataOut = AxisArray(data, Axis{Symbol(lonname)}(lon_raw), Axis{Symbol(latname)}(lat_raw), Axis{:plev}(plev), Axis{:time}(timeV))
else
throw(error("load takes only 3D and 4D variables for the moment"))
end
``` |
@juliohm just to be explicit I'm 100% in favour of integration across the julia ecosystem, and I think the ecosystem you're singlehandedly building is impressive. I'd just prefer this happened through shared compatibility with the JuliaGeo org. Does this make sense to you? |
Hi Michael, I fully agree. What about we set an online meeting with
everyone to go over some of the base packages containing spatial types to
see what are the most productive paths of development? I'm currently doing
my research single-handed basically because I don't understand what is out
there. Maybe it is an issue of documentation? I should also say that I'm
not coming from a geography background but from a geostats background and
that makes things a little bit more challenging for me to grasp.
I'll try to go over the JuliaGeo packages once again to find out the
overlaps. It would be really interesting if we could gather people online
to present slides or notebooks illustrating the concepts. Or maybe a
general presentation containing all the infrastructure planned for the
future.
…On Mon, May 27, 2019, 12:27 Michael Krabbe Borregaard < ***@***.***> wrote:
@juliohm <https://github.com/juliohm> just to be explicit I'm 100% in
favour of integration across the julia ecosystem, and I think the ecosystem
you're singlehandedly building is impressive. I'd just prefer this happened
through shared compatibility with the JuliaGeo org. Does this make sense to
you?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1?email_source=notifications&email_token=AAZQW3OGJIJKOMFE2SAX4FTPXP4UNA5CNFSM4GS55GS2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWKCCMY#issuecomment-496247091>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAZQW3IENISCRGTJAE2CMLLPXP4UNANCNFSM4GS55GSQ>
.
|
Would love to attend the meeting - as @mkborregaard said, we are working on |
@juliohm I'm keen to attend such a meeting. It would be good to delay for 2 weeks so that there is time to learn enough about everyones packages, and write a simple API package to have as a discussion point? (I also need to finish my thesis this week) @mkborregaard The number of use cases here makes me think we need a very simple package without an implementation, just for clarity. So starting a fresh package might be better. |
Sure, if you think so, we can always add stuff in from existing packages later |
Heres a prototype base package for discussion: https://github.com/rafaqz/GeoArrayBase.jl/blob/master/src/GeoArrayBase.jl It might need to be called GeoBase as I incorporated non-array geo data types as in GeoStatsBase etc that should use the same methods. As well as basic methods it has some definitions of array dimensions, crs projection formats and calendars that don't seem to be standardised anywhere. They probably belong in other Base packages. |
@rafaqz, the Thanks for putting together a GeoArrayBase.jl proposal. I believe that we also need to concentrate on other spatial configurations in this GeoBase.jl package we are discussing. For example, structured grids, point collections, images, etc. These types are kind of what I have in GeoStatsBase.jl + GeoStatsDevTools.jl, and they serve my current needs. However, they lack the concept of projections, for example. Also, I did not have a chance to include the time dimension in any of those. I don't know if space and time should be mixed actually. I did a major rewrite of the interface in GeoStatsBase.jl to facilitate some code reuse (e.g. plot recipes) and other planned features. In the spatialobject.jl file you can find a general interface for anything spatial (that fits in the concept of point-based measurements). From this abstract type, we can define at least two sub-types, which are still abstract: spatial domain, and spatial data. The domain type is pretty much the same as a general spatial object type. It only materializes the idea of a collection of points with coordinates. It is basically a trick to use the spatial object interface, and still have a new type on which to dispatch. The spatial data type is a domain plus some data. For example, for each point of the underlying domain, we can save properties. Most implementations in GeoStatsDevTools.jl save these properties in a dictionary mapping symbols to flat vectors of values. Note, however, that because we inherit from spatial object, and because we have domain types already implemented, all we need to do is wrap the domain in a new type containing the domain and an extra dictionary. Everything else we get for free regarding spatial queries. This interface is working nicely so far within GeoStats.jl, and I'be happy to go over it in a call if you guys feel it is appropriate. My hope is that at some point these domain types that I am writing can be reviewed and extended to become a separate GeoBase.jl or GeoInterface.jl package. As a final comment, notice that all these types that I wrote above are point-based. I don't have much experience with polygon-based or area-based spatial objects. This GeoInterface.jl package that we are discussing could encompass both classes of spatial data. |
I have an update on this. I ended up putting together DimensionalData.jl as an abstract xarray-like indexing system that can be used to manipulate and plot tagged dimensional data, that suits this problem better than AxisArrays. GeoData.jl inherits from that, but adds spatial metadata, and hopefully integrates with spatial data as you are describing @juliohm, Maybe we could start an issue there to discuss this further if this direction seems useful to you? I've looked at AbstractSpatialObject etc but there is still a lot I don't understand the reasoning for - its a big package. There are some examples, one of which is this (g is a 3d GeoArray): # plot the mean sea surface temperature around Australia from August to December 2002
dimranges = Time((DateTime360Day(2002, 08, 1), DateTime360Day(2002, 012, 30))), Lat(-45:0.5), Lon(110:160)
select(g, dimranges) |> x->mean(x; dims=Time) |> plot Which plots ~correctly with axis labels and metadata even after @mkborregaard I'm not sure how much this covers what you wanted with VerySimpleRasters. The main missing piece is to integrate with NCDatasets, GDAL (or wrapper packages) for specific load/save/conversion. Which will probably require a lot more changes to GeoDims as well... There are lots of other small pieces unfinished as well, and feedback would be very useful. @evetion Although this doesn't look like it will work with affine transforms, it will... But they are optional as a lot of the time we don't need them. |
Thank you again @rafaqz for your efforts! I am still trying to understand to what extent the types proposed can be reused directly within GeoStats.jl. Could you please confirm that the types support non-regularly sampled points? For example, something that looks like this or simple point sets without array-like layouts. Besides this non-regular types, I need to extend the existing types in GeoStatsBase.jl to introduce the notion of volume element. Some geostatistical methods can handle spatial measurements over different support. For example, it is common to have measurements of variable A in one resolution, and measurements of another variable B in a different resolution. In order to handle this scale difference explicitly in statistical models, we need to encode the volume of the measurement. I am starting to think that we don't need types, but a comprehensive traits interface. It will be hard to conciliate all the proposed packages since everyone in JuliaGeo has different use cases. |
Hi I'm really sorry for not engaging more with this. I'm currently in Ecuador and will be travelling various places until mid-September - at which time I will absolutely love to engage fully with this again. |
@rafaqz awesome that you're exploring this space. Haven't had time to look into everything, but just briefly checked out DimensionalData. Have you considered using/working with the folks at https://github.com/invenia/NamedDims.jl and https://github.com/invenia/IndexedDims.jl? They have similar goals, and aim to be the next AxisArray (and are inspired by xarray). |
@juliohm non-regular grids are doable. Dims could hold the matrices and handle dispatch so it would work on regular data types:
There would just need a few methods that dispatch on But the dims with indexible integer lat/long also isn't the usual case... Maybe a wrapper type on the matrix could make thar clearer, Otherwise the main change this approach would mean for GeoStatsBase is either adding/loosening a The other thing is a I'm not sure if you will see these changes as worthwhile! but separating out dims seems pretty powerful for interop to me - number and order of dimensions, grid spacing, coordinates and type (like DateTime etc), bounds, units, affine transforms, and possibly other metadata are all available from one method/field without changing anything else. You can also just add a @mkborregaard if I was travelling in Ecuador I wouldn't be engaging either :) But try the example when you get back the plotting is pretty smooth. |
@visr thanks for pointing that out. I hadn't seen IndexedDims yet, but NamedDims has some structural issues, similar to AxisArrays and the reason I didn't use them. They are locked into using the concrete type NamedDimsArray. Using a wrapper means we cant dispatch on the wrapped type for other purposes, which limits ecosystem potential. I'm also not sure how I would use them with arbitrary indexing ranges, affine maps or convolution matrices described above. In contrast any array type can use DiminsionalData,jl indexing by adding only one method - AbstractDimension is the main component used for dispatch, rather than the array type. There are only 50 line dedicated to AbstractDimensionalArray, mostly basic array interface, and making similar() etc maintain the array type. And this is the actual code for the concrete array type that will update its dims through reduce functions and use plot recipes: struct DimensionalArray{T,N,D,R,A<:AbstractArray{T,N}} <: AbstractDimensionalArray{T,N,D}
data::A
dims::D
refdims::R
end
DimensionalArray(a::AbstractArray{T,N}, dims; refdims=()) where {T,N} =
DimensionalArray(a, checkdims(a, dims), refdims)
# Array interface (AbstractDimensionalArray takes care of everything else)
Base.parent(a::DimensionalArray) = a.data
# DimensionalArray interface
rebuild(a::DimensionalArray, data, dims, refdims) = DimensionalArray(data, dims, refdims)
dims(a::DimensionalArray) = a.dims
refdims(a::DimensionalArray) = a.refdims So adding it to other packages is pretty easy. |
Maybe it's good to bring this up with the devs, you have some good ideas and clear usecases that they are looking for. Just trying to save you maintenance work in the long end ;) It seems that most of your code is not geo specific anyway, which is a good sign. If I understand correctly, IndexedDims.jl wants to allow different kind of indexes by using AcceleratedArrays (great package). I guess making an AffineIndex using that package should be relatively straightforward. |
I'll let them know what I've been working on for sure. But IndexDims is all concrete again while DimensionalData is an abstraction designed to be bolted on to things... So I can write an easy pull request to NCDatasets (etc) to add dimensional indexing, data manipulation and plotting without getting in the way. It will be less work because the abstraction fits better. |
This works for NCDatasets with a very small pull reqest. Just need to make it save back to netcdf and write some tests (and probably handle a lot of other cases I haven't considered yet) It should be easy to do a similar thing for other packages. |
Cross-posting from the Juliageo/meta thread. As there is some consensus that we should have a teleconference about the future of Geo Raster types I started a gdocs document and would invite everyone who would be interested in joining such a teleconf to add their availability to the list so that we can try to fix a date and time to discuss the future common interface for Geo Raster data in Julia. https://docs.google.com/document/d/1ccaSltPDb5n-bLNgWyrotsWXz2n7ckMggOIPejy_GvQ/edit?usp=sharing |
It might be useful to add here that the dimensions are standardized by the Climate Forecasts conventions. Thus, having the time dimension as the trailing dimension is set in this convention (FORTRAN in this case... the C dimensions ordering implementation the reverse http://www.bic.mni.mcgill.ca/users/sean/Docs/netcdf/guide.txn_22.html). Now, the interesting part that I haven't looked is that NetCDF.jl seems to be a wrapper around the C library but it returns the FORTRAN ordering. Maybe because Julia memory mapping is akind to FORTRAN? Anyway, just wanted to point that in climate and forecasting domain, there is some conventions, so it is easier to develop (read "harcode") some decisions. |
Yeah that's right, see also JuliaIO/Zarr.jl#1 (comment) and yeesian/ArchGDAL.jl#37 for some more discussion about this topic. Nice link about the Fortran dimension ordering, that's what we are implicitly doing. |
But the conventions break anyway once you slice the array. Making it arbitrary is better as you get to keep dim labels after slicing. GeoData.jl allready does all this seemlessly. |
@rafaqz do you see scope for folding more of this package into GeoData? |
Grd load/save works although slightly differently, and I have half-finished ascii. Plotting works well, also copied from here originally. I just wrote an Mostly what is missing is the coordinate vector methods like Edit: but yes that is the plan |
That was my suspicion. I do admit that back then, I spent quite a lot of time trying to understand why I was getting F ordering when NetCDF was calling C functions! 😄 |
Ported from Slack. Me:
Over the last week or two I’ve been working on very simple raster implementation in pure julia. The reason for doing that is that when I had a workshop earlier this month, with 30 very interested biogeographers at a conference, I asked them what it would take for them to move to Julia. And they almost all said “convenient raster support”. And - it is my impression that that support is not quite there yet in Julia. Of course you can do amazing things with ArchGDAL, but in truth almost all of my work with rasters involve getting data out of them and using that for analyses, and some plotting.
So, I designed an analysis in R: https://www.dropbox.com/s/n8790r0s63d56hm/R%2Bworked%2Braster%2Bexample.html?dl=0 and set out to create a package that could replicate it in Julia. The result is https://github.com/mkborregaard/VerySimpleRasters.jl and the replicated analysis is here: https://www.dropbox.com/s/d5mc52corz5hm7s/VerySimpleRasters%2Bexample.html?dl=0
All operations happen on disk, out of memory, and the script is very fast even for the relatively big raster in the example (except the predictable waiting time on plots).
There’s still lots to do - so far it uses R’s native raster format as it’s own native format, and supports importing ESRI Ascii grids to that format. Other formats could possibly be provided by GDAL in some way like the old RasterIO package?
I’d really love inputs - how to make this consistent with the Geointerface, is it worth continuing in this direction or have I overlooked some obvious existing capabilities etc.
For anyone wanting to play with the package, notebooks and data, I’ve collected everything in a nice folder here: https://www.dropbox.com/sh/4shiomz5gj4ggx5/AAASQv_K5p2OadhwL-CdRuEja?dl=0
cc @yeesian @visr @c42f
The text was updated successfully, but these errors were encountered: