---
output:
  md_document:
    variant: markdown_github
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```
# ncapi
The goal of ncapi is to provide a foundation interface to the NetCDF API that transcends the overlapping and patchily supported independent packages
`ncdf4`, `RNetCDF`, `rhdf5`, and `rgdal`.
The need for this is discussed here:
http://rpubs.com/cyclemumner/293536
The crux is cross-platform support for classic NetCDF, NetCDF-4 and HDF5-compatible files, Thredds servers, and sources
with groups and compound types.
The first usable version of this package would replace the use of `RNetCDF` in https://github.com/hypertidy/ncmeta. A key need is to have
consistent cross-platform support for NetCDF-4 and NetCDF-3, Thredds servers, compression options and MPI options.
# Interested?
Great!!
I have no idea what I'm doing, so any assistance, ideas, testing, or encouragement is welcome.
In particular I need help with:
* basic C++, dealing with numbers and characters passed with the NetCDF API
* higher level design and C++ implementations to reduce the maintenance footprint
* specific workflows and sources to test
* cross-platform support, ensuring this builds on Unix, macOS, with rwinlib, and with standard downloads from Unidata for DIY semi-hackers
rOpenSci show how to build RNetCDF via rwinlib, so NetCDF post 1.9.0 looks
in good shape to hit CRAN and require being built against version 4. Ideally,
we would have CRAN prepared to do the same for rgdal, and even to include
OpenDAP, but it seems unlikely there's any communications channel for doing so. I have no idea if a successor to Win/Mac binary support is planned for CRAN, but it's a really obvious hole in the support for the general R community.
I thought I had figured out using `nc_get_vara_type` and `nc_get_var_type`, but I hit problems via OpenDAP when getting more than ~1000x1000 floating point values, which RNetCDF has no problem with. I might just be building the count array
incorrectly.
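
One thing worth ruling out when the count array is suspect is dimension order and index base. R-facing packages such as `RNetCDF` and `ncdf4` take 1-based starts with the fastest-varying dimension first, while the NetCDF C API takes 0-based starts with the fastest-varying dimension last. The sketch below is a hypothetical helper, not part of ncapi, and the dimension lengths are made up for illustration. It only shows the conversion and a bounds check, and does not claim to be the cause of the OpenDAP problem.

```r
# Hypothetical helper (not part of ncapi): convert an R-style hyperslab
# request into the start/count vectors expected by the NetCDF C API.
c_hyperslab <- function(dim_len,
                        start = rep(1L, length(dim_len)),
                        count = dim_len - start + 1L) {
  stopifnot(length(start) == length(dim_len),
            length(count) == length(dim_len),
            all(start >= 1L), all(count >= 1L),
            all(start + count - 1L <= dim_len))   # request stays inside the variable
  list(start = rev(as.integer(start) - 1L),       # 0-based, C dimension order
       count = rev(as.integer(count)),            # C dimension order
       n = prod(count))                           # number of values to allocate
}

# A 1000 x 1200 window from a variable with (made-up) R-order dimensions 4500 x 3298
h <- c_hyperslab(dim_len = c(4500L, 3298L),
                 start = c(11L, 21L),
                 count = c(1000L, 1200L))
str(h)
```

The `n` element is the size the receiving buffer needs on the C++ side; a buffer sized from the unreversed dimensions has the same length, so a mismatch in order shows up as scrambled values or a range error rather than an allocation failure.
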
Sources of interest include:
* classic NetCDF 3 files
* NetCDF 4 files (the ones that can have *groups* and/or *compound-types*)
* HDF5 files
* Thredds, DODS, OpenDAP servers
* files with pathologies, e.g. character string coordinate values, unused dimensions, broken entities
* complex multi-dimensional models
* NetCDF sources that use compound types, as examples other than L3BIN NASA ocean colour or KEA
* anything I haven't thought of that you know about ...
# Set up notes
* General system requirements (WIP): https://gist.github.com/mdsumner/3a19a0e4342b4067decfc049b4f4ecf5
* package specific details are set in 'DESCRIPTION', 'ncapi-package.r', 'src/init.c', 'src/Makevars'
* Register routines, new in R 3.4.0 (2017): https://ironholds.org/registering-routines/, http://dirk.eddelbuettel.com/blog/2017/04/30/#006_easiest_package_registration
* Configure Rcpp (must import something from it, and declare useDynLib in roxygen comments)
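
For the Rcpp item in the list above, a minimal sketch of the roxygen block follows. It assumes the block lives in 'ncapi-package.r' and that `sourceCpp` is the symbol chosen to import; neither detail is taken from the ncapi sources, and any Rcpp export would satisfy the "must import something" requirement.

```r
## Sketch only: the imported symbol is an assumption, not taken from ncapi.
#' @useDynLib ncapi
#' @importFrom Rcpp sourceCpp
NULL
```

Running `devtools::document()` then writes the matching `useDynLib(ncapi)` and `importFrom(Rcpp, sourceCpp)` directives to 'NAMESPACE'.
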
When additions or changes are made, we must update the registration with:
```{r}
## Presumably this could be makefiled (TODO).
tools::package_native_routine_registration_skeleton("../ncapi", "src/init.c", character_only = FALSE)
```
# Examples
```{r}
f_l3b <- system.file("extdata", "oceandata", "S2008001.L3b_DAY_CHL.nc", package = "ncapi")
f_l3m <- system.file("extdata", "oceandata", "S2008001.L3m_DAY_CHL_chlor_a_9km.nc", package = "ncapi")
f_hydro <- system.file("extdata", "unidata", "madis-hydro.nc", package = "ncapi")
f_hgroups <- system.file("extdata", "unidata", "test_hgroups.nc", package = "ncapi")
u1 <- "http://tds.hycom.org/thredds/dodsC/GLBa0.08/latest/2d"
u2 <- "https://oceandata.sci.gsfc.nasa.gov:443/opendap/MODISA/L3SMI/2016/001/A20160012016032.L3m_R32_SST_sst_9km.nc"
library(tibble)
library(ncapi)
get_fun <- function(x) {
  con <- Rnc_open(x)
  ## close the connection when the function exits
  on.exit(Rnc_close(con), add = TRUE)
  l <- list(as_tibble(Rnc_inq_dimension(con)),
            as_tibble(Rnc_inq_variable(con)))
  list(l, lapply(l[[2]]$id, function(a) Rnc_inq_vardims(con, a)))
}
get_fun(f_l3m)
get_fun(f_l3b)
##get_fun(f_hydro) ## doesn't work
get_fun(f_hgroups)
#get_fun(u1) ## doesn't work
#get_fun(u2) ## doesn't work
```
# Raw examples
```{r L3bin}
library(ncapi)
f_l3b <- system.file("extdata", "oceandata", "S2008001.L3b_DAY_CHL.nc", package = "ncapi")
con <- Rnc_open(f_l3b)
groupids <- Rnc_inq_grps(con)
Rnc_inq_grpname(groupids[1])
lapply(Rnc_inq_grps(con), Rnc_inq_grpname)
Rnc_close(con)
```
Get a large-ish summary of what is in the file (very WIP).
```{r}
example(Rnc_inq)
```
A simple function to get group names.
```{r func}
get_groups <- function(x, check_exists = TRUE) {
  if (check_exists) stopifnot(file.exists(x))
  con <- Rnc_open(x)
  on.exit(Rnc_close(con), add = TRUE)
  groupids <- Rnc_inq_grps(con)
  cat(sprintf("returning %i group names\n", length(groupids)))
  names <- unlist(lapply(groupids, Rnc_inq_grpname))
  if (length(names) < 1) {
    names <- "<no groups found>"
    groupids <- NA_integer_
  }
  tibble::tibble(group_id = groupids, group_name = names, source = basename(x), access = Sys.time())
}
files <- list.files(system.file("extdata", package = "ncapi"), recursive = TRUE, pattern = "nc$", full.names = TRUE)
## each tibble already carries a `source` column, so no `.id` is needed
d <- dplyr::bind_rows(lapply(files, get_groups))
print(d)
```
# Thredds
(These sources don't have groups so we need some more functionality to see anything happen).
```{r}
u1 <- "http://tds.hycom.org/thredds/dodsC/GLBa0.08/latest/2d"
get_groups(u1, check_exists = FALSE)
u2 <- "https://oceandata.sci.gsfc.nasa.gov:443/opendap/MODISA/L3SMI/2016/001/A20160012016032.L3m_R32_SST_sst_9km.nc"
get_groups(u2, check_exists = FALSE)
```
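
The `check_exists = FALSE` argument is needed in the calls above because `file.exists()` returns `FALSE` for a URL. As a sketch of how that flag could be derived rather than passed by hand, the helper below tests for an `http://` or `https://` prefix. `is_remote_source()` is a hypothetical name and is not part of ncapi.

```r
# Hypothetical helper (not part of ncapi): TRUE for http(s) sources such as
# Thredds/OpenDAP endpoints, FALSE for anything treated as a local path.
is_remote_source <- function(x) grepl("^https?://", x, ignore.case = TRUE)

u1 <- "http://tds.hycom.org/thredds/dodsC/GLBa0.08/latest/2d"
is_remote_source(u1)
is_remote_source("S2008001.L3b_DAY_CHL.nc")

# With ncapi loaded, the existence check could then be skipped only for URLs:
# get_groups(u1, check_exists = !is_remote_source(u1))
```

A prefix test is deliberately crude; it does not cover other remote schemes a NetCDF build may accept, so treat it as a starting point only.
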
Investigate them a little more deeply.
```{r}
con <- Rnc_open(u1)
Rnc_inq_grps(con)
## notice how the connection is the group ID in the degenerate case
summ1 <- Rnc_inq(con)
Rnc_close(con)
con <- Rnc_open(u2)
summ2 <- Rnc_inq(con)
Rnc_close(con)
```
Content summary of hycom **GLBa0.08**.
```{r}
print(summ1)
```
Content summary of **A20160012016032.L3m_R32_SST_sst_9km**.
```{r}
print(summ2)
```
# Archaeology
This project is a progression from past attempts to make sense of this space.
* https://github.com/mdsumner/anc
* https://github.com/hypertidy/rancid
* https://github.com/hypertidy/ncdump
* http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2016-December/009485.html
* https://github.com/RConsortium/wishlist/issues/3
# Code of conduct
Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.