Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Submission: smapr #231

Closed
8 tasks
mbjoseph opened this issue Jun 20, 2018 · 24 comments
Closed
8 tasks

Submission: smapr #231

mbjoseph opened this issue Jun 20, 2018 · 24 comments

Comments

@mbjoseph
Copy link
Member

Summary

  • What does this package do? (explain in 50 words or less):

The smapr package discovers, downloads, and extracts global soil moisture data from the NASA SMAP mission, producing raster objects from HDF5 files.

  • Paste the full DESCRIPTION file inside a code block below:
Package: smapr
Type: Package
Title: Acquisition and Processing of NASA Soil Moisture Active-Passive (SMAP) Data
Version: 0.1.2
Authors@R: c(
    person("Maxwell", "Joseph", email = "[email protected]",
            role = c("aut", "cre")),
    person("Matthew", "Oakley", email = "[email protected]",
            role = "aut"),
    person("Zachary", "Schira", email = "[email protected]",
            role = "aut")
    )
Depends:
    R (>= 3.2.5)
Imports:
    httr (>= 1.1.0),
    rappdirs (>= 0.3.1),
    raster (>= 2.5),
    rhdf5 (>= 2.14),
    rvest,
    xml2
Maintainer: Maxwell Joseph <[email protected]>
Description:
    Facilitates programmatic access to NASA Soil Moisture Active
    Passive (SMAP) data with R. It includes functions to search for, acquire,
    and extract SMAP data.
License: GPL-3
LazyData: TRUE
RoxygenNote: 6.0.1
Suggests:
    knitr,
    rgdal,
    rmarkdown,
    roxygen2,
    testthat,
    utils
VignetteBuilder: knitr

  • URL for the package (the development repository, not a stylized html page): https://github.com/earthlab/smapr

  • Please indicate which category or categories from our package fit policies this package falls under *and why(? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):

[e.g., "data extraction, because the package parses a scientific data file format"]

data retrieval and munging - smapr finds and downloads data, and extracts gridded spatial from HDF5 files with the correct projection information.

  •   Who is the target audience and what are scientific applications of this package?  

Soil scientists, ecologists, and people working in the remote sensing domain could use the package to quantify changes in soil moisture, detect droughts, generate covariate layers for species distribution models, and validate other soil moisture data products.

There is another smapr package, but it does not support the breadth of SMAP data products that this package does, and development seems to have ceased ~3 years ago: https://github.com/strongh/smapr. This smapr package has more thorough documentation and tests, allows extraction of multiple soil moisture data products, and has been on CRAN since 2016.

  •   If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Requirements

Confirm each of the following by checking the box. This package:

  • [ x ] does not violate the Terms of Service of any service it interacts with.
  • [ x ] has a CRAN and OSI accepted license.
  • [ x ] contains a README with instructions for installing the development version.
  • [ x ] includes documentation with examples for all functions.
  • [ x ] contains a vignette with examples of its essential functions and uses.
  • [ x ] has a test suite.
  • [ x ] has continuous integration, including reporting of test coverage, using services such as Travis CI, Coveralls and/or CodeCov.
  • [ x ] I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

  • [ x ] Do you intend for this package to go on CRAN? (it's already on CRAN: https://cran.r-project.org/package=smapr)
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI:
    • (Do not submit your package separately to JOSS)
  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
    • The package is novel and will be of interest to the broad readership of the journal.
    • The manuscript describing the package is no longer than 3000 words.
    • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
    • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no gaurantee that your manuscript willl be within MEE scope.)
    • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
    • (Please do not submit your package separately to Methods in Ecology and Evolution)

Detail

  • [ x ] Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

  • [ x ] Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

Anyone with experience in programmatically finding and getting data from a https server, and/or working with HDF5 raster data would be great - perhaps:

  • Jeff Hollister (jhollist)
  • Scott Chamberlain (sckott)
  • Marco Sciaini (marcosci)
  • Michael Sumner (mdsumner)
@noamross
Copy link
Contributor

noamross commented Jun 21, 2018

Thanks for you submission, @mbjoseph! This fits well within our scope.

Editor checks:

  • Fit: The package meets criteria for fit and overlap
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has a CRAN or OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

goodpractice::gp() checks are pretty clean! Please add relevant links to your description and check that long code lines are neccessary:

── GP smapr ────────────────────────────────────

It is good practice to

  ✖ write unit tests for all functions, and all package code in
    general. 6% of code lines are covered by test cases.

    R/download_smap.R:32:NA
    R/download_smap.R:33:NA
    R/download_smap.R:34:NA
    R/download_smap.R:35:NA
    R/download_smap.R:36:NA
    ... and 292 more lines

  ✖ add a "URL" field to DESCRIPTION. It helps users find information
    about your package online. If your package does not have a
    homepage, add an URL to GitHub, or the CRAN package package page.
  ✖ add a "BugReports" field to DESCRIPTION, and point it to a bug
    tracker. Many online code hosting services provide bug trackers for
    free, https://github.com, https://gitlab.com, etc.
  ✖ avoid long code lines, it is bad for readability. Also, many people
    prefer editor windows that are about 80 characters wide. Try make
    your lines shorter than 80 characters

    tests/testthat/test-extract_smap.R:117:1
    tests/testthat/test-find_smap.R:10:1
    tests/testthat/test-find_smap.R:20:1

───────────────────────────────────────────────

One real spelling error from spelling::spell_check_package()

 WORD                 FOUND IN
availble             smapr-intro.Rmd:36

A note: There's a poorly documented CRAN feature for including bioconductor packages using a biocViews: field in DESCRIPTION, which you should add to ease installation for users. See here.

I'll go ahead and seek reviewers.


Reviewers: @ldecicco-USGS @marcosci
Due date: 2008-07-19

mbjoseph added a commit to ropensci/smapr that referenced this issue Jun 21, 2018
See ropensci/software-review#231

- tyop in vignette
- include biocViews for hdf5
- add URL and BugReports links
- fix long lines in tests
@noamross
Copy link
Contributor

Reviewers @ldecicco-USGS and @marcosci assigned. Due date 2008-07-19. Note our review guidelines have recently been consolidated in our development guide.

@noamross
Copy link
Contributor

@mbjoseph You can now add a review badge to your package README:

[![](https://badges.ropensci.org/231_status.svg)](https://github.com/ropensci/onboarding/issues/231)

@mbjoseph
Copy link
Member Author

Great - thanks Noam!

@noamross
Copy link
Contributor

Hi @marcosci and @ldecicco-USGS! Just a friendly reminder that your reviews are due in 1 week.

@marcosci
Copy link

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README
  • Installation instructions: for the development version of package and any non-standard dependencies in README
  • Vignette(s) demonstrating major functionality that runs successfully locally
  • Function Documentation: for all exported functions in R help
  • Examples for all exported functions in R Help that run successfully locally
  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

2.5-3 hrs


Review Comments

General comments

I really like smapr - great work! It was a very pleasant review, the package
works like a charm and did exactly what it promised. I see clear value in the package
and think it will be a great contribution to the geospatial ecoystem of rOpenSci.

The package documentation is comprehensive and well written, the code is neat
and the authors provide rigorous unit testing. Overall, the package is in a
good shape, so the following comments are addressing minor things that could
be improved.

1. Installation

  • I tried the installation on a red hat and ubuntu linux machine - no problem, either with CRAN or github. With @noamross hint on biocViews the installation is now also quite smooth, since you don't need to install the rhdf5 first. Remember to remove this section ofter submtting the next version to CRAN.

  • Nevertheless, for users new to UNIX systems it might be worth stating that one needs to install local libraries for httr to work.
    A short hint on libssl-dev for debian system and [email protected] for brew on maxOS would be sufficient.

2. Documentation

General
  • I think it would be worth to come up with a condensed visualization/text to
    to summarize the workflow of finding - downloading - inspecting -reimporting to R
    (see for example https://github.com/socialcopsdev/rLandsat).
README
  • I think it would be worth the effort to have one or two sentences more after:

    An R package for acquisition and processing of NASA (Soil Moisture Active-Passive) SMAP data

    ... stating what data you can retrieve, so that users directly know something about SMAP without following the link.

  • Maybe move the docker part under installation (as it is an alternative way for that)?

pkgdown

Is there a reason why don't use it? It wouldn't be a necessity from my side,
but I think it's worth the effort as navigating smapr would be way easier than
through the pdf documentation.

Vignette

While you provide good examples on how to use the functions in smapr,
I still think it would be worth to give some examples/links/... on how to
further work with the data retrieved from your package.

I can just speak for the field I am working in, but I think a lot of people there
would like to crop the resulting data in the end. So maybe start something in
the line of:

us <- raster::getData("GADM", country="USA", level=1)
california <- subset(us, NAME_1 == "California")
california <- sp::spTransform(california, proj4string(sm_raster))
sm_crop <- raster::crop(sm_raster, raster::extent(california))
sm_mask <- raster::mask(sm_crop , california)
raster::plot(sm_mask)

I would consider this whole point a bonus, but when I think back starting with
spatial data in R was a lot of googling how to reproject, crop, etc.

3. Code

.Rprofile

Something for lazy people like me:

set_credentials <- function(ed_un, ed_pw){
  write(paste0("Sys.setenv(ed_un = \"",ed_un,"\", ed_pw = \"", ed_pw, "\")"),
        file=file.path(Sys.getenv("HOME"), ".Rprofile"),append=TRUE)
}

An auxiliary function like that would help to simplify the need for setting the oauth credentials. Plus, new users don't have to locate .Rprofile and/or one can set it easily on remote machines.

4. Future proofing

Just to bring that on your radar: https://github.com/r-spatial/stars/.
If it hits CRAN I think this would be a valuable replacement for rhdf5 and raster in your package, as it could streamline the whole handling process of SMAP data in general.

This also came up in our submission here and I only found an issue discussing sf over sp for packages onboarding rOpenSci.
But in your case I think this is more relevant than it was for us.

5. Smaller ToDos

To save @noamross some work:

  • You should run codemetar::write_codemeta() for smapr to create a CodeMeta file.
  • If you decide to use pkgdown, you should put the url in your description (you can comma seperate two urls there)
  • You can put a template for issues in your package as github dotfiles

6. Conclusion

That's a rather short review 😮 ... I would blame the quality of the package 😉
Going through the reviewing guidelines, there is nothing I haven't stated here that could be improved.
The code style is very clear, there are no duplications and getting to know the package as a user was pleasant. Other than donating money to NASA for faster servers and hence faster downloading speed, I do not see any possibilities to improve the speed of the functions. The functions and variables all follow rOpenSci guidelines.

@noamross
Copy link
Contributor

Thanks for your review, @marcosci! @mbjoseph, feel free to wait for a second review before addressing comments.

@ldecicco-USGS
Copy link

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README

The description is clear of what the package does. It wouldn't hurt to add a sentence or two right at the beginning expanding on the type of data and type of processing.

  • Installation instructions: for the development version of package and any non-standard dependencies in README

My eyes swept over the Bioconductor instructions, so at first, it took me a bit to figure out how to install the rhdf5 package. Could you add the code directly?

source("https://bioconductor.org/biocLite.R")
biocLite("rhdf5")
  • Vignette(s) demonstrating major functionality that runs successfully locally

Before Finding the data, it would probably be a good idea to have a short section on going to the NASA page to get a user name and password. The error message however was very useful, so I knew exactly what to do.

When I ran:

available_data <- find_smap(id = 'SPL4SMAU', dates = '2018-06-01', version = 3)
Error in rvest::html_table(nodes)[[1]] : subscript out of bounds

Debuging, I got to this line:

top_level_response <- GET(path, auth())

and it appeared to be a 401. I eventually realized I had typed the wrong password. I think you could check on that response and make the error message there more clear. Once I got that straighted up, it was off to the races...I thought.

The download_smap function took a long time. I'd add a quick sentence in the vignette to give the user an idea on how long it might take (or how big the files are). Any indicator messages would be nice here too (am I stuck? did it start? did it get through some of those available_data?).

The README explains what id to use, but the vignette does not. I'm not a big fan of copy/paste, but it's easier to access the vignette once I'm already in R (compared to the readme). Could you add that "Dataset id" table (with the id's) so I can think about looking for more data?

The vignette provided exactly the information needed to run the functions in the package. That being said...I was really curious what I could do next. Could you show how to filter the data into some region?

  • Function Documentation: for all exported functions in R help

In the find_smap arguments, "id" gives one example. It would be useful to say something like "See Details for a description of all ids" (or something like that....)

  • Examples for all exported functions in R Help that run successfully locally

The examples worked!

  • Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Functionality

  • Installation: Installation succeeds as documented.
  • Functionality: Any functional claims of the software been confirmed.
  • Performance: Any performance claims of the software been confirmed.
  • Automated tests: Unit tests cover essential functions of the package
    and a reasonable range of inputs and conditions. All tests pass on the local machine.
  • Packaging guidelines: The package conforms to the rOpenSci packaging guidelines

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 2


Review Comments

Nice package!

Do you have a connection to the data? I'm going to guess you'll get lots of questions on the data itself and why the services are down (if they go down!). Those requests usually are phrased "smapr is broken!!!!".

No matter what your relationship is, it would be a good idea to make sure you the data providers know this package is out there. They may be interested in having you add a "user agent" for example, so they (and you) can get an idea of how often data is downloaded from the smapr package.

Otherwise, the code looks nice, the style is consistent, the test coverage is excellent.

@mbjoseph
Copy link
Member Author

Thanks @marcosci and @ldecicco-USGS for your reviews! This is all great feedback.

@noamross
Copy link
Contributor

Thanks your review @ldecicco-USGS! @mbjoseph, if you use issues in your own repo to track how you address the reviews, please summarize the changes here when you are done.

@mbjoseph
Copy link
Member Author

Will do @noamross - thanks!

@mbjoseph
Copy link
Member Author

mbjoseph commented Aug 1, 2018

Thanks again @marcosci, @ldecicco-USGS, and @noamross for your time and reviews. Here are the point-by-point changes:

Review 1 from @marcosci

1. Installation

2. Documentation

3. Code

4. Future proofing

Thanks for pointing out the stars package! I'll keep my eye on it as a potential replacement for rhdf5 and raster.

5. Smaller ToDos

Review 2 from @ldecicco-USGS

Documentation

My eyes swept over the Bioconductor instructions, so at first, it took me a bit to figure out how to install the rhdf5 package. Could you add the code directly?

Shoot! Fortunately this shouldn't be needed anymore with the biocViews() edit that @noamross suggested.

Do you have a connection to the data?

Not really - other that we work in the same city! The National Snow and Ice Data Center (which hosts the data) is aware of the package, however.

@mbjoseph mbjoseph closed this as completed Aug 1, 2018
@mbjoseph mbjoseph reopened this Aug 1, 2018
@mbjoseph
Copy link
Member Author

mbjoseph commented Aug 1, 2018

Sorry - just accidentally hit "Close and comment"! The above summarizes the changes that were made. I really appreciate everyone's help with this - the package is much improved now thanks to these detailed (and friendly 😃) reviews!

@noamross
Copy link
Contributor

noamross commented Aug 1, 2018

Thanks, @mbjoseph! @marcosci and @ldecicco-USGS, please let us know if Max's changes address your comments.

@ldecicco-USGS
Copy link

They sure do. Great job @mbjoseph !

@marcosci
Copy link

marcosci commented Aug 3, 2018

Superb @mbjoseph ! Just had a look at everything, looks perfect! Well done.

@noamross
Copy link
Contributor

noamross commented Aug 3, 2018

Quick note from a final set of package checks, @mbjoseph: Please add .github to your .Rbuildignore

Once you've done that, approved! Thanks for submitting and @marcosci and @ldecicco-USGS for your reviews!

To-dos:

  • Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.
  • Add the rOpenSci footer to the bottom of your README
    " [![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)"
  • Fix any links in badges for CI and coverage to point to the ropensci URL. We no longer transfer Appveyor projects to ropensci Appveyor account so after transfer of your repo to rOpenSci's "ropensci" GitHub organization the badge should be [![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname).
  • We're starting to roll out software metadata files to all ropensci packages via the Codemeta initiative, see https://github.com/ropensci/codemetar/#codemetar for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.

Should you want to awknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/tech-notes/) or long-form post with more narrative about its development. (https://ropensci.org/blog/). If you are interested, @stefaniebutland will be in touch about content and timing.

We've started putting together a gitbook with our best practices and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

@mbjoseph
Copy link
Member Author

mbjoseph commented Aug 3, 2018

Great - thanks @noamross! I have updated the .Rbuildignore file (ropensci/smapr@9a6e31c), transferred the repo to the ropensci org, added the footer (ropensci/smapr@a8f9198), updated badges (ropensci/smapr@acfe300), and updated the Codemeta file (ropensci/smapr@dbf3db6).

I'd love to add @ldecicco-USGS and @marcosci as reviewers if they are comfortable with it.

I'd also be happy to write a blog post about smapr, and I'll take a look at the onboarding section of the gitbook!

@noamross
Copy link
Contributor

noamross commented Aug 3, 2018

Excellent. @stefaniebutland will be in touch about the blog schedule, in the meantime instructions for putting together post are here. https://github.com/ropensci/roweb2/blob/master/readme.md. I'm closing the issue but feel free to carry on that conversation here.

I've made you admin of the repo over the the rOpenSci org. Your other collaborators have write permissions. Feel free to increase their permission levels if needed be.

@noamross noamross closed this as completed Aug 3, 2018
@stefaniebutland
Copy link
Member

Hello @mbjoseph. Glad to hear you're interested in contributing a post. Our hope is that it gets more eyes on your work!

This link will give you many examples of blog posts by authors of onboarded packages so you can get an idea of the style and length you prefer: https://ropensci.org/tags/review/. Technotes are here: https://ropensci.org/technotes/.

As Noam said, here are some technical and editorial guidelines for contributing a blog post: https://github.com/ropensci/roweb2#contributing-a-blog-post. We ask that you submit your draft post via pull request a week before the planned publication date so we can give you some feedback.

At this point I have open publication slots in September. Do you want to pick a date for draft submission?

Happy to answer any questions.

@mbjoseph
Copy link
Member Author

mbjoseph commented Aug 7, 2018

Heya @stefaniebutland - September sounds great! How about Sep 15 deadline to submit a draft?

@stefaniebutland
Copy link
Member

draft Sep 15 sounds great. I'll mark my calendar to ping you here then. Don't hesitate to ask questions here, or on Slack now that you're there :-)

@stefaniebutland
Copy link
Member

@mbjoseph Are you still good with a Sep 15 draft for publication 2018-09-25?

@mbjoseph
Copy link
Member Author

mbjoseph commented Sep 6, 2018

@stefaniebutland yes - thanks for checking!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants