MovingPandas: Software Submission for Review #18

Closed
15 of 22 tasks
anitagraser opened this issue Jan 6, 2020 · 83 comments

Comments

@anitagraser

anitagraser commented Jan 6, 2020

Submitting Author: Anita Graser (@anitagraser)
All current maintainers: Anita Graser (@anitagraser)
Package Name: MovingPandas
One-Line Description of Package: Trajectory classes and functions built on top of GeoPandas
Repository Link: https://github.com/movingpandas/movingpandas
Version submitted: 0.2
Editor: Jenny Palomino (@jlpalomino)
Reviewer 1: Ivan Ogasawara (@xmnlab)
Reviewer 2: Martin Fleischmann (@martinfleis)
Archive: DOI
JOSS DOI: N/A
Version accepted: v 0.3.rc1
Date accepted (month/day/year): 03/19/2020


Description

  • Include a brief paragraph describing what your package does:

MovingPandas is a package for dealing with movement data. MovingPandas implements a Trajectory class and corresponding methods based on GeoPandas. A trajectory has a time-ordered series of point geometries. These points and associated attributes are stored in a GeoDataFrame. MovingPandas implements spatial and temporal data access and analysis functions (covered in the open access publication [0]) as well as plotting functions.
A usage example is available at http://exploration.movingpandas.org.

[0] Graser, A. (2019). MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum ‒ Journal of Geographic Information Science 2019, 1-2019, 54-68. doi:10.1553/giscience2019_01_s54. URL: https://www.austriaca.at/rootcollection?arp=0x003aba2b
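
The data model described above (a time-ordered series of points that supports temporal access) can be illustrated with a minimal, standard-library-only sketch. This is not MovingPandas code: `Fix`, `SimpleTrajectory`, and `position_at` are hypothetical names chosen for illustration, and linear interpolation between neighbouring fixes is an assumption about how a position between two observations might be estimated.

```python
from bisect import bisect_left
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fix:
    """One observation: a timestamp and planar x/y coordinates (hypothetical type)."""
    t: datetime
    x: float
    y: float


class SimpleTrajectory:
    """Hypothetical stand-in for a trajectory: fixes kept in time order."""

    def __init__(self, fixes):
        # Sort by time so temporal lookups can use binary search.
        self.fixes = sorted(fixes, key=lambda f: f.t)
        self.times = [f.t for f in self.fixes]

    def position_at(self, t):
        """Linearly interpolate the (x, y) position at time t."""
        if not self.times[0] <= t <= self.times[-1]:
            raise ValueError("t lies outside the trajectory's time span")
        i = bisect_left(self.times, t)
        if self.times[i] == t:
            # Exact hit on an observed fix: no interpolation needed.
            return (self.fixes[i].x, self.fixes[i].y)
        before, after = self.fixes[i - 1], self.fixes[i]
        frac = (t - before.t) / (after.t - before.t)
        return (before.x + frac * (after.x - before.x),
                before.y + frac * (after.y - before.y))


# Fixes are deliberately given out of order; the constructor sorts them.
traj = SimpleTrajectory([
    Fix(datetime(2020, 1, 6, 12, 10), 10.0, 0.0),
    Fix(datetime(2020, 1, 6, 12, 0), 0.0, 0.0),
])
midpoint = traj.position_at(datetime(2020, 1, 6, 12, 5))
```

Keeping the timestamps in a sorted list makes each lookup a binary search. MovingPandas itself stores the points and their attributes in a GeoDataFrame, as stated in the description.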

Scope

  • Please indicate which category or categories this package falls under:
    • Data retrieval
    • Data extraction
    • Data munging
    • Data deposition
    • Reproducibility
    • Geospatial
    • Education
    • Data visualization*

* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.

  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

Geospatial (primary): The MovingPandas Trajectory class implements a spatio-temporal data model for movement data.

Data visualization (secondary): The implemented plot functions enable straight-forward movement data exploration that goes beyond plotting the individual point locations by ensuring that trajectories are represented by linear segments between consecutive points.
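
The idea of representing a trajectory by linear segments between consecutive points, rather than by isolated point locations, can be sketched as follows. The helper names `to_segments` and `path_length` are hypothetical and not part of MovingPandas; coordinates are assumed to be planar.

```python
from math import hypot


def to_segments(points):
    """Pair each point with its successor: n points give n - 1 segments."""
    return list(zip(points, points[1:]))


def path_length(points):
    """Sum of the straight-line segment lengths (planar coordinates assumed)."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in to_segments(points))


track = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
segments = to_segments(track)
length = path_length(track)
```

Drawing these segments instead of the raw points is what makes the order of movement visible in a plot.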

  • Who is the target audience and what are scientific applications of this package?

Movement data / trajectories appear in many different scientific domains, including physics, biology, ecology, chemistry, transport and logistics, astrophysics, remote sensing, and more.
For example, the provided tutorials cover the analysis of migrating birds as well as the analysis of ship movement within a port.

  • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

scikit-mobility is a similar package which is also in an early development stage and also deals with movement data. They implement TrajectoryDataFrames and FlowDataFrames on top of Pandas instead of GeoPandas. There is little overlap in the covered use cases and implemented functionality (comparing MovingPandas tutorials and scikit-mobility tutorials). MovingPandas focuses on spatio-temporal data exploration with corresponding functions for data manipulation and analysis. scikit-mobility on the other hand focuses on computing human mobility metrics, generating synthetic trajectories and assessing privacy risks.

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

#14

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • has an OSI approved license
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette (notebook) with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

Publication options

JOSS Checks
  • The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
  • The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI:

Note: Do not submit your package separately to JOSS

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

  • Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Code of conduct

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

@lwasser
Member

lwasser commented Jan 15, 2020

hi @anitagraser !! thank you again for this submission. it will be on our discussion list for this thursday's pyopensci meeting! can you think of any folks who might be well suited to review this package? we will need 2 people.

@anitagraser
Author

Thank you @lwasser! I think GeoPandas developers would be a good fit.

@lwasser
Member

lwasser commented Jan 16, 2020

@jlpalomino will be the fearless editor for this submission !! And @xmnlab will be our first reviewer. We will reach out to the geopandas folks. @martinfleis would you be interested in being a second reviewer for moving pandas? please let us know!

@martinfleis

@lwasser I would love to do that, but not sure how fast I'd be. What is the timeframe?

@lwasser
Member

lwasser commented Jan 16, 2020

hey @martinfleis we understand. we typically ask for a 3 week turn around on reviews. Would that timeframe work for you or is that too quick? Many thanks for responding so quickly!

@martinfleis

@lwasser that seems to be doable. Count me in.

@lwasser
Member

lwasser commented Jan 17, 2020

awesome!! Thanks @martinfleis for doing this!!

@jlpalomino
Member

jlpalomino commented Jan 17, 2020

Editor checks:

  • Fit: The package meets criteria for fit and overlap.
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has an OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Thanks @xmnlab and @martinfleis for agreeing to review MovingPandas. Please use the following resources to submit your review:

The submitting author is open to receiving issues and PRs if you want to create a review using that approach (e.g. include links to the issue and/or PR in your review).

Feel free to reach out with any questions about the review process.


Reviewers: Ivan Ogasawara (@xmnlab) and Martin Fleischmann (@martinfleis)
Due date: February 7th, 2020

@martinfleis

martinfleis commented Jan 28, 2020

Package Review

  • As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

  • A statement of need clearly stating problems the software is designed to solve and its target audience in README

  • Installation instructions: for the development version of package and any non-standard dependencies in README

  • Vignette(s) demonstrating major functionality that runs successfully locally

  • Function Documentation: for all user-facing functions

    • The documentation is not ideal, as explanations of the string options accepted by some methods are often missing, for example in Trajectory.get_position_at() or Trajectory.generalize(). I would personally also prefer having one place with all documentation sources, or at least links between them. Some parts are in the README, some in the examples, and others on RTD.
  • Examples for all user-facing functions

    • The repository contains extraordinary Jupyter Notebooks that work as a user guide, and new ones are being created. All run locally as well as on mybinder if one wants to play with the data quickly. However, there is no link to the examples apart from the mybinder badge. One has to find them in the repository.

    • Minor typo in 3_horse_collar.ipynb: a hectare is 10,000 m², not 1,000 m² (the conversion is correct elsewhere in the notebook): total_area = total_area[collar_id]/1000

  • Community guidelines including contribution guidelines in the README or CONTRIBUTING.

    • Contribution guidelines seem to be missing. I haven't found them anywhere.
  • Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.
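
The unit slip flagged for 3_horse_collar.ipynb in the list above comes down to one conversion factor. A minimal sketch of the arithmetic, with a hypothetical helper name and an example area that is not taken from the notebook:

```python
SQUARE_METRES_PER_HECTARE = 10_000  # 1 ha = 100 m x 100 m


def square_metres_to_hectares(area_m2):
    """Convert an area in square metres to hectares (hypothetical helper)."""
    return area_m2 / SQUARE_METRES_PER_HECTARE


# Example value chosen for illustration only.
area_ha = square_metres_to_hectares(25_000)
```

Dividing by 1,000 instead would overstate every area by a factor of ten.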

Readme requirements
The package meets the readme requirements below:

  • Package has a README.md file in the root directory.

The README should include, from top to bottom:

  • The package name

  • Badges for continuous integration and test coverage, the badge for pyOpenSci peer-review once it has started (see below), a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges, see this example, that one and that one. Such a table should be more wide than high.

    • Code coverage is missing, pyOpenSci peer-review is missing, repostatus.org badge is missing.
  • Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.

  • Installation instructions

  • Any additional setup required (authentication tokens, etc)

  • Brief demonstration usage

  • Direction to more detailed documentation (e.g. your documentation files or website).

  • If applicable, how the package compares to other similar packages and/or how it relates to other packages

  • Citation information

    • There are references to related papers, but it is not clear how movingpandas itself should be cited.

Functionality

  • Installation: Installation succeeds as documented.

  • Functionality: Any functional claims of the software been confirmed.

  • Performance: Any performance claims of the software been confirmed.

  • Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.

    • I'd say that the essential functions are covered, but the overall coverage could be higher. E.g. movingpandas/trajectory_collection.py is only 63% covered. I am aware that movingpandas/trajectory_aggregator.py is new, so I assume that its tests are still TBD.
Name                                               Stmts   Miss  Cover
----------------------------------------------------------------------
movingpandas/__init__.py                               6      0   100%
movingpandas/geometry_utils.py                        45      4    91%
movingpandas/overlay.py                              152     12    92%
movingpandas/tests/__init__.py                         0      0   100%
movingpandas/tests/test_geometry_utils.py             41      0   100%
movingpandas/tests/test_overlay.py                    78      0   100%
movingpandas/tests/test_trajectory.py                208      0   100%
movingpandas/tests/test_trajectory_collection.py      54      0   100%
movingpandas/trajectory.py                           298     29    90%
movingpandas/trajectory_aggregator.py                229    192    16%
movingpandas/trajectory_collection.py                113     42    63%
movingpandas/trajectory_plotter.py                    82     13    84%
----------------------------------------------------------------------
TOTAL                                               1306    292    78%
  • Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

    • I would recommend adding one testing environment with dev versions of key packages (geopandas, shapely) to catch potential issues with new versions early enough to fix them. Testing under Windows (e.g. AppVeyor) could also be helpful, as the behaviour of the Python geospatial stack differs between operating systems.
  • Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.
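
The percentages in the coverage report above follow directly from its Stmts and Miss columns. A small sketch of that arithmetic, assuming plain rounding to the nearest whole percent; the function name is hypothetical, and the figures are copied from the report:

```python
def coverage_percent(statements, missed):
    """Share of statements executed by the tests, as a percentage."""
    if statements == 0:
        return 100.0  # an empty file is reported as fully covered
    return 100.0 * (statements - missed) / statements


# (statements, missed) pairs copied from the report above.
report = {
    "movingpandas/trajectory_collection.py": (113, 42),
    "movingpandas/trajectory_aggregator.py": (229, 192),
    "TOTAL": (1306, 292),
}
rounded = {name: round(coverage_percent(s, m)) for name, (s, m) in report.items()}
```

Most of the 292 missed statements sit in trajectory_aggregator.py (192) and trajectory_collection.py (42), which is why the reviewer singles out those two files.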

For packages co-submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

  • A short summary describing the high-level functionality of the software
  • Authors: A list of authors with their affiliations
  • A statement of need clearly stating problems the software is designed to solve and its target audience.
  • References: with DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

  • The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 5


Review Comments

MovingPandas is a valuable addition to the Python geospatial stack. Being built on top of GeoPandas GeoDataFrames, its main classes are easy to understand, and working with MovingPandas is very natural and straightforward. I had almost no issues in using it with my data, and everything works as advertised.

Initially, I was a bit confused by the released versions of MovingPandas: when I started, there was no release on GitHub, and PyPI had a different version than conda-forge. I would recommend following the JOSS recommendation here and keeping these three (GitHub, PyPI, conda-forge) in sync, as GitHub releases automatically send a notification to watching users.

During the review process, I have opened a couple of issues/PRs in the original repository, all linked above this post.

I am excited to see the further development of it as the latest addition of trajectory aggregator looks brilliant. I will certainly follow new releases, and once I have to work with movement data, MovingPandas will be the first choice.

@anitagraser
Author

anitagraser commented Jan 29, 2020

Thanks a lot for the thorough review and great feedback, Martin! I'll work on the open issues.

I've been looking for the badge for pyOpenSci peer-review but haven't been able to locate one for ongoing peer review.

@martinfleis

I've been looking for the badge for pyOpenSci peer-review but haven't been able to locate one for ongoing peer review.

That is more a question for @lwasser and @jlpalomino; I just copied the review template.

@jlpalomino
Member

Thanks @martinfleis for your review.

@anitagraser I also was not able to find the badge details in our review guide, so I have made a note to look into where we provide this info.

Here is the badge for pyOpenSci peer review, with the second link being the URL to this issue:
[![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/18)

@anitagraser
Author

anitagraser commented Jan 30, 2020

I've fixed the remaining README issues: badges movingpandas/movingpandas#53 and citation information movingpandas/movingpandas@56ef608

The last open issue from Martin's review should be the Contribution guidelines.

@xmnlab

xmnlab commented Jan 30, 2020

I am planning to review MovingPandas today :)

@anitagraser
Author

Contribution guidelines are now available at https://github.com/anitagraser/movingpandas/blob/master/CONTRIBUTING.md

@lwasser
Member

lwasser commented Jan 31, 2020

@anitagraser when your package has fully passed both reviews and both reviewers are happy with your addressing their requested changes, we will ask you to add the badge to the readme!! please get in touch with any other questions. @martinfleis THANK YOU for this review!!

@lwasser
Member

lwasser commented Jan 31, 2020

one other question @anitagraser are you interested in JOSS? i see you didn't check the box. Joss only requires you to write a very short paper about the package (i can show you the earthpy example) . They accept the pyopensci technical review by default. no worries if you are not interested... but it's a nice citation to have if you are (linked to your orcid id and such).

@anitagraser
Author

@lwasser Thank you. Do I understand correctly that I should remove the pyopensci badge from the MovingPandas readme again? Is there a different badge to fulfill the requirement in the review template "the badge for pyOpenSci peer-review once it has started"?

Concerning JOSS, I have been thinking about it but wasn't sure if JOSS sees prior publications as an obstacle:

Graser, A. (2019). MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum ‒ Journal of Geographic Information Science 2019, 1-2019, 54-68. doi:10.1553/giscience2019_01_s54. URL: https://www.austriaca.at/rootcollection?arp=0x003aba2b

@anitagraser
Author

anitagraser commented Jan 31, 2020

@lwasser After looking at the earthpy paper you mentioned, I think there should be minimal overlap with the existing MovingPandas paper in GI_Forum. So yes, I'd like to try a JOSS submission.

Work in progress: https://github.com/anitagraser/movingpandas/tree/joss-paper

@xmnlab

xmnlab commented Feb 3, 2020

I will finish my review today :)

@anitagraser
Author

Thank you @jlpalomino!

Concerning versions: in my experience, it is common practice to increase the version on GitHub immediately after a release. Think of it as a necessary step for starting work on the next release.

@martinfleis

@jlpalomino I am happy with all changes. I thought I indicated it clearly above, but apparently not enough. There is nothing to be resolved from my side.

@anitagraser versions should be ideally in sync between GitHub, PyPI, conda-forge and zenodo.

@anitagraser
Author

I've reverted the version number to rc1 movingpandas/movingpandas@bba76fe

@jlpalomino
Member

Thanks @martinfleis for your prompt responses about those issues (for final documentation purposes) and for your response related to the versions.

@anitagraser thanks for your action on the version. I checked in with @lwasser about the version difference as well. We have typically followed the version sync suggested by @martinfleis, that the GitHub repository remains the same version and only changes when there is a new release. However, we understand that there could be commits happening before the new release. She has suggested that we open a discussion on discourse to see what others think is best practice for versions.

That said, let's start moving forward with closing this review, while we wait for the community to weigh in. I will post a new comment with the next steps.

@jlpalomino
Member

jlpalomino commented Mar 19, 2020

MovingPandas has been approved for pyOpenSci! Thanks @anitagraser for this submission and @martinfleis and @xmnlab for your detailed reviews!

@anitagraser Here are the next steps:

  • Add the badge for pyOpenSci peer review, with the second link being the URL to this issue:
    pyOpenSci
  • Submit a PR to pyopensci.github.io to add MovingPandas to the pyOpenSci package list and to add yourself as a contributor using:
contributor_type:
    - package-maintainer

and

packages-submitted: ["movingpandas"]

If you are interested (not required), you can write a blog post for the pyOpenSci website about MovingPandas (see blog post about pandera) to promote your package.

The last action item is for me to get the process started with JOSS, who will provide more information on their process. I will do that in a new comment.

Please feel free to let me know if you have any questions, and congrats again.

@jlpalomino
Member

Hi @arfon pyOpenSci has approved MovingPandas. @anitagraser is interested in JOSS publication, and the draft paper has been reviewed by the pyOpenSci reviewers.

@anitagraser has another publication that she feels demonstrates minimal overlap:
Graser, A. (2019). MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum ‒ Journal of Geographic Information Science 2019, 1-2019, 54-68. doi:10.1553/giscience2019_01_s54.

Please feel free to contact @anitagraser (or myself if needed) for any additional information needed for the JOSS review process. Thanks!

@arfon

arfon commented Mar 20, 2020

@anitagraser - feel free to open an issue on the JOSS repository about this paper. On initial inspection I would say that JOSS would not accept a paper about MovingPandas as the earlier publication looks to be describing essentially the same software.

@anitagraser
Author

anitagraser commented Mar 20, 2020

Thank you for your feedback, @arfon!
As mentioned in #18 (comment), I wasn't sure if prior publications would be an obstacle. Graser (June 2019) describes the concepts underlying MovingPandas and presents an early unreleased version of the library. (The first release was published later in Sept 2019 https://pypi.org/project/movingpandas/#history.) MovingPandas has evolved considerably since then.
If you think that this is clearly against JOSS requirements, I won't pursue it further.

@arfon

arfon commented Mar 20, 2020

@anitagraser - could you summarize the major changes between MovingPandas when Graser (2019) was submitted compared to how it is today?

We do allow for multiple papers for the same piece of software but would expect at least a major release of the software to warrant an additional paper.

@anitagraser
Author

anitagraser commented Mar 20, 2020

@arfon Thank you for the clarification!

Graser (June 2019) only describes the Trajectory class and its data handling functions.

The key improvements since then are:

@anitagraser
Author

@arfon Do you suggest moving this over to the JOSS repository?

@lwasser
Member

lwasser commented Mar 31, 2020

This is interesting because we need to define what is a major improvement. It seems to me that infrastructure improvements are excellent but they wouldn't warrant a new publication. Releases on pypi and conda forge, etc would not warrant a new publication nor would docs altho we do want to encourage docs for all packages in the review!!

so the question becomes is the plotting and aggregation functionality listed above enough to justify a new publication ? @arfon do we need to bring anyone else in here to help with the decision for JOSS or should i chat with the ropensci folks? We understand if this is not enough of a new release to justify a new publication. We want to ensure that software publication via JOSS is robust and there is not duplication of ideas published. We also want to continue a healthy working relationship with JOSS!!

@arfon

arfon commented Apr 1, 2020

Hi @anitagraser and @lwasser, apologies for the slow reply here. @anitagraser - thanks for documenting the changes that you've made to this package since the last paper was published.

After giving this some thought, I don't think we would accept this as a JOSS submission. This is partly for the reason that @lwasser mentions (infrastructure improvements are obviously useful and important, but they are not new functionality in the software), and partly because the changes in TrajectoryCollection and the supporting classes look to be relatively modest and, should they be part of a standalone package, would likely fall into our 'minor utility' category.

This is interesting because we need to define what is a major improvement.

I completely agree. Unfortunately JOSS doesn't have docs to help with this. I'd be open to having a broader discussion to try and define this.

We want to ensure that software publication via JOSS is robust and there is not duplication of ideas published.

Thank you, it's much appreciated and sorry again for the delayed response here.

@anitagraser
Author

@arfon Thank you for the clarification!

@lwasser
Member

lwasser commented Apr 7, 2020

@arfon thank you . we understand. and we'd be happy to participate in a discussion surrounding what defines a major improvement. It will make things simpler for future reviews as we can provide that information to the maintainer at the beginning if they are interested in JOSS as well!

@anitagraser i think we can close this issue. Are you interested / do you have the time to write a blog post - similar to what Niels did for pandera:

https://www.pyopensci.org/blog/pandera-python-pandas-dataframe-validation

no pressure as i know there is a lot going on now AND you are welcome to take your time doing it. We will then put the word out about your package and link to the post which describes it in more detail! let me know what you think!

@jlpalomino
Member

@martinfleis We noticed that you are not listed on our contributors page. When you get a chance, can you please submit a PR to pyopensci.github.io to add yourself as a contributor using:

contributor_type:
    - reviewer

@jlpalomino
Member

Thanks everyone! I am officially closing this issue.

@anitagraser feel free to reach out about a blog post on MovingPandas if you are interested!

@lwasser
Member

lwasser commented Apr 8, 2020

yea!! great work everyone getting another package through review!!

@lwasser
Member

lwasser commented Sep 15, 2022

hey 👋 @jlpalomino @martinfleis ! I hope that you are all well. I am reaching out here to all reviewers and maintainers about pyOpenSci now that i am working full time on the project (read more here). We have a survey that we'd like for you to fill out so we can:

🔗 HERE IS THE SURVEY LINK 🔗

  1. invite you to our slack channel to participate in our community (if you wish to join - no worries if that is not how you prefer to communicate / participate).
  2. Collect information from you about how we can improve our review process and also better serve maintainers.
    The survey should take about 10 minutes to complete depending upon how much you decide to write. This information will help us greatly as we make decisions about how pyOpenSci grows and serves the community. Thank you so much in advance for filling it out.

NOTE: this is different from the form designed for reviewers to sign up to review.
If there are other maintainers for this project, please ping them here and ask them to fill out the survey as well. It is important that we ensure packages are supported long term or sunsetted with sufficient communication to users. Thus we will check in with maintainers annually about maintenance.

Thank you in advance for doing this and supporting pyOpenSci.


@xmnlab i pinged you on another issue. you can just type in two packages that you have reviewed or we can sort this out later as well as you are now an editor too!! :)

@lwasser
Member

lwasser commented Sep 15, 2022

@anitagraser my apologies i'm copy/ paste efficient today. can you kindly read the above issue ^^^ and fill out the survey 🔗 HERE IS THE SURVEY LINK 🔗

@lwasser
Member

lwasser commented Sep 28, 2022

hey there @jlpalomino @xmnlab 👋 Just a friendly reminder to take 5-10 minutes to fill out our survey . We really appreciate it. Thank you in advance for helping us by filling out the survey!! 🙌

✨ Martin and Anita -- thank you so much for taking the time to fill it out 🙌

🔗 HERE IS THE SURVEY LINK 🔗

Projects
Status: pyos-accepted

6 participants