Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure rules and various small improvements/trimming based on discussion around visual summary #65

Merged
merged 34 commits into from
May 29, 2020

Conversation

bdevans
Copy link
Collaborator

@bdevans bdevans commented May 4, 2020

This PR addresses Issue #62.

This Rmd file can serve as a sandbox for easy reordering and rewording under version control before making large changes to the main document. Once finalised, this can then be turned into a figure.

Copy link
Collaborator

@vsoch vsoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - looking forward to seeing this in action!

summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
@vsoch
Copy link
Collaborator

vsoch commented May 5, 2020

This is looking great! I think the one point for discussion that would be good to hear from @nuest @sje30 @psychemedia about is with respect to the one Dockerfile per project rule - it doesn't fit nicely in either section, and it's not clear to me that one Dockerfile is always the best approach. It would make sense to point out if we looked around and saw a lot of researchers unnecessarily creating more than one, but I can't find examples of that (can others?)

summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
Copy link
Owner

@nuest nuest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @bdevans @vsoch ! I made some suggestions and think this exercise also pushes little improvements back to the main text.

summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Show resolved Hide resolved
summary.Rmd Show resolved Hide resolved
summary.Rmd Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
summary.Rmd Show resolved Hide resolved
summary.Rmd Outdated Show resolved Hide resolved
nuest added a commit that referenced this pull request May 8, 2020
also putting some sentences on separate lines
@nuest nuest marked this pull request as draft May 11, 2020 07:41
@nuest
Copy link
Owner

nuest commented May 20, 2020

@vsoch FYI: I had a short call with @bdevans this afternoon and he explained some of the thinking behind the changes. We also discussed #76, where the two of you raise some relevant concerns. I'm on board with both changes (this PR and #76), as you seem to be. I asked Ben to go ahead preparing a larger PR, including the adjustments needed in the main text to reflect the visual summary developed here. We should have that beginning of next week 🤞.

@sje30 Could you hold of with your review another few days until Ben's PR is merged?

@sje30
Copy link
Collaborator

sje30 commented May 20, 2020 via email

@nuest
Copy link
Owner

nuest commented May 26, 2020

@bdevans Do you know when you'll be able to work on the next partial PR? I do not want to put pressure on you, but simply want to make sure I schedule some time to give feedback.

@bdevans
Copy link
Collaborator Author

bdevans commented May 26, 2020

Hi @nuest sorry for the delay - I've got busy with a conference. I've just submitted a PR for the #77. Once that is merged in, I'll aim to reorder the rules for this PR one evening over the next few days (should be done by Friday if not soon before).

@nuest
Copy link
Owner

nuest commented May 29, 2020

Good progress 🥳

@bdevans
Copy link
Collaborator Author

bdevans commented May 29, 2020

Thanks @nuest and sorry for the delay - it's more time-consuming than I expected!

I think that's all the changes made to the text now. I had thought about adding more subheadings to (loosely) correspond to the sub-rules but they wouldn't be necessary throughout the whole paper, so it might be better to leave it as it.

The other change to consider would be to start cutting out more off-topic advice in order to refocus the paper (on reproducibility). I've removed a couple of points in that spirit but further cuts might be controversial. I think it's best if everyone has a final read through now, keeping in mind that shaving off a few less relevant sentences could be helpful. Once everyone is ok with the text, I can start producing the summary figure.

@vsoch
Copy link
Collaborator

vsoch commented May 29, 2020

Was there a movement to change the focus to reproducibility (I think I missed it!)

@bdevans
Copy link
Collaborator Author

bdevans commented May 29, 2020

Hi @vsoch, maybe I misunderstood but I thought that was always the focus given the title. More recently there has been some discussion that the manuscript might be a bit long and the rules are not so "simple" anymore. Therefore, as I went through I tried to keep the focus on "Writing Dockerfiles for Reproducible Data Science" and considered removing text that was more "general tips and tricks for Docker" (however useful they might be). I didn't go far down that road though; I just think it is worth keeping in mind during subsequent read-throughs.

@vsoch
Copy link
Collaborator

vsoch commented May 29, 2020

Yeah I think you’re right, I probably never explicitly framed it like that because I was focused on the data science part. What were you thinking of trimming further (or have you not decided yet?)

@nuest
Copy link
Owner

nuest commented May 29, 2020

Thanks @bdevans !

Re. trimming: @sje30 offered to give the paper a good read again, and @betatim @benmarwick and @psychemedia also will have to give the paper another look anyway, since it changed quite a bit since the first preprint. So I'd invite them to

  1. identify content that could be cut
  2. watch out where subheadings could be helpful

I suggest to merge this now and use a new PR to add the actual graphic, but let the others start reading before you have created the actual visual summary as a graphic.

@vsoch @bdevans fine by you?

@nuest nuest marked this pull request as ready for review May 29, 2020 17:16
@nuest nuest changed the title [WIP] Visual summary Restructure rules and various small improvements/trimming based on discussion around visual summary May 29, 2020
@vsoch
Copy link
Collaborator

vsoch commented May 29, 2020

Fine by me! 🙆

@bdevans
Copy link
Collaborator Author

bdevans commented May 29, 2020

What were you thinking of trimming further (or have you not decided yet?)

There are probably quite a few little bits and pieces (too many to list) if I go back through e.g. like the point I removed about running docker system prune which is very good housekeeping but not so much to do with writing Dockerfiles or reproducibility.

I was just recommending that all co-authors keep their scalpels in hand and a laser focus as they reread it so we hopefully end up with a more concise and focussed document :)

@vsoch
Copy link
Collaborator

vsoch commented May 29, 2020

Ah interesting. I think that particular point on pruning is important to note - it's not technically labeled as a reproducibility thing, but if you don't do it (at least if I don't) your system can suddenly freeze up because you have too many containers. Although it's not perfectly in scope, that's the kind of tidbit I'd appreciate knowing in a paper like this, because it's relevant to my workflow and health of my computer.

@bdevans
Copy link
Collaborator Author

bdevans commented May 29, 2020

A useful tidbit indeed but it felt a bit too far off-piste for the paper to me (too small a hard disk or lack of RAM could cause similar problems but we needn't mention those in the article). However I'm happy to be overruled if people think it's better to keep it. I just try to edit on the basis that the more distantly related points we include, the more the main points will be drowned out.

Of course, this needn't be the end. I actually started a repo a year ago to collect notes on Docker for reproducible science. We could always write a follow-up paper of broader scope e.g. "Docker for (data) scientists: a rough guide"! 😄 🐳

@bdevans
Copy link
Collaborator Author

bdevans commented May 29, 2020

@nuest This PR is marked by you as "changes requested" but I can't see what that's referring to. Has they been dealt with or is there something you still want me to change?

@nuest nuest merged commit bd1afbe into nuest:master May 29, 2020
@nuest
Copy link
Owner

nuest commented May 29, 2020

Maybe we should save the content we remove in a tidbit.md here in the repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants