title: Version Control with RStudio and GitHub
author: Brian High
date: ![CC BY-SA 4.0](../images/cc_by-sa_4.png)
output:
  ioslides_presentation:
    fig_caption: true
    fig_retina: 1
    fig_width: 5
    fig_height: 3
    keep_md: true
    smaller: true
    logo: ../images/logo_128.png

Learning Objectives

You will learn:

  • What version control is and why you should care
  • The basic operations of the Git version control system
  • How to use Git and GitHub to manage projects
  • How to use Git and GitHub from within RStudio

Version Control

A modern version control system typically:

  • Tracks changes to one or more files
  • Resembles Apple Time Machine combined with the "Track Changes" feature of MS Word
  • Records changes to one or more files as a single "commit"
  • Works with any type of file, especially plain text and code
  • Allows multiple users to work with the same files concurrently
  • May also be called a revision control system

Image: Michael Ernst

Version Control Features

Most modern version control systems:

  • Provide logging and status reports for ease of tracking
  • Allow you to compare versions or revert to past versions
  • Let you share files and merge changes from others
  • Handle merging for you, transparently (in most cases)
  • Let you collaborate through a server or website like GitHub
  • Sync changes with the server instead of emailing files around

Image: Michael Ernst

Git: Distributed Version Control

  • Hugely popular, free, open source, and cross-platform
  • Distributed, decentralized design allows offline use
  • Integrated with apps like RStudio and MS Visual Studio Code
  • The version control "engine" behind sites like GitHub
  • Increasingly popular for scientific research projects

Image: Michael Ernst

GitHub: Social Coding

  • The most popular web-based (github.com) host of Git repositories
  • Free personal account for public and private repositories
  • Free team accounts for educators and researchers
  • Repository browser with syntax highlighting and text editing
  • Integrated issue tracking, stats, and wiki
  • Offers "forking" and "pull requests" for collaboration workflow
  • Hosts GitHub Pages (github.io) and GitHub Gists (gist.github.com)
  • Provides GitHub Desktop (desktop.github.com) and GitHub Atom editor (atom.io)

Installing Git

  • Microsoft Windows does not come with Git installed
  • Apple OS X (macOS) comes with an old version (but it might work okay for you)
  • Installers available from: https://git-scm.com/
  • For the Windows installer, allow changes to the system PATH
  • RStudio searches PATH to find "git"; if it is not found, configure its path manually
  • Git is already installed on most of our departmental servers
  • Git for Windows provides a Bash shell
  • GitHub Desktop is optional

Configuring Git

  • You need to configure your username and email address.
  • Configuring a default editor and color support is helpful, too.
  • Run these commands from a shell (Bash, Windows Command Prompt, etc.).
git config --global user.name "John Doe"
git config --global user.email [email protected]
git config --global color.ui true
git config --global core.editor nano         # Or your favorite text editor

git config --global credential.helper cache  # Optional: cache password for 15 min (the default)
git config --global credential.helper 'cache --timeout=3600' # Or use a 1 hr timeout instead (replaces the line above)

Edit these commands as needed for your name, email, and preferred editor.

If you store them in a shell script, you can easily rerun them on any other system you use. (Or you can just copy, paste, edit, and run these commands.)
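Such a script might look like the sketch below. The script name, the name and email values, and the variable names are placeholders, not part of the original slides. As written, it rehearses the settings in a throwaway HOME directory so you can check the result before touching your real configuration.

```shell
#!/bin/sh
# setup_git.sh (hypothetical name): reusable Git settings for a new machine.
set -e

# Placeholder values, not from the slides: replace with your own.
git_name="Jane Doe"
git_email="jane@example.com"
git_editor="nano"

# Rehearse in a throwaway HOME so your real settings stay untouched.
# Delete the next two lines to apply the settings to your actual account.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

git config --global user.name "$git_name"
git config --global user.email "$git_email"
git config --global color.ui true
git config --global core.editor "$git_editor"
git config --global credential.helper 'cache --timeout=3600'

# Print what was stored so you can check it.
git config --global --list
```

Keeping the personal values in variables at the top means only three lines need editing when you reuse the script elsewhere.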

Git Integration in RStudio

  • RStudio can import (clone) a Git project from a server
  • RStudio can perform all common operations from the GUI
    • From the menu: Tools -> Version Control
    • From the Git tab next to the History tab

NOTE:

GitHub features like forking and pull requests are not currently implemented in RStudio's Git features.

Demo: GitHub Operations

We will demonstrate the following operations in GitHub:

Demo: Git Operations in RStudio

We will demonstrate the following Git operations in RStudio:

  • Clone a Git repository into a new RStudio Project
  • Create and edit files in RStudio
  • Add changed files to a Git commit
  • Commit the changes with a Git commit message
  • Push the commit to GitHub
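For reference, the same five steps can be run from a shell. This is a rough sketch, not the demo itself: a local bare repository stands in for GitHub so it runs offline, and the sandbox paths, file names, and the "Jane Doe" identity are all made up for illustration.

```shell
set -e
sandbox="$(mktemp -d)"

# Identity for this script only (hypothetical values).
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="jane@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="jane@example.com"

# Stand-in for GitHub: one seed commit published to a local bare repository.
git init --quiet "$sandbox/seed"
echo "# Demo project" > "$sandbox/seed/README.md"
git -C "$sandbox/seed" add README.md
git -C "$sandbox/seed" commit --quiet -m "Initial commit"
git clone --quiet --bare "$sandbox/seed" "$sandbox/remote.git"

# 1. Clone the repository (RStudio: File > New Project > Version Control > Git).
git clone --quiet "$sandbox/remote.git" "$sandbox/project"
cd "$sandbox/project"

# 2. Create and edit a file.
echo 'summary(cars)' > analysis.R

# 3. Add the changed file to the next commit (RStudio: tick "Staged").
git add analysis.R

# 4. Commit the change with a message.
git commit --quiet -m "Add first analysis script"

# 5. Push the commit to the remote.
git push --quiet origin HEAD

git log --oneline
```

With a real GitHub repository, the clone step would take the repository's URL in place of the local path.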

Common Git Operations

| Command | What it does |
|---------|--------------|
| clone | Copy a repository into a new folder |
| pull | Fetch and integrate changes |
| add | Stage (designate) files to be committed |
| commit | Record changes to the repository |
| push | Send changes to remote repository (server) |
| diff | Show changes between commits |
| log | Show commit logs |
| status | Show the current status of files and changes |
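The inspection commands (status, diff, log) are easiest to understand by trying them in a scratch repository. The sketch below assumes a temporary directory and a hypothetical file called notes.txt; the identity values are placeholders.

```shell
set -e
sandbox="$(mktemp -d)"

# Identity for this script only (hypothetical values).
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="jane@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="jane@example.com"

git init --quiet "$sandbox/demo"
cd "$sandbox/demo"

echo "first line" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# Edit the tracked file, then inspect it before committing.
echo "second line" >> notes.txt
git status --short      # " M notes.txt": modified, not yet staged
git diff                # unstaged changes, line by line

git add notes.txt
git diff --staged       # what the next commit will contain
git commit --quiet -m "Extend notes"

git log --oneline       # one line per commit, newest first
git diff HEAD~1 HEAD    # changes between the last two commits
```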

Git and GitHub Glossary

  • Repository (repo): A repository is the most basic element of GitHub. It is easiest to imagine as a project's folder.
  • Branch: A parallel version of a repository
  • Clone: Local copy of the repository
  • Commit: An individual change to a file (or set of files). Commits usually contain a commit message.
  • Fork: A personal copy of another user's repository that lives on your account
  • Merge: Merging takes the changes from one branch (in the same repository or from a fork), and applies them into another.
  • Pull: Fetching changes from a remote repository and merging them into your local copy.
  • Pull Request: Proposed changes to a repository, submitted by a user and accepted or rejected by the repository's collaborators.
  • Push: Sending your committed changes to a remote repository such as GitHub.com.
  • Issue: A GitHub feature/bug tracking "ticket" (and discussion thread)
  • Close: To complete a GitHub Issue or Pull Request workflow

More details here. Thanks to Raphael Gottardo for this slide.
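To make the Branch and Merge entries concrete, here is a small sketch in a scratch repository. The branch name add-methods, the file report.md, and the identity values are invented for the example.

```shell
set -e
sandbox="$(mktemp -d)"

# Identity for this script only (hypothetical values).
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="jane@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="jane@example.com"

git init --quiet "$sandbox/demo"
cd "$sandbox/demo"

echo "intro" > report.md
git add report.md
git commit --quiet -m "Start report"

# Remember the default branch name (it may be "main" or "master").
base="$(git symbolic-ref --short HEAD)"

# Branch: start a parallel line of work and commit on it.
git checkout --quiet -b add-methods
echo "methods" >> report.md
git commit --quiet -am "Add methods section"

# Merge: apply the branch's commits to the original branch.
git checkout --quiet "$base"
git merge --quiet add-methods

git log --oneline
```

Because the original branch did not change in the meantime, Git can complete this merge by simply moving the branch forward (a "fast-forward").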

Typical Git and GitHub Workflow

Project initialization

  1. Create a new, empty repo in GitHub
  2. Clone it to your local system, or push an existing local repo to GitHub
  3. Create README, LICENSE, and .gitignore files, then push/pull to sync
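The "push from existing local repo" route might look like the sketch below. A local bare repository stands in for the new, empty GitHub repo so the example runs offline. The paths, the file contents, and the .gitignore entries (common R/RStudio choices) are assumptions rather than part of the original slides.

```shell
set -e
sandbox="$(mktemp -d)"

# Identity for this script only (hypothetical values).
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="jane@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="jane@example.com"

# 1. Stand-in for "create a new, empty repo in GitHub".
git init --quiet --bare "$sandbox/remote.git"

# 2. Put an existing local project under version control.
mkdir "$sandbox/project"
cd "$sandbox/project"
git init --quiet

# 3. Create the starter files and commit them.
echo "# My project" > README.md
echo "License text goes here" > LICENSE
printf '%s\n' ".Rhistory" ".RData" ".Rproj.user/" > .gitignore
git add README.md LICENSE .gitignore
git commit --quiet -m "Add README, LICENSE, and .gitignore"

# Connect the local repo to the remote and publish the current branch.
git remote add origin "$sandbox/remote.git"
git push --quiet -u origin HEAD
```

The `-u` option records the remote branch as the upstream, so later `git pull` and `git push` calls need no extra arguments.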

Project continuation

  1. Add collaborators to GitHub repo if needed
  2. Always do a pull when you begin a work session and before a commit
  3. To commit changes, add changed files, commit with a message, and push
  4. Check your GitHub Issues and resolve/close issues as separate commits
  5. Check your GitHub Pull Requests, review the code, merge (or not), and close
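The "pull first" habit can be rehearsed offline with two clones of one shared repository. In this sketch the collaborators "alice" and "bob", the file names, and the local bare repository standing in for GitHub are all hypothetical.

```shell
set -e
sandbox="$(mktemp -d)"

# Identity for this script only (hypothetical values, shared by both clones).
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="jane@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="jane@example.com"

# Shared remote seeded with one commit (stand-in for a GitHub repo).
git init --quiet "$sandbox/seed"
echo "# Shared project" > "$sandbox/seed/README.md"
git -C "$sandbox/seed" add README.md
git -C "$sandbox/seed" commit --quiet -m "Initial commit"
git clone --quiet --bare "$sandbox/seed" "$sandbox/remote.git"

# Two collaborators each clone the shared repository.
git clone --quiet "$sandbox/remote.git" "$sandbox/alice"
git clone --quiet "$sandbox/remote.git" "$sandbox/bob"

# Alice commits and pushes while Bob is away.
cd "$sandbox/alice"
echo "data cleaning notes" > cleaning.md
git add cleaning.md
git commit --quiet -m "Add cleaning notes"
git push --quiet origin HEAD

# Bob starts his session with a pull, so he builds on Alice's work.
cd "$sandbox/bob"
git pull --quiet
echo "model notes" > model.md
git add model.md
git commit --quiet -m "Add model notes"
git push --quiet origin HEAD
```

Had Bob skipped the pull, his push would have been rejected because the remote already held a commit he did not have.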

Exercises

Exercise #1

Clone a Git repo into RStudio as a new project.

Exercise #2

Create a GitHub account (if you need to) and use your account to fork a repo.

Exercise #3

Clone your fork from exercise #2 into RStudio, make an edit, and push your edit to GitHub.

Exercise #4 (Extra Credit)

Complete exercises #2 and #3 and then create a pull request.

                                                                                        
                                                  ,,                                    
  .g8""8q.                                 mm     db                           ,M"""b.  
.dP'    `YM.                               MM                                  89'  `Mg 
dM'      `MM `7MM  `7MM  .gP"Ya  ,pP"Ybd mmMMmm `7MM  ,pW"Wq.`7MMpMMMb.  ,pP"Ybd    ,M9 
MM        MM   MM    MM ,M'   Yb 8I   `"   MM     MM 6W'   `Wb MM    MM  8I   `" mMMY'  
MM.      ,MP   MM    MM 8M"""""" `YMMMa.   MM     MM 8M     M8 MM    MM  `YMMMa. MM     
`Mb.    ,dP'   MM    MM YM.    , L.   I8   MM     MM YA.   ,A9 MM    MM  L.   I8 ,,     
  `"bmmd"'     `Mbod"YML.`Mbmmd' M9mmmP'   `Mbmo.JMML.`Ybmd9'.JMML  JMML.M9mmmP' db     
      MMb                                                                               
       `bood'

Additional Resources