PSYC 81.09 (Storytelling with Data) is organized into two main parts. Part I comprises four modules that collectively introduce students to the process of creating "data stories" using Python data science tools:
- Module 1: What makes a good story?
- Module 2: Visualizing data
- Module 3: Python and Jupyter notebooks as a medium for data storytelling
- Module 4: Data science tools
Part II is project-based and revolves around mini data science projects. For each project, one or more students choose a question and dataset to explore and turn into a data story. Each week, students and groups will report on their progress with the latest iterations of their stories. Students should aim to participate in three or more projects during Part II of the course. At students' discretion, those three (or more) projects may address the same questions and/or datasets (e.g., with each story building on the previous one), or multiple questions and/or datasets that may or may not be related. In addition, students are encouraged to build on each other's code, projects, and questions. Projects and project groups should form organically and should remain flexible to accommodate changing goals and interests.
Lecture recordings are denoted by bolded links below.
- Welcome message
- Learning remotely: tools, tips, and tricks for engaging with online aspects of the course.
- Course syllabus
- Discussion: the pursuit of truth
- Introduction to storytelling (Source: Khan Academy's The art of storytelling)
- What makes a great story? (Source: Khan Academy's The art of storytelling)
- Exercise: Telling a story about a vivid memory (Source: Khan Academy's The art of storytelling)
- Structuring stories to make them effective communication tools (Source: Khan Academy's The art of storytelling)
- Using pictures to tell a story (Source: Khan Academy's The art of storytelling)
- Pitching your story (or idea!) (Source: Khan Academy's The art of storytelling)
- Giving constructive feedback (Source: Khan Academy's The art of storytelling)
- Using feedback to improve your story (Source: Khan Academy's The art of storytelling)
- Assignment 1: tell the class a 5-minute story. You may find some inspiration by taking a look at some Moth Radio Hour stories.
- Introduction to representing data (Source: Khan Academy's Representing Data course)
- Designing effective scientific figures (Source: Aiora Zabala)
- Maximizing the data-to-ink ratio (Source: medium.com)
- Take a look at A Layered Grammar of Graphics by Hadley Wickham (skim or skip the code examples) and The Grammar of Graphics by Leland Wilkinson.
- Discussion: telling effective stories about data
- Communicating with sound: in-class discussion with director, editor, and producer Sam Green and class "field trip" to a showing of 32 SOUNDS (Special thanks to the Hopkins Center for the Arts at Dartmouth for donating tickets to all PSYC 81.09 students in the W24 term!)
- Assignment 2: data story remix. You may find some inspiration by taking a look at one or more of the following sources of data stories:
- Workshopping data story ideas
- Discussions about data stories (part 1)
- Discussions about data stories (part 2)
- Discussion: Introduction to programming
- Getting set up on Google Colaboratory (Source: Introduction to Programming for Psychological Scientists)
- Project management and version control with git and GitHub:
- Overview of Git and GitHub
- Git basics part 1: fork, clone, and status
- Git basics part 2: add, rm, mv, commit, push, and pull
- Intermediate git: ignore, revert, checkout, branch, merge, and remote
- Handling git merge conflicts
- GitHub project management tools
- Optional (ungraded) assignment: GitHub Fundamentals [Accept assignment]
- Introduction to using the command line (Source: codecademy)
- High-level introduction to Python
- Resources for learning Python:
- Introduction to Programming for Psychological Scientists
- Codecademy's Introduction to Python (Source: codecademy)
- Learning to code with Python and Jupyter Notebooks (Source: introtopython.org)
- DataCamp has generously donated free access to all course materials for Dartmouth students enrolled in Storytelling with Data. I have pinned an invite link to the #general channel in Slack.
- Assignment 3: Binary converter [Accept assignment]
- Python continued: list comprehensions and decorators
- Assignment 3 Q&A, introduction to Python modules, preview of Python data science stack [slides]
- Modules and Packages (from Whirlwind Tour of Python by Jake VanderPlas)
- Numpy and Pandas (from Python Data Science Handbook by Jake VanderPlas)
- Introduction to NumPy [slides]
- Introduction to Pandas [slides]
- (Optional) practice problems for NumPy and Pandas
- Data visualization overview
- More details on plotting libraries: Matplotlib and Seaborn (from Python Data Science Handbook by Jake VanderPlas)
- Grammar of graphics in Python (Source: towardsdatascience.com)
- Visualizing high-dimensional data with Hypertools (Source: hypertools.readthedocs.io)
- Interactive lecture: exploring and visualizing a sample dataset [notebook]
- Assignment 4: tell your first (notebook-based) data story!
- Discussion with Climate Interactive (2024)
- Discussion with Vermont Department of Health (2022) [slides] [sample notebook]
- Story ideas workshop and brainstorm
- Debugging session (Part I)
- Debugging session (Part II)
- Debugging session (Part III)
- Story critiques
We will spend Part II of the course repeating three general steps in the storytelling process (a video introduction to Part II may be found here):
- Pitching and brainstorming. You'll present your ideas to your classmates, form groups, and workshop story ideas.
- Brainstorm session
- Refinement. We'll workshop your (and your group's) ideas and code. You can also use this time to bring up new content ideas that you'd like to learn more about.
- Critiquing. As a class, we will discuss your story and provide constructive feedback. We'll also go through your code and discuss any coding issues (e.g., challenges, clever hacks, etc.) that might be relevant to the class.
You should plan to make it through this cycle at least three times during Part II of the course (i.e., you should produce at least 3 data stories).
Each data story should be contained in a single sub-folder of data-stories. Your project should comprise the following files, based on this project template:
- A README.md markdown file based on this template. The README file should contain:
- A project description and overview.
- A link to a YouTube video of your (5 minute) data story. (A playlist containing the current set of data stories may be found here.)
- Links to the data you analyzed.
- Instructions for replicating your results.
- A description of how someone could contribute to your project.
- Acknowledgements and citations.
- Your data story (hosted on YouTube and cited in your README file). Note: you don't need to upload the source video, but if you use any images or slides you should include them in a sub-folder.
- Your project's code (e.g., notebooks, Python scripts, etc.), based on this template.
- Your project's data. If your data files total under 10 MB, you can include them directly in your project folder. Otherwise, you should host them on Google Drive, Dropbox, or some other cloud-based source accessible to all (current and future) students. Regardless of how your project's data files are hosted, your notebook should include code for downloading and importing the data (so from a user's perspective it shouldn't matter where the data files are hosted).
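
As a rough illustration of the last requirement, here is a minimal sketch of a download-and-import cell using only the Python standard library. It is one possible approach, not a required one. The URL, the `data` folder name, and the function names `fetch_data` and `load_rows` are all hypothetical placeholders, and the sketch assumes the URL points directly at a CSV file (sharing links from some cloud services may need to be converted to direct-download links first).

```python
import csv
import urllib.request
from pathlib import Path

# Hypothetical placeholder: replace with the real link to your hosted data file.
DATA_URL = "https://example.com/my-dataset.csv"
# Hypothetical local folder used to cache downloaded files.
DATA_DIR = Path("data")


def fetch_data(url, dest_dir=DATA_DIR, filename=None):
    """Download `url` into `dest_dir` unless a local copy exists; return the local path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Default to the last component of the URL as the local file name.
    name = filename or url.rstrip("/").split("/")[-1]
    local_path = dest_dir / name
    if not local_path.exists():
        # Only hit the network (or other source) when there is no cached copy.
        urllib.request.urlretrieve(url, local_path)
    return local_path


def load_rows(path):
    """Read a CSV file into a list of dictionaries, one per row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Example usage in a notebook cell (commented out because DATA_URL is a placeholder):
# rows = load_rows(fetch_data(DATA_URL))
```

Because the download is skipped when a local copy exists, the same cell works whether the data files are bundled in the project folder or hosted elsewhere. In a notebook that uses Pandas, the path returned by `fetch_data` could instead be passed to `pandas.read_csv`.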