tri-cods/tidy-data

Tidy(ish) Data

To begin thinking about digital methods, scholars must first make the conceptual leap toward thinking about their research as data. How do we get at the data in our research, and how do we make it useful and usable by machines? What are some of the promises (and perils) of reframing research as data? In this session, we'll be introduced to strategies and tools for taking very different kinds of information and creating well-formed data that can then be used for analysis or visualization.

By way of introduction to working with data, we are going to focus on two things: a) conceptually, how data is structured under the tidy data framework, and b) practically, how to make it usable by other humans and machines with the program OpenRefine.

In this session, we will:

  • install and become familiar with some of OpenRefine's features
  • import and export derivative datasets
  • sort, filter, and facet data
  • fix errors and inconsistencies at scale
  • split columns with multiple values
  • introduce regular expressions
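
To make the goals above concrete, here is a minimal sketch of two of them, splitting a column with multiple values and fixing inconsistencies with a regular expression. It is written in plain Python rather than OpenRefine, purely for illustration; the session itself does this interactively in OpenRefine. The sample rows, the column names, the `;` separator, and the helper names `normalize_whitespace` and `tidy` are all assumptions made up for this example.

```python
import re

# Hypothetical messy rows: one cell holds several authors,
# and the place names are spelled inconsistently.
messy_rows = [
    {"title": "Book A", "authors": "Smith, J.; Lee, K.", "place": " new york "},
    {"title": "Book B", "authors": "Garcia, M.", "place": "New  York"},
]


def normalize_whitespace(value):
    """Collapse runs of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()


def tidy(rows):
    """Return one row per (title, author) pair with cleaned place names."""
    tidy_rows = []
    for row in rows:
        # Fix inconsistent spacing and capitalization in the place column.
        place = normalize_whitespace(row["place"]).title()
        # Split the multi-valued authors cell on the assumed ";" separator.
        for author in row["authors"].split(";"):
            tidy_rows.append({
                "title": row["title"],
                "author": normalize_whitespace(author),
                "place": place,
            })
    return tidy_rows


tidy_rows = tidy(messy_rows)
for row in tidy_rows:
    print(row)
```

Each author ends up on a separate row, and both spellings of the place collapse to a single value, which is roughly what a tidy version of this table would look like.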

Get Started >>>

Glossary >>>


Working with Data

Tidy vs Messy Data

Introducing OpenRefine

Exploring OpenRefine

Tidy vs Messy Data Part II

Transforming Columns

Exporting Data

Deduplicating Rows (Won't be covered in this session, but feel free to explore on your own)

Resources

OpenRefine Introductory Video Tutorials

Programming Historian's Cleaning Data with OpenRefine

Tidy Data

Glossary


Session Leaders: Anna Lacy and James Truitt

Creative Commons License

Digital Research Institute (DRI) Curriculum by Graduate Center Digital Initiatives is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Based on a work at https://github.com/DHRI-Curriculum. When sharing this material or derivative works, preserve this paragraph, changing only the title of the derivative work, or provide comparable attribution.
