The project contains three files:
- run_analysis.R: R code that obtains the data and performs the necessary actions to generate a tidy data set.
- README.md: Explains how to use the R code.
- Codebook.md: Explains the operations performed on the data and the various data frames that are created.
The files are organized as follows:
- run_analysis.R: Present in the main/ directory.
- README.md: Present in the root directory.
- Codebook.md: Present in the main/ directory.
run_analysis.R has the following requirements:
- The raw data set (getdata-projectfiles-UCI HAR Dataset.zip) should be downloaded and saved in the main/ directory. The raw data can be downloaded from here.
- The run_analysis.R script requires the reshape2 package to be installed in the R library used by RStudio.
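The two requirements above can be checked before running the script. The following is a minimal sketch, not part of run_analysis.R; the helper name check_requirements is hypothetical, and it assumes the zip file keeps the name given above.

```r
# Hypothetical pre-flight check; run_analysis.R itself does not define this.
zip_name <- "getdata-projectfiles-UCI HAR Dataset.zip"

check_requirements <- function(dir = getwd()) {
  c(
    # Is the zipped raw dataset present in the given directory?
    zip_present = file.exists(file.path(dir, zip_name)),
    # Is reshape2 installed? (Checks availability without attaching it.)
    reshape2_installed = requireNamespace("reshape2", quietly = TRUE)
  )
}

print(check_requirements())
```

If reshape2_installed is FALSE, the package can be installed from CRAN with install.packages("reshape2").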
run_analysis.R defines the run_analysis() function. This function does not take any arguments. It assumes only that the zipped raw dataset is present in the same folder as the script.
The main/ directory must be set as the working directory in RStudio. The raw dataset must be present in the main/ directory as described in the requirements. The script can be run by calling the function from the R console as follows:
run_analysis()
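For a non-interactive session, the steps above (set the working directory, load the script, call the function) might be wrapped as follows. This is a sketch under assumptions: the helper name run_from is hypothetical, and the path "main" assumes the session starts in the project root.

```r
# Hypothetical wrapper; not part of the project files.
run_from <- function(main_dir) {
  script <- file.path(main_dir, "run_analysis.R")
  if (!file.exists(script)) {
    stop("run_analysis.R not found in: ", main_dir)
  }
  # The script expects main/ to be the working directory;
  # restore the previous working directory when done.
  old <- setwd(main_dir)
  on.exit(setwd(old))
  # Define run_analysis() inside this function's environment, then call it.
  source("run_analysis.R", local = TRUE)
  run_analysis()
}

# Only attempt the run if the script is where this sketch assumes it is.
if (file.exists(file.path("main", "run_analysis.R"))) {
  tidy <- run_from("main")
}
```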
- The code produces several print statements to indicate the progress of the function.
- A folder called "working" is created in the main/ directory. It contains:
  - A final tidy dataset titled "tidy_data.txt". Please refer to Codebook.md for details regarding the tidy dataset.
  - A folder titled "UCI HAR Dataset" containing the unzipped raw dataset.
- The tidy data frame (named average_values_df in the script) is also returned by the function.
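The tidy dataset can also be loaded back from disk in a later session. The sketch below assumes that tidy_data.txt is a whitespace-delimited text file with a header row (this README does not state the exact format; see Codebook.md); the helper name read_tidy is hypothetical.

```r
# Hypothetical helper; not part of run_analysis.R.
read_tidy <- function(main_dir = getwd()) {
  path <- file.path(main_dir, "working", "tidy_data.txt")
  if (!file.exists(path)) {
    stop("tidy_data.txt not found; run run_analysis() first: ", path)
  }
  # Assumption: the file has a header row and whitespace-separated columns.
  read.table(path, header = TRUE)
}

# Only read the file if run_analysis() has already produced it.
if (file.exists(file.path("working", "tidy_data.txt"))) {
  tidy <- read_tidy()
  str(tidy)
}
```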