This repository contains my final project for STAT 139: Introduction to Linear Models, completed in collaboration with Joseph Doraid '23. I took the course as an undergraduate student at Harvard during the Fall '22 semester. The project investigates the statistical relationship between NBA teams' on-court performance and their future revenue, controlling for other factors.
A full writeup of the project can be found in "writeup.pdf".
The code used to investigate and model the relationship between revenue and performance can be found in "code.Rmd".
The raw data used in this project can be found in the "raw_data" folder.
The code used to clean the raw data can be found in "clean_data.ipynb".
The cleaned dataset used in this project can be found in "cleaned.csv".