training.py: performs the actual training, using a standard linear regression model with feature selection.
data_prep.py: performs general data formatting.
drug_models.csv: contains the output, consisting of the drug name, the loss (MSE), and an array of coefficients.