
svjmtn/ml-major-project-g3

About the code
===============

All of our code is in Jupyter notebooks. Most of the dependencies are listed in requirements.txt.

In addition to the dependencies listed in requirements.txt, some of our notebooks use plotly and graphviz.
graphviz requires a system library to be installed in addition to the Python library (which is covered by requirements.txt). Installation details are available here: https://www.graphviz.org/download/
plotly requires a JupyterLab or Jupyter Notebook extension to be installed in order to view the visualisations. Instructions can be found here: https://binnisb.github.io/blog/datascience/2020/04/02/Plotly-in-lab.html#Set-up-and-examples
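
Before opening the visualisation notebooks, it may help to confirm that these optional pieces are present. The following is a minimal sketch, not part of the repository: the helper name `check_visualisation_deps` is hypothetical, and it only checks that the Python packages are importable and that the Graphviz `dot` executable is on the PATH. It does not check the Jupyter extension for plotly.

```python
import importlib.util
import shutil


def check_visualisation_deps():
    """Report which optional visualisation dependencies appear to be available.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    return {
        # Python packages: find_spec returns None when a package is not importable.
        "plotly": importlib.util.find_spec("plotly") is not None,
        "graphviz": importlib.util.find_spec("graphviz") is not None,
        # System library: Graphviz installs the `dot` executable, which the
        # Python graphviz package calls to render graphs.
        "dot": shutil.which("dot") is not None,
    }


if __name__ == "__main__":
    for name, available in check_visualisation_deps().items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

If `dot` is reported as missing, the Python graphviz package can still be imported, but rendering the decision tree is likely to fail until the system library is installed.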

Description of files in data/
==============================

backup : This folder contains a backup of the other files in the data directory, in case something gets overwritten or deleted by accident.

OxCGRT_latest.csv, codebook.md, owid-covid-data.csv, owid-covid-codebook.csv : All four of these files are raw data files downloaded straight from the internet. Their sources and contents are described in ./src/ml-major-proj-data-upload.ipynb . NOTE: This data was fetched on 16th April 2021, the current data

uk_combined.csv, uk_regionwise_ox.csv : These contain data specific to the UK that is extracted from owid-covid-data.csv and OxCGRT_latest.csv and then merged together. These files are output by the notebook ./src/ml-major-proj-uk-data-upload.ipynb . uk_combined.csv contains national-level data for the UK. uk_regionwise_ox.csv contains region-wise stringency measures data for England, Scotland, Wales and Northern Ireland separately.

uk_processed.csv : This contains preprocessed data that is specific to the UK. It is generated by ./src/ml-major-proj-uk-preprocessing.ipynb from uk_combined.csv .


Description of files in src/
============================

ml-major-proj-data-upload.ipynb : Downloads the raw data from the internet from 2 different sources. Produces ./data/OxCGRT_latest.csv, ./data/owid-covid-data.csv, ./data/codebook.md and ./data/owid-covid-codebook.csv . The wandb variant of this uploads the downloaded files to W&B servers.

ml-major-proj-uk-data-upload.ipynb : Separates out UK-specific data from the output files of the above notebook. Produces ./data/uk_combined.csv and ./data/uk_regionwise_ox.csv . The wandb variant fetches the raw files from W&B servers and uploads the output files back to the W&B server.
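
The extract-and-merge step described above can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's actual code: the function name `extract_uk_rows` is hypothetical, and the column names (`iso_code`, `date`, `new_deaths` for the OWID file; `CountryCode`, `Jurisdiction`, `Date`, `StringencyIndex` for the OxCGRT file), the `GBR` and `NAT_TOTAL` values, and the date formats are assumptions about the raw files that should be checked against the codebooks in data/.

```python
def extract_uk_rows(owid_rows, ox_rows):
    """Join UK national-level rows from both sources on date (inner join).

    Both arguments are iterables of dicts, e.g. from csv.DictReader.
    All column names used here are assumptions, not confirmed by the README.
    """
    # Index the OxCGRT rows by ISO date. OxCGRT dates are assumed to be
    # compact strings (YYYYMMDD), OWID dates ISO strings (YYYY-MM-DD).
    ox_by_date = {}
    for row in ox_rows:
        if row["CountryCode"] == "GBR" and row["Jurisdiction"] == "NAT_TOTAL":
            d = row["Date"]
            ox_by_date[f"{d[:4]}-{d[4:6]}-{d[6:]}"] = row

    combined = []
    for row in owid_rows:
        if row["iso_code"] != "GBR":
            continue  # keep only UK rows
        ox = ox_by_date.get(row["date"])
        if ox is None:
            continue  # no stringency data for this date, so drop the row
        combined.append({
            "date": row["date"],
            "new_deaths": row["new_deaths"],
            "StringencyIndex": ox["StringencyIndex"],
        })
    return combined
```

With the real files, the rows could be read with `csv.DictReader` (or the equivalent filter and merge could be done with pandas), and the result written to ./data/uk_combined.csv .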

ml-major-proj-uk-preprocessing.ipynb : Does some general preprocessing of the ./data/uk_combined.csv file output by the above notebook. Produces ./data/uk_processed.csv . The wandb variant fetches the input files from W&B and uploads the output files back to the server.

ml-major-project-visualisations.ipynb : Does some EDA and contains a set of visualisations, both specific to the UK and showing the UK in comparison with the rest of the world. Uses ./data/uk_combined.csv and ./data/owid-covid-data.csv . The wandb variant does the same but fetches the input files from W&B servers.

ml-major-project-lstm-model.ipynb : Fits an LSTM model to the data from ./data/uk_processed.csv to predict new deaths, and does a what-if analysis based on changes in lockdown measures. Also contains visualisations of the predictions. The wandb variant does the same, but fetches the input files from the server and also logs metrics to the server.
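
Sequence models such as an LSTM are usually trained on fixed-length windows of past values paired with the next value to predict. The sketch below shows that general idea only; it is an assumption about how such data is commonly prepared, not a description of what the notebook does, and the function name `make_windows` and the window length are hypothetical.

```python
def make_windows(series, window):
    """Split a sequence into (inputs, target) pairs for next-step prediction.

    Each pair holds `window` consecutive values and the value that follows them.
    """
    pairs = []
    for start in range(len(series) - window):
        inputs = series[start:start + window]  # the past `window` values
        target = series[start + window]        # the value to predict
        pairs.append((inputs, target))
    return pairs


# Toy values for illustration only; these are not taken from the dataset.
print(make_windows([1, 2, 3, 4, 5], window=3))
# prints [([1, 2, 3], 4), ([2, 3, 4], 5)]
```

A series shorter than or equal to the window length produces no pairs, so the window has to be chosen with the length of the available data in mind.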

ml-major-project-decision-trees.ipynb : Fits a Decision Tree to the data from ./data/uk_processed.csv , attempting the same task as the LSTM notebook above. Contains visualisations of the predictions and of the decision tree.

wandb : This folder contains the wandb variants referred to above. These notebooks contain some wandb-specific code and hence cannot be run without access to the W&B project.

./src/wandb/ml-major-proj-decision-trees-tuning.ipynb : Does a hyperparameter search for the decision tree and logs metrics for every possible fit to the W&B server.
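
Enumerating every combination in a hyperparameter grid can be sketched as below. This is a plain-Python illustration only: the notebook may rely on W&B's own tooling instead, the helper name `iter_grid` is hypothetical, and the parameter names and values in the example grid are assumptions rather than the ones actually searched.

```python
from itertools import product


def iter_grid(param_grid):
    """Yield one dict per combination of the values in `param_grid`."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical grid for illustration; not the grid used in the notebook.
grid = {"max_depth": [2, 4], "min_samples_leaf": [1, 5, 10]}

for params in iter_grid(grid):
    # A real search would fit a tree with `params` here and log its metrics.
    print(params)
```

The number of fits grows multiplicatively with each added parameter (2 x 3 = 6 combinations here), which is worth keeping in mind when every fit is logged.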
