A project for the Machine Learning Operations course based around a Super Resolution model.

Run `make help` to see the most important commands together with descriptions of what they do.

The project is organized as follows:

    ├── LICENSE
    ├── Makefile              <- Makefile with commands like `make data` or `make train`.
    ├── README.md             <- The top-level README for developers using this project.
    ├── azure                 <- Contains scripts for deploying/training models using Microsoft Azure.
    ├── data
    │   ├── external          <- Data from third party sources.
    │   ├── interim           <- Intermediate data that has been transformed.
    │   ├── processed         <- The final, canonical data sets for modeling.
    │   └── raw               <- The original, immutable data dump.
    │
    ├── docs                  <- A default Sphinx project; see sphinx-doc.org for details. CURRENTLY NOT IN USE.
    │
    ├── models                <- Trained and serialized models, model predictions, or model summaries.
    │
    ├── notebooks             <- Jupyter notebooks. Naming convention is a number (for ordering),
    │                            the creator's initials, and a short `-` delimited description, e.g.
    │                            `1.0-jqp-initial-data-exploration`.
    │
    ├── references            <- Data dictionaries, manuals, and all other explanatory materials.
    │
    ├── reports               <- Generated analysis as HTML, PDF, LaTeX, etc.
    │   └── figures           <- Generated graphics and figures to be used in reporting.
    │
    ├── requirements.txt      <- The requirements file for reproducing the analysis environment, e.g.
    │                            generated with `pip freeze > requirements.txt`.
    ├── requirements_test.txt <- The requirements file for running the tests.
    ├── setup.py              <- Makes the project pip installable (`pip install -e .`) so `src` can be imported.
    ├── src                   <- Source code for use in this project.
    │   ├── __init__.py       <- Makes `src` a Python module.
    │   │
    │   ├── data              <- Scripts to download or generate data.
    │   ├── hparams           <- `.yaml` files for hyperparameter configuration using Hydra.
    │   └── models            <- Scripts to train models and then use trained models to make
    │                            predictions.
    │
    └── tests                 <- Test scripts using pytest.

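The `setup.py` entry above is what makes `pip install -e .` work, so that `src` can be imported from anywhere in the project. As a rough sketch of what such a file typically contains (the exact metadata in this repository may differ):

```python
# Minimal setup.py sketch -- the actual file may declare more metadata.
from setuptools import find_packages, setup

setup(
    name="src",
    packages=find_packages(),
    version="0.1.0",
    description="Super Resolution project for the Machine Learning Operations course",
)
```

After an editable install, modules such as `src.models` and `src.data` can be imported from scripts, notebooks, and tests without any path manipulation.
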
The following checklist gives a good sense of what is included in the project:
- Created a git repository
- All members have write access to the repository
- Used a dedicated environment to keep track of packages
- Made the file structure using cookiecutter
- Filled out `make_dataset.py` to download the needed data
- Added a model file and a training script and got them running
- Profiled and optimized the code
- Filled `requirements.txt` with the dependencies used
- Wrote unit tests for parts of the codebase and measured code coverage
- Got continuous integration running on the GitHub repository
- Used either TensorBoard or wandb to log training progress and other important metrics/artifacts (see the logging sketch after this list)
- Complied with good coding practices throughout the project

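As an illustration of the logging item above, here is a minimal sketch of logging metrics to wandb; the project name, config values, and the dummy loss are placeholders rather than this repository's actual setup:

```python
# Sketch of logging training metrics to Weights & Biases (wandb).
# The project name, config values, and dummy loss below are placeholders.
import random

import wandb

wandb.init(project="super-resolution", config={"lr": 1e-3, "epochs": 5})

for epoch in range(wandb.config.epochs):
    train_loss = random.random()  # stand-in for the real epoch loss
    wandb.log({"epoch": epoch, "train_loss": train_loss})

wandb.finish()
```

The same pattern works with TensorBoard by swapping `wandb.log` for `torch.utils.tensorboard.SummaryWriter.add_scalar`.
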
- Set up and used Azure to train the model
- Played around with distributed data loading
- (Not curriculum) Reformatted the code in the PyTorch Lightning format (see the sketch after this list)
- Deployed the model using Azure
- Checked how robust the model is towards data drifting
- Deployed the model locally using TorchServe

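For the PyTorch Lightning item above, a rough sketch of what the Lightning format looks like; the single convolution layer and MSE loss are illustrative stand-ins, not the project's actual super-resolution network:

```python
# Illustrative LightningModule skeleton; the layers and loss are stand-ins,
# not the project's actual super-resolution architecture.
import pytorch_lightning as pl
import torch
from torch import nn


class LitSuperResolution(pl.LightningModule):
    def __init__(self, lr: float = 1e-3):
        super().__init__()
        self.lr = lr
        # Placeholder network: a single conv layer instead of the real model.
        self.net = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        self.criterion = nn.MSELoss()

    def forward(self, x):
        return self.net(x)

    def training_step(self, batch, batch_idx):
        low_res, high_res = batch
        loss = self.criterion(self(low_res), high_res)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)
```

With the module in place, training reduces to `pl.Trainer(max_epochs=10).fit(model, train_dataloader)`.
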
- Used Optuna to run hyperparameter optimization on the model
- Wrote one or multiple configuration files for the experiments
- Used Hydra to load the configurations and manage the hyperparameters (see the sketch after this list)

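As a sketch of how Hydra can load the configurations from `src/hparams` (the config name and the fields accessed below are hypothetical examples, not the project's actual files):

```python
# Sketch of loading hyperparameters with Hydra.
# The config name ("config") and the fields below are hypothetical examples;
# config_path is resolved relative to the file containing this decorator.
import hydra
from omegaconf import DictConfig


@hydra.main(config_path="src/hparams", config_name="config")
def train(cfg: DictConfig) -> None:
    # Hyperparameters become attributes on the composed config object.
    print(f"learning rate: {cfg.lr}, batch size: {cfg.batch_size}")


if __name__ == "__main__":
    train()
```

Individual values can then be overridden from the command line, e.g. `python train.py lr=0.01`.
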
- Revisited the initial project description to check whether the project turned out as intended
- Made sure all group members have an understanding of all parts of the project
- Created a PowerPoint presentation explaining the project
- Uploaded all the code to GitHub

Project based on the cookiecutter data science project template. #cookiecutterdatascience