Merge pull request #209 from SebChw/develop

Release
SebChw · Feb 5, 2024 · df48f6f
2 parents 6f7af56 + db5a3e7
commit df48f6f
Showing 20 changed files with 882 additions and 221 deletions.
diff --git a/README.md b/README.md
@@ -1,73 +1,181 @@
 <p align="center"><img src="docs/art.png" alt="image" width="200" height="auto"></p>
 
-# ART - Actually Robust Training framework
+# ART - Actually Robust Training Framework
 
 ![Tests](https://github.com/SebChw/art/actions/workflows/tests.yml/badge.svg)
 ![Docs](
 https://readthedocs.org/projects/actually-robust-training/badge/?version=latest&style=flat)
 
 ----
 
-**ART** is a framework that teaches and keeps an eye on good practices when training deep neural networks. It is inspired by a [blog post by Andrej Karpathy “A Recipe for Training Neural Networks”](https://karpathy.github.io/2019/04/25/recipe/). The framework teaches the user how to properly train DNNs by encouraging the user to use built-in mechanisms that ensure the correctness and robustness of the pipeline using easily usable steps. It allows users not only to learn but also to use it in their future projects to speed up model development.
+**ART** is a Python library that teaches good practices when training deep neural networks with [PyTorch](https://pytorch.org/). It is inspired by Andrej Karpathy's blog post [“A Recipe for Training Neural Networks”](https://karpathy.github.io/2019/04/25/recipe/). ART encourages the user to train DNNs through a series of steps that ensure the correctness and robustness of their training pipeline. The steps implemented using ART can be viewed not only as guidance for early adepts of deep learning but also as a project template and checklist for more advanced users.
+
+To get the most out of ART, you should have a basic knowledge of (or eagerness to learn):
+- Python: https://www.learnpython.org/
+- Machine learning: https://www.coursera.org/learn/machine-learning
+- PyTorch: https://pytorch.org/tutorials/
+- PyTorch Lightning: https://lightning.ai/docs/pytorch/stable/levels/core_skills.html
 
 **Table of contents:**
 - [ART - Actually Robust Training framework](#art---actually-robust-training-framework)
   - [Installation](#installation)
+  - [Quickstart](#quickstart)
   - [Project creation](#project-creation)
   - [Dashboard](#dashboard)
   - [Tutorials](#tutorials)
-  - [Required knowledge](#required-knowledge)
+  - [API Cheatsheet](#api-cheatsheet)
   - [Contributing](#contributing)
 
 ## Installation
 To get started, install ART package using:
 ```sh
 pip install art-training
 ```
+
+## Quickstart
+
+1. The basic idea behind ART is to split your deep learning pipeline into a series of _steps_. 
+2. Each step should be accompanied by a set of _checks_. ART will not move to the next step without passing checks from previous steps.
+
+```python
+import math
+import torch.nn as nn
+from torchmetrics import Accuracy
+from art.checks import CheckScoreCloseTo, CheckScoreGreaterThan, CheckScoreLessThan
+from art.metrics import SkippedMetric
+from art.project import ArtProject
+from art.steps import CheckLossOnInit, Overfit, OverfitOneBatch
+from art.utils.quickstart import ArtModuleExample, LightningDataModuleExample
+
+# Initialize the datamodule, and indicate the model class
+datamodule = LightningDataModuleExample()
+model_class = ArtModuleExample
+
+# Define metrics and loss functions to be calculated within the project
+metric = Accuracy(task="multiclass", num_classes=datamodule.n_classes)
+loss_fn = nn.CrossEntropyLoss()
+
+# Create an ART project and register defined metrics
+project = ArtProject(name="quickstart", datamodule=datamodule)
+project.register_metrics([metric, loss_fn])
+
+# Add steps to the project
+EXPECTED_LOSS = -math.log(1 / datamodule.n_classes)
+project.add_step(
+    CheckLossOnInit(model_class),
+    checks=[CheckScoreCloseTo(loss_fn, EXPECTED_LOSS, rel_tol=0.01)],
+    skipped_metrics=[SkippedMetric(metric)],
+)
+project.add_step(
+    OverfitOneBatch(model_class, number_of_steps=100),
+    checks=[CheckScoreLessThan(loss_fn, 0.1)],
+    skipped_metrics=[SkippedMetric(metric)],
+)
+project.add_step(
+    Overfit(model_class, max_epochs=10),
+    checks=[CheckScoreGreaterThan(metric, 0.9)],
+)
+
+# Run your project
+project.run_all()
+```
+
+As a result, you should observe something like this:
+```
+    Check failed for step: Overfit. Reason: Score 0.7900000214576721 is not greater than 0.9
+    Summary:
+    Step: Check Loss On Init, Model: ArtModuleExample, Passed: True. Results:
+            CrossEntropyLoss-validate: 2.299098491668701
+    Step: Overfit One Batch, Model: ArtModuleExample, Passed: True. Results:
+            CrossEntropyLoss-train: 0.03459629788994789
+    Step: Overfit, Model: ArtModuleExample, Passed: False. Results:
+            MulticlassAccuracy-train: 0.7900000214576721
+            CrossEntropyLoss-train: 0.5287203788757324
+            MulticlassAccuracy-validate: 0.699999988079071
+            CrossEntropyLoss-validate: 0.8762148022651672
+```
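The `EXPECTED_LOSS` passed to the `CheckLossOnInit` check follows from an untrained classifier predicting roughly uniform probabilities, which gives a cross-entropy close to -ln(1/C) = ln(C) for C classes. A minimal sketch of that arithmetic, assuming 10 classes (the README does not state `n_classes`; 10 is inferred from the reported loss of about 2.30) and using `math.isclose` only as a stand-in for whatever comparison `CheckScoreCloseTo` actually performs:

```python
import math

# Assumption: 10 classes; the quickstart reads this from datamodule.n_classes.
n_classes = 10

# Cross-entropy of a uniform prediction over n_classes: -ln(1 / n_classes).
expected_loss = -math.log(1 / n_classes)

# Validation loss reported for the "Check Loss On Init" step in the output above.
observed_loss = 2.299098491668701

# Stand-in for the check: a relative tolerance of 1%, mirroring rel_tol=0.01.
passed = math.isclose(observed_loss, expected_loss, rel_tol=0.01)
print(f"expected={expected_loss:.4f} observed={observed_loss:.4f} passed={passed}")
```

An initial loss far from this value usually points to a problem in the output layer or the loss setup, which is why this check runs before any training.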
+
+Finally, track your progress with the dashboard:
+
+```sh
+python -m art.cli run-dashboard
+```
+
+<p align="center"><img src="docs/dashboard.png" alt="image"></p>
+
+In summary:
+- You still use **pure PyTorch and Lightning**.
+- You don't lose any **flexibility**.
+- You keep your experiments **organized**.
+- You follow **best practices**.
+- You make your model **easier to debug**.
+- You increase experiment **reproducibility**.
+
+If you want to use all of ART's features and create a new deep learning project that follows good practices, check out the [tutorials](#tutorials).
+
 ## Project creation
-To use most of art's features we encourage you to create a new folder for your project using the CLI tool:
+To get the most out of ART, we encourage you to create a new folder for your project using the CLI tool:
 ```sh
 python -m art.cli create-project my_project_name
 ```
 
-This will create a new folder `my_project` with a basic structure for your project. To learn more about ART we encourage you to read our [documentation](https://actually-robust-training.readthedocs.io/en/latest/), and check our [tutorials](#tutorials)!
+This will create a new folder called `my_project_name` with a basic structure for your project. To learn more about ART, we encourage you to read the [documentation](https://actually-robust-training.readthedocs.io/en/latest/) or go through the [tutorials](#tutorials)!
 
 ## Dashboard
-After you run some steps you can see compare their execution in the dashboard. To use the dashboard, firstly install required dependencies:
+After you run some steps, you can compare their execution in the dashboard. To use the dashboard, first install required dependencies:
 ```sh
 pip install art-training[dashboard]
 ```
-and run this command in the directory of your project (directory with folder called art_checkpoints).
+and run the following command in the directory of your project (the directory with a folder called art_checkpoints).
 ```sh
 python -m art.cli run-dashboard
 ```
-Optionally you can use --experiment-folder switch to pass path to the folder. For more info, use --help switch.
+Optionally you can use the `--experiment-folder` switch to pass the path to the folder. For more info, use the `--help` switch.
 
 ## Tutorials
-1. A showcase of ART's features. To check it out type:
+1. A showcase of ART's features. To check it out, type:
 ```sh
 python -m art.cli get-started
 ```
 and launch tutorial.ipynb
 
-After running all cells run dashboard with
-
+After running all cells, run the dashboard with:
 ```sh
 python -m art.cli run-dashboard
 ```
 
 2. A tutorial showing how to use ART for transfer learning in an NLP task.
 ```sh
 python -m art.cli bert-transfer-learning-tutorial
+
+```
+3. A tutorial showing how to use ART for regularization
+```sh
+python -m art.cli regularization-tutorial
 ```
 
+## API Cheatsheet
+- [**ArtModule**](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.core.ArtModule): While exploring different models, ART provides a unified way to define and train models. ART uses `ArtModule` that inherits from PyTorch Lightning's LightningModule. ArtModules are designed to be easily configurable and to support different model architectures.
+- [**Step**](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.steps.Step): A unit of work that takes you closer to your final goal: a good deep learning model. **The steps provided by ART were inspired by Andrej Karpathy's** [Recipe for Training Neural Networks](http://karpathy.github.io/2019/04/25/recipe/):
+    1. [EvaluateBaseline](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.steps.EvaluateBaseline) - Before starting a new project, it is good to know what you are competing against.
+    2. [CheckLossOnInit](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.steps.CheckLossOnInit) - Checking the loss right after network initialization is a very good debugging step.
+    3. [OverfitOneBatch](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.steps.OverfitOneBatch) - If you can't overfit a single batch, it is very unlikely that your network will work well.
+    4. [Overfit](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.steps.Overfit) - By reducing the target metric on the training set, you can observe the `theoretically` achievable minimum. If this value doesn't satisfy you, it is very unlikely that it will be better on the test set.
+    5. [Regularize](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.steps.Regularize) - Usually the gap between training and validation scores is quite big, and we need to introduce regularization techniques to achieve satisfactory validation accuracy.
+    6. [TransferLearning](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.steps.TransferLearning) - If you have a pre-trained model on a similar task, you can use it to initialize your network. In this step, you can perform two types of transfer learning:
+        - Freeze - Freeze all layers except the last one and train only the last layer.
+        - Finetune - Unfreeze all layers and train the whole network.
+- [**MetricCalculator**](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.metrics.MetricCalculator): To let you write less code, we implemented `MetricCalculator`, a special object that takes care of metric calculation across all steps. The only thing you have to do is `register` the metrics that you want to compute.
+- [**Check**](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.checks.Check): For every step you must pass a list of `Check` objects that must be fulfilled for the step to pass. You may encounter checks like [`CheckScoreExists`](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.checks.CheckScoreExists) or [`CheckScoreCloseTo`](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.checks.CheckScoreCloseTo). Every check takes at least 3 arguments:
+  - Metric: the metric whose value at the end of a step will be checked. This may be, for example, the `nn.CrossEntropyLoss` object.
+  - Stage: during different steps you'll be interested in performance on either training or validation, so you must pass this information too.
+  - Value: the target value for the check.
+- [**Art Project**](https://actually-robust-training.readthedocs.io/en/latest/apidocs/art.html#art.project.ArtProject): Every project consists of many steps. You can add them to `ArtProject` which is responsible for running them. `ArtProject` also saves metadata about your steps that you can later see in `Dashboard`. 
+- **Dataset**: Each Project is supposed to deal with one dataset. ART supports every LightningDataModule from PyTorch Lightning and every torch Dataset with its dataloader.
+- **Dashboard**: For every project you can run a dashboard that will show you the progress of your project. You can run it with the `python -m art.cli run-dashboard` command. You can also run it with the `--experiment-folder` switch to pass the path to the folder. For more info, use the `--help` switch.
 
-## Required knowledge
-In order to use ART, you should have a basic knowledge of:
-- Python - you can find many tutorials online, e.g. [here](https://www.learnpython.org/)
-- Basic knowledge of machine learning & neural networks - you can find many tutorials online, e.g. [here](https://www.coursera.org/learn/machine-learning)
-- PyTorch - you can find many tutorials online, e.g. [here](https://pytorch.org/tutorials/)
-- PyTorch Lightning - you can find many tutorials online, e.g. [here](https://lightning.ai/docs/pytorch/stable/levels/core_skills.html)
 
 ## Contributing
-We welcome contributions to ART! Please check out our [contributing guide](https://github.com/SebChw/art/wiki/Contributing)
+We welcome contributions to ART! Please check out our [contributing guide](https://github.com/SebChw/art/wiki/Contributing)
+
+
diff --git a/art/checks.py b/art/checks.py
@@ -1,7 +1,7 @@
 import math
 from abc import ABC, abstractmethod
 from dataclasses import dataclass, field
-from typing import List
+from typing import Any, List, Union
 
 
 @dataclass
@@ -121,13 +121,13 @@ class CheckScore(CheckResult):
     Base class for checking scores based on a specific metric.
 
     Attributes:
-        metric: An object used to calculate the metric.
+        metric: An object used to calculate the metric, or a string with the name of the metric.
         value (float): The expected value of the metric.
     """
 
     def __init__(
         self,
-        metric,  # This requires an object which was used to calculate metric
+        metric: Union[str, Any],
         value: float,
     ):
         self.metric = metric
@@ -158,7 +158,10 @@ def check(self, step) -> ResultOfCheck:
         """
         last_run = step.get_latest_run()
         result = last_run["scores"]
-        self.build_required_key(step, self.metric)
+        if isinstance(self.metric, str):
+            self.required_key = self.metric
+        else:
+            self.build_required_key(step, self.metric)
         return self._check_method(result)
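The change above lets a check accept either a metric object or a plain string that is used directly as the key into the run's scores. A minimal sketch of that branching follows. `build_required_key` is not shown in this diff, so the object branch below assumes a `<ClassName>-<stage>` key format, patterned on keys such as `CrossEntropyLoss-train` in the Quickstart output; `ScoreKeySketch`, `resolve_key`, and the placeholder loss class are hypothetical names for illustration only.

```python
from typing import Any, Union


class ScoreKeySketch:
    """Illustrative stand-in for the metric handling in CheckScore (not ART's class)."""

    def __init__(self, metric: Union[str, Any], value: float):
        self.metric = metric
        self.value = value

    def resolve_key(self, stage: str) -> str:
        # A string metric is used verbatim as the key into the run's scores.
        if isinstance(self.metric, str):
            return self.metric
        # Assumed key format for metric objects: "<ClassName>-<stage>".
        return f"{type(self.metric).__name__}-{stage}"


class CrossEntropyLoss:
    """Hypothetical placeholder so the sketch runs without PyTorch."""


scores = {"CrossEntropyLoss-train": 0.03}

by_name = ScoreKeySketch("CrossEntropyLoss-train", 0.1)
by_object = ScoreKeySketch(CrossEntropyLoss(), 0.1)

# Both forms resolve to the same entry in the scores dictionary.
print(scores[by_name.resolve_key("train")] < by_name.value)
print(scores[by_object.resolve_key("train")] < by_object.value)
```

Passing a string is convenient when a score was logged under a custom name that cannot be derived from a metric object.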
 
 

diff --git a/art/cli/main.py b/art/cli/main.py
@@ -78,8 +78,8 @@ def create_project(
 
 @app.command()
 def get_started():
-    """Create a project named 'mnist_tutorial' using the 'mnist_tutorial_cookiecutter' branch."""
-    create_project(project_name="mnist_tutorial", branch="mnist_tutorial_cookiecutter")
+    """Create a project named 'mnist_tutorial' using the 'mnist_tutorial' branch."""
+    create_project(project_name="mnist_tutorial", branch="mnist_tutorial")
 
 
 @app.command()
@@ -97,5 +97,14 @@ def bert_transfer_learning_tutorial():
     )
 
 
+@app.command()
+def regularization_tutorial():
+    """Create a project named 'regularize_tutorial' using the 'regularize_tutorial' branch."""
+    create_project(
+        project_name="regularize_tutorial",
+        branch="regularize_tutorial",
+    )
+
+
 if __name__ == "__main__":
     app()
diff --git a/art/core.py b/art/core.py
@@ -8,7 +8,7 @@
 from torch.utils.data import DataLoader
 
 from art.metrics import MetricCalculator
-from art.utils.enums import LOSS, PREDICTION, TARGET
+from art.utils.enums import LOSS, PREDICTION, TARGET, TrainingStage
 
 
 class ArtModule(L.LightningModule, ABC):
@@ -17,7 +17,8 @@ def __init__(
     ):
         super().__init__()
         self.regularized = True
-        self.reset_pipelines()
+        self.set_pipelines()
+        self.stage: TrainingStage = TrainingStage.TRAIN
 
     """
     A module for managing the training process and application of various model configurations.
@@ -42,7 +43,7 @@ def check_setup(self):
         if not hasattr(self, "metric_calculator"):
             raise ValueError("You need to set metric calculator first!")
 
-    def reset_pipelines(self):
+    def set_pipelines(self):
         """
         Reset pipelines for training, validation, and testing.
         """
@@ -60,40 +61,6 @@ def reset_pipelines(self):
         ]
         self.ml_train_pipeline = [self.ml_parse_data, self.baseline_train]
 
-    def turn_on_model_regularizations(self):
-        """
-        Turn on model regularizations.
-        """
-        if not self.regularized:
-            for param in self.parameters():
-                name, obj = param
-                if isinstance(obj, torch.nn.Dropout):
-                    obj.p = self.unregularized_params[name]
-
-            self.configure_optimizers = self.original_configure_optimizers
-
-            self.regularized = True
-
-    def turn_off_model_reguralizations(self):
-        """
-        Turn off model regularizations.
-        """
-        if self.regularized:
-            self.unregularized_params = {}
-            for param in self.parameters():
-                name, obj = param
-                if isinstance(obj, torch.nn.Dropout):
-                    self.unregularized_params[name] = obj.p
-                    obj.p = 0
-
-            # Simple Adam, no fancy optimizers at this stage
-            self.original_configure_optimizers = self.configure_optimizers
-            self.configure_optimizers = lambda self: torch.optim.Adam(
-                self.parameters(), lr=3e-4
-            )
-
-            self.regularized = False
-
     def parse_data(self, data: Dict):
         """
         Parse data.
@@ -153,6 +120,7 @@ def validation_step(
             batch (Union[Dict[str, Any], DataLoader, torch.Tensor]): Batch to validate.
             batch_idx (int): Batch index.
         """
+        self.stage = TrainingStage.VALIDATION
         data = {"batch": batch, "batch_idx": batch_idx}
         for func in self.validation_step_pipeline:
             data = func(data)
@@ -170,6 +138,7 @@ def training_step(
         Returns:
             Dict: Data with loss.
         """
+        self.stage = TrainingStage.TRAIN
         data = {"batch": batch, "batch_idx": batch_idx}
         for func in self.train_step_pipeline:
             data = func(data)
@@ -186,6 +155,7 @@ def test_step(
             batch (Union[Dict[str, Any], DataLoader, torch.Tensor]): Batch to test.
             batch_idx (int): Batch index.
         """
+        self.stage = TrainingStage.TEST
         data = {"batch": batch, "batch_idx": batch_idx}
         for func in self.validation_step_pipeline:
             data = func(data)
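The three hunks above make each step method record which stage is running before its pipeline executes, so any function in the pipeline can read `self.stage`. A minimal sketch of that pattern without Lightning follows. The member names of `TrainingStage` come from the diff, but their string values are assumptions, and `StageTrackingSketch`, `record_stage`, and `_run` are hypothetical names used only for illustration.

```python
from enum import Enum
from typing import Any, Callable, Dict, List


class TrainingStage(Enum):
    # Member names match the diff; the string values are assumed.
    TRAIN = "train"
    VALIDATION = "validate"
    TEST = "test"


class StageTrackingSketch:
    """Stand-in for ArtModule that only demonstrates stage tracking."""

    def __init__(self) -> None:
        self.stage: TrainingStage = TrainingStage.TRAIN
        self.seen: List[str] = []
        self.train_step_pipeline = [self.record_stage]
        self.validation_step_pipeline = [self.record_stage]

    def record_stage(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # A pipeline function can branch on the current stage, e.g. for metric names.
        self.seen.append(self.stage.value)
        return data

    def _run(self, pipeline: List[Callable], batch: Any, batch_idx: int) -> Dict[str, Any]:
        data = {"batch": batch, "batch_idx": batch_idx}
        for func in pipeline:
            data = func(data)
        return data

    def training_step(self, batch: Any, batch_idx: int) -> Dict[str, Any]:
        self.stage = TrainingStage.TRAIN
        return self._run(self.train_step_pipeline, batch, batch_idx)

    def validation_step(self, batch: Any, batch_idx: int) -> Dict[str, Any]:
        self.stage = TrainingStage.VALIDATION
        return self._run(self.validation_step_pipeline, batch, batch_idx)

    def test_step(self, batch: Any, batch_idx: int) -> Dict[str, Any]:
        # As in the diff, testing reuses the validation pipeline.
        self.stage = TrainingStage.TEST
        return self._run(self.validation_step_pipeline, batch, batch_idx)


module = StageTrackingSketch()
module.training_step(batch=[1, 2], batch_idx=0)
module.validation_step(batch=[3, 4], batch_idx=0)
module.test_step(batch=[5, 6], batch_idx=0)
print(module.seen)
```

Because the test step shares the validation pipeline, the stage attribute is the only thing that lets a shared pipeline function tell the two apart.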

diff --git a/art/dashboard/backend.py b/art/dashboard/backend.py
@@ -30,6 +30,11 @@ def prepare_steps_info(logs_path: Path) -> Dict[str, Dict]:
             step_name = step_info["name"]
             step_model = step_info["model"]
             for run in step_info["runs"]:
+                if "regularize" in run["parameters"]:
+                    run["parameters"]["regularize"] = stringify_regularize(
+                        run["parameters"]["regularize"]
+                    )
+
                 new_sample = {
                     "model": step_model,
                     **run["scores"],
@@ -53,6 +58,30 @@ def prepare_steps_info(logs_path: Path) -> Dict[str, Dict]:
     return steps_info
 
 
+def stringify_regularize(regularize: Dict) -> str:
+    """Since the regularize field contains lists, it must be handled with special care.
+
+    Args:
+        regularize (Dict): regularize field from results.json
+
+    Returns:
+        str: stringified version of regularize field
+    """
+    parameters = []
+    for key, value in regularize.items():
+        if key in ["model_modifiers", "datamodule_modifiers"]:
+            continue
+        parameters.append(f"{key}={value}")
+    representation = ""
+    if parameters:
+        representation += f"model-kwargs={' '.join(parameters)} |"
+    if regularize["model_modifiers"]:
+        representation += f"model-modifiers={regularize['model_modifiers']} |"
+    if regularize["datamodule_modifiers"]:
+        representation += f"datamodule-modifiers={regularize['datamodule_modifiers']}"
+    return representation
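To illustrate what the dashboard receives, here is the function body from the hunk above run against a sample input. The sample dictionary is hypothetical (the `dropout` key and the `freeze_backbone` modifier name are made up for illustration). Both `model_modifiers` and `datamodule_modifiers` are assumed to be present, since the function indexes them directly and would raise `KeyError` otherwise.

```python
from typing import Dict


def stringify_regularize(regularize: Dict) -> str:
    """Copy of the function added in this diff, reproduced for illustration."""
    parameters = []
    for key, value in regularize.items():
        if key in ["model_modifiers", "datamodule_modifiers"]:
            continue
        parameters.append(f"{key}={value}")
    representation = ""
    if parameters:
        representation += f"model-kwargs={' '.join(parameters)} |"
    if regularize["model_modifiers"]:
        representation += f"model-modifiers={regularize['model_modifiers']} |"
    if regularize["datamodule_modifiers"]:
        representation += f"datamodule-modifiers={regularize['datamodule_modifiers']}"
    return representation


# Hypothetical "regularize" entry, shaped like the field the function expects.
example = {
    "dropout": 0.5,
    "model_modifiers": ["freeze_backbone"],
    "datamodule_modifiers": [],
}

print(stringify_regularize(example))
# prints: model-kwargs=dropout=0.5 |model-modifiers=['freeze_backbone'] |
```

Note that an empty modifier list is skipped entirely, so the string can end with a trailing ` |` separator when no datamodule modifiers are present.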
+
+
 def prepare_steps():
     return [
         "Data analysis",