Release May 2022 (#26)
* Adds GitHub Actions workflow for automated tests
* Implements LR schedulers (Closes #16 and #17)
* Supports Python 3.7 (Closes #22)
* Fixes various bugs
gallego-posada authored May 6, 2022
1 parent f691bf7 commit 5cd18c8
Showing 21 changed files with 300 additions and 69 deletions.
9 changes: 4 additions & 5 deletions .github/workflows/build.yml
@@ -13,7 +13,7 @@ jobs:
strategy:
matrix:
platform: [ubuntu-latest, macos-latest, windows-latest]
python-version: ['3.8', '3.9']
python-version: ["3.7", "3.8", "3.9"]
# No gpu workflow yet!: https://github.com/apache/singa/issues/802

steps:
@@ -25,10 +25,9 @@ jobs:

- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install tox tox-gh-actions
python -m pip install --upgrade pip
python -m pip install tox tox-gh-actions
- name: Test with tox
run: tox
env:
PLATFORM: ${{ matrix.platform }}
PLATFORM: ${{ matrix.platform }}
10 changes: 10 additions & 0 deletions CITATION.cff
@@ -0,0 +1,10 @@
cff-version: 1.2.0
message: "If you use this software, please consider citing it as indicated below."
authors:
- family-names: "Gallego-Posada"
given-names: "Jose"
- family-names: "Ramirez"
given-names: "Juan"
title: "Cooper: a toolkit for Lagrangian-based constrained optimization"
date-released: 2022-03-15
url: "https://github.com/cooper-org/cooper"
15 changes: 7 additions & 8 deletions README.md
@@ -11,14 +11,13 @@
This library aims to encourage and facilitate the study of constrained
optimization problems in machine learning.


**Cooper** is (almost!) seamlessly integrated with Pytorch and preserves the
usual `loss -> backward -> step` workflow. If you are already familiar with
Pytorch, using **Cooper** will be a breeze! 🙂

**Cooper** was born out of the need to handle constrained optimization problems
for which the loss or constraints are not necessarily "nicely behaved"
or "theoretically tractable", e.g. when no (efficient) projection or proximal
or "theoretically tractable", e.g. when no (efficient) projection or proximal
are available. Although assumptions of this kind have enabled the development of
great Pytorch-based libraries such as [CHOP](https://github.com/openopt/chop)
and [GeoTorch](https://github.com/Lezcano/geotorch), they are seldom satisfied
@@ -35,7 +34,7 @@ compatibility. ⚠️
## Getting Started

Here we consider a simple convex optimization problem to illustrate how to use
**Cooper**. This example is inspired by [this StackExchange question](https://datascience.stackexchange.com/questions/107366/how-do-you-solve-strictly-constrained-optimization-problems-with-pytorch):
**Cooper**. This example is inspired by [this StackExchange question](https://datascience.stackexchange.com/questions/107366/how-do-you-solve-strictly-constrained-optimization-problems-with-pytorch):

> _I am trying to solve the following problem using Pytorch: given a 6-sided die
> whose average roll is known to be 4.5, what is the maximum entropy
@@ -76,7 +75,7 @@ primal_optimizer = cooper.optim.ExtraSGD([probs], lr=3e-2, momentum=0.7)

# Define the dual optimizer. Note that this optimizer has NOT been fully instantiated
# yet. Cooper takes care of this, once it has initialized the formulation state.
dual_optimizer = cooper.optim.partial(cooper.optim.ExtraSGD, lr=9e-3, momentum=0.7)
dual_optimizer = cooper.optim.partial_optimizer(cooper.optim.ExtraSGD, lr=9e-3, momentum=0.7)

# Wrap the formulation and both optimizers inside a ConstrainedOptimizer
coop = cooper.ConstrainedOptimizer(formulation, primal_optimizer, dual_optimizer)
@@ -91,6 +90,7 @@ for iter_num in range(5000):
```
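
The body of the training loop is collapsed in this diff view. As a rough sketch of what one iteration looks like with a Lagrangian formulation (the helper names `composite_objective` and `custom_backward`, and the `coop.zero_grad()` call, are assumptions here and may not match the README verbatim):

```python
for iter_num in range(5000):
    coop.zero_grad()
    # Assumed API: build the Lagrangian via the CMP closure, backpropagate through it,
    # then let the ConstrainedOptimizer update the primal and dual variables.
    lagrangian = formulation.composite_objective(cmp.closure, probs)
    formulation.custom_backward(lagrangian)
    coop.step(cmp.closure, probs)
```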

## Installation

### Basic Installation

```bash
@@ -103,13 +103,12 @@ First, clone the [repository](https://github.com/cooper-org/cooper), navigate
to the **Cooper** root directory and install the package in development mode by running:

| Setting | Command | Notes |
|-------------|------------------------------------------|-------------------------------------------|
| ----------- | ---------------------------------------- | ----------------------------------------- |
| Development | `pip install --editable ".[dev, tests]"` | Editable mode. Matches test environment. |
| Docs | `pip install --editable ".[docs]"` | Used to re-generate the documentation. |
| Tutorials | `pip install --editable ".[examples]"` | Install dependencies for running examples |
| No Tests | `pip install --editable .` | Editable mode, without tests. |


## Package structure

- `cooper` - base package
@@ -118,8 +117,8 @@ to the **Cooper** root directory and install the package in development mode by
- `lagrangian_formulation` - Lagrangian formulation of a CMP
- `multipliers` - utility class for Lagrange multipliers
- `optim` - aliases for Pytorch optimizers and [extra-gradient versions](https://github.com/GauthierGidel/Variational-Inequality-GAN/blob/master/optim/extragradient.py) of SGD and Adam
- `tests` - unit tests for `cooper` components
- `tutorials` - source code for examples contained in the tutorial gallery
- `tests` - unit tests for `cooper` components
- `tutorials` - source code for examples contained in the tutorial gallery

## Contributions

7 changes: 6 additions & 1 deletion cooper/__init__.py
@@ -1,6 +1,11 @@
"""Top-level package for Constrained Optimization in Pytorch."""

from importlib.metadata import PackageNotFoundError, version
import sys

if sys.version_info >= (3, 8):
from importlib.metadata import PackageNotFoundError, version
else:
from importlib_metadata import PackageNotFoundError, version

try:
__version__ = version("cooper")
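Supporting Python 3.7 means the `importlib_metadata` backport has to be installable alongside the package on that version. A hypothetical packaging fragment (sketch only; the project's actual setup configuration may differ) that declares it conditionally:

```python
# Hypothetical setup.py fragment -- not the project's actual packaging configuration.
from setuptools import setup

setup(
    name="cooper",
    install_requires=[
        # Backport of importlib.metadata, needed only on the newly supported Python 3.7
        'importlib_metadata; python_version < "3.8"',
    ],
)
```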
46 changes: 45 additions & 1 deletion cooper/constrained_optimizer.py
@@ -42,6 +42,12 @@ class ConstrainedOptimizer:
Defaults to None.
When dealing with an unconstrained problem, should be set to None.
dual_scheduler: Partially instantiated
``torch.optim.lr_scheduler._LRScheduler``
used to schedule the learning rate of the dual variables.
Defaults to None.
When dealing with an unconstrained problem, should be set to None.
alternating: Whether to alternate parameter updates between primal and
dual parameters. Otherwise, do simultaneous parameter updates.
Defaults to False.
@@ -58,13 +64,15 @@ def __init__(
formulation: Formulation,
primal_optimizer: torch.optim.Optimizer,
dual_optimizer: Optional[torch.optim.Optimizer] = None,
dual_scheduler: Optional[torch.optim.lr_scheduler._LRScheduler] = None,
alternating: bool = False,
dual_restarts: bool = False,
):
self.formulation = formulation
self.cmp = self.formulation.cmp
self.primal_optimizer = primal_optimizer
self.dual_optimizer = dual_optimizer
self.dual_scheduler = dual_scheduler

self.alternating = alternating
self.dual_restarts = dual_restarts
@@ -86,6 +94,13 @@ def sanity_checks(self):
RuntimeError: a ``dual_optimizer`` was provided but the
``ConstrainedMinimizationProblem`` of formulation was
unconstrained. There are no dual variables to optimize.
RuntimeError: a ``dual_scheduler`` was provided but the
``ConstrainedMinimizationProblem`` of formulation was
unconstrained. There are no dual variables and no
``dual_optimizer`` for learning rate scheduling.
RuntimeError: a ``dual_scheduler`` was provided but no
``dual_optimizer`` was provided. Can not schedule the learning
rate of an unknown optimizer.
RuntimeError: the considered ``ConstrainedMinimizationProblem`` is
unconstrained, but the provided ``primal_optimizer`` has an
``extrapolation`` function. This is not supported because of
@@ -125,6 +140,19 @@ def sanity_checks(self):
be unconstrained."""
)

if self.dual_scheduler is not None:
if not (self.cmp.is_constrained):
raise RuntimeError(
"""A dual scheduler was provided, but the `Problem` class
claims to be unconstrained."""
)

if self.dual_optimizer is None:
raise RuntimeError(
"""A dual scheduler was provided, but no dual optimizer
was provided."""
)

if not (self.cmp.is_constrained) and self.is_extrapolation:
raise RuntimeError(
"""Using an extrapolating optimizer an unconstrained problem
@@ -148,7 +176,8 @@ def step(
):
"""
Performs a single optimization step on both the primal and dual
variables.
variables. If ``dual_scheduler`` is provided, a scheduler step is
performed on the learning rate of the ``dual_optimizer``.
Args:
closure: Closure ``Callable`` required for re-evaluating the
@@ -168,6 +197,11 @@ def step(
# Checks if needed and instantiates dual_optimizer
self.dual_optimizer = self.dual_optimizer(self.formulation.dual_parameters)

if self.dual_scheduler is not None:
assert callable(self.dual_scheduler), "dual_scheduler must be callable"
# Instantiates the dual_scheduler
self.dual_scheduler = self.dual_scheduler(self.dual_optimizer)

if self.is_extrapolation or self.alternating:
assert closure is not None

@@ -198,6 +232,13 @@ def step(
if self.cmp.is_constrained:
self.dual_step()

if self.dual_scheduler is not None:
# Do a step on the dual scheduler after the actual step on
# the dual parameters. Intermediate updates that take
# place inside the extrapolation process do not perform a
# call to the scheduler's step method
self.dual_scheduler.step()

else:

self.primal_optimizer.step()
@@ -228,6 +269,9 @@ def step(
)

self.dual_step()

if self.dual_scheduler is not None:
self.dual_scheduler.step()

def dual_step(self, call_extrapolation=False):

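Taken together with the partial-instantiation helpers in `cooper.optim`, the new `dual_scheduler` argument can be wired up as follows. This is a minimal sketch, assuming a constrained CMP and a Lagrangian formulation as in the documentation examples; `ExponentialLR` stands in for any `_LRScheduler` subclass and the model is hypothetical:

```python
import torch
import cooper

# Placeholder CMP and formulation, mirroring the docs examples.
cmp = cooper.ConstrainedMinimizationProblem(is_constrained=True)
formulation = cooper.LagrangianFormulation(cmp)

model = torch.nn.Linear(10, 1)  # hypothetical model
primal_optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# The dual optimizer and scheduler are only partially instantiated; Cooper completes
# them once the formulation's dual variables have been created.
dual_optimizer = cooper.optim.partial_optimizer(torch.optim.SGD, lr=1e-3)
dual_scheduler = cooper.optim.partial_scheduler(
    torch.optim.lr_scheduler.ExponentialLR, gamma=0.99
)

coop = cooper.ConstrainedOptimizer(
    formulation=formulation,
    primal_optimizer=primal_optimizer,
    dual_optimizer=dual_optimizer,
    dual_scheduler=dual_scheduler,
)
```

The scheduler step is taken inside `ConstrainedOptimizer.step`, after the dual parameters are updated, so no extra bookkeeping is needed in the training loop.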
2 changes: 1 addition & 1 deletion cooper/lagrangian_formulation.py
@@ -165,7 +165,7 @@ def weighted_violation(
if not has_defect:
# We should always have at least the regular defects, if not, then
# the problem instance does not have `constraint_type` constraints
proxy_violation = torch.tensor([0.0])
proxy_violation = torch.tensor([0.0], device=cmp_state.loss.device)
else:
multipliers = getattr(self, constraint_type + "_multipliers")()

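The `device=` argument added above fixes a device-mismatch bug: a tensor created without an explicit device lives on the CPU, so combining it with a CUDA-resident loss raises a runtime error. A minimal illustration of the failure mode (assuming a CUDA device is available):

```python
import torch

if torch.cuda.is_available():
    loss = torch.tensor(1.0, device="cuda")
    # Before the fix: a CPU tensor, so `loss + proxy_violation` raises a RuntimeError.
    # proxy_violation = torch.tensor([0.0])
    # After the fix: allocated on the same device as the loss.
    proxy_violation = torch.tensor([0.0], device=loss.device)
    total = loss + proxy_violation  # both tensors now live on the same device
```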
34 changes: 25 additions & 9 deletions cooper/optim.py
@@ -1,20 +1,17 @@
"""(Extrapolation) Optimizer aliases"""
"""Extrapolation Optimizers and functions for partial instantiation of dual
optimizers and schedulers"""

import functools
import math
from collections.abc import Iterable
from typing import Callable, List, Tuple, Type, no_type_check

import torch

# Define aliases
SGD = torch.optim.SGD
Adam = torch.optim.Adam
Adagrad = torch.optim.Adagrad
RMSprop = torch.optim.RMSprop
from torch.optim.lr_scheduler import _LRScheduler


@no_type_check
def partial(optim_cls: Type[torch.optim.Optimizer], **optim_kwargs):
def partial_optimizer(optim_cls: Type[torch.optim.Optimizer], **optim_kwargs):
"""
Partially instantiates an optimizer class. This approach is preferred over
:py:func:`functools.partial` since the returned value is an optimizer
@@ -32,6 +29,25 @@ class PartialOptimizer(optim_cls):
return PartialOptimizer


@no_type_check
def partial_scheduler(scheduler_cls: Type[_LRScheduler], **scheduler_kwargs):
"""
Partially instantiates a learning rate scheduler class. This approach is
preferred over :py:func:`functools.partial` since the returned value is a
scheduler class whose attributes can be inspected and which can be further
instantiated.
Args:
scheduler_cls: Pytorch scheduler class to be partially instantiated.
**scheduler_kwargs: Keyword arguments for scheduler hyperparameters.
"""

class PartialScheduler(scheduler_cls):
__init__ = functools.partialmethod(scheduler_cls.__init__, **scheduler_kwargs)

return PartialScheduler


# -----------------------------------------------------------------------------
# Implementation of ExtraOptimizers contains minor edits on source code from:
# https://github.com/GauthierGidel/Variational-Inequality-GAN/blob/master/optim/extragradient.py
@@ -197,7 +213,7 @@ def __init__(
super(ExtraSGD, self).__init__(params, defaults)

def __setstate__(self, state):
super(SGD, self).__setstate__(state)
super(torch.optim.SGD, self).__setstate__(state)
for group in self.param_groups:
group.setdefault("nesterov", False)

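As the `partial_optimizer` docstring above points out, the class-based approach keeps the result a genuine optimizer class, so its hyperparameters can be inspected and it can still be instantiated later, unlike a plain `functools.partial` object. A short usage sketch:

```python
import torch
import cooper

# Fix the hyperparameters now; supply the parameters later.
PartialAdam = cooper.optim.partial_optimizer(torch.optim.Adam, lr=1e-3, betas=(0.9, 0.99))

# The result is still an optimizer class ...
assert issubclass(PartialAdam, torch.optim.Adam)

# ... and can be fully instantiated once the (e.g. dual) parameters exist.
params = [torch.zeros(3, requires_grad=True)]
optimizer = PartialAdam(params)
```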
10 changes: 5 additions & 5 deletions docs/source/constrained_optimizer.rst
@@ -63,7 +63,7 @@ the definition of a CMP can be found under the entry for :ref:`cmp`.
cmp = cooper.ConstrainedMinimizationProblem(is_constrained=False)
formulation = cooper.problem.Formulation(...)
primal_optimizer = cooper.optim.Adam(model.parameters(), lr=1e-2)
primal_optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
constrained_optimizer = cooper.ConstrainedOptimizer(
formulation=formulation,
@@ -80,9 +80,9 @@ the definition of a CMP can be found under the entry for :ref:`cmp`.
cmp = cooper.ConstrainedMinimizationProblem(is_constrained=True)
formulation = cooper.problem.Formulation(...)
primal_optimizer = cooper.optim.Adam(model.parameters(), lr=1e-2)
primal_optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
# Note that dual_optimizer is "partly instantiated", *without* parameters
dual_optimizer = cooper.optim.partial(cooper.optim.SGD, lr=1e-3, momentum=0.9)
dual_optimizer = cooper.optim.partial_optimizer(torch.optim.SGD, lr=1e-3, momentum=0.9)
constrained_optimizer = cooper.ConstrainedOptimizer(
formulation=formulation,
@@ -122,9 +122,9 @@ Example
cmp = cooper.ConstrainedMinimizationProblem(...)
formulation = cooper.LagrangianFormulation(...)
primal_optimizer = cooper.optim.SGD(model.parameters(), lr=primal_lr)
primal_optimizer = torch.optim.SGD(model.parameters(), lr=primal_lr)
# Note that dual_optimizer is "partly instantiated", *without* parameters
dual_optimizer = cooper.optim.partial(cooper.optim.SGD, lr=primal_lr)
dual_optimizer = cooper.optim.partial_optimizer(torch.optim.SGD, lr=primal_lr)
constrained_optimizer = cooper.ConstrainedOptimizer(
formulation=formulation,