Support PyTorch 2.0.0 #559
Conversation
PyTorch 2.0 is not supported yet; its support will be added in #559 (there are multiple issues to resolve). Until then, we need to require `torch<2.0.0` (otherwise 2.0.0 is installed, so CI doesn't work right now). This PR also adds Python 3.10 to the CI.
I've rebased this PR onto the master branch after #558 was merged.
@@ -1,5 +1,5 @@
 PyYAML
-torch>=1.9.0,<2.0.0
+torch>=1.9.0
We'll have torch 1.13.0 on Python 3.7, torch 2.0.0 on Python 3.8-3.10. This allows us to test both torch 1.13.0 and 2.0.0.
We can drop Python 3.7 and PyTorch < 2.0 support when necessary.
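For illustration, one way a test suite can exploit such a dual-version matrix is a version-gated skip marker. This is a minimal sketch assuming `pytest` and `packaging` are available; `requires_torch2` and the test name are hypothetical, not part of this PR:

```python
import pytest
import torch
from packaging.version import Version

# Skip marker for tests that exercise behavior introduced in torch>=2.0;
# on the Python 3.7 / torch 1.13.0 CI job, these tests are skipped.
requires_torch2 = pytest.mark.skipif(
    Version(torch.__version__).major < 2,
    reason="requires torch>=2.0",
)

@requires_torch2
def test_torch2_only_behavior():  # hypothetical test
    ...
```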
hivemind/optim/optimizer.py
Outdated
-    def zero_grad(self, set_to_none: bool = False):
+_SET_TO_NONE_DEFAULT = Version(torch.__version__).major >= 2
+
+    def zero_grad(self, set_to_none: bool = _SET_TO_NONE_DEFAULT):
So it's consistent with the torch-wide default.
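For context, a minimal sketch of how this version-gated default behaves; the class below is an illustrative stand-in, not hivemind's actual `Optimizer`:

```python
import torch
from packaging.version import Version

# torch 2.0 changed zero_grad's default to set_to_none=True;
# mirroring it keeps hivemind consistent with the torch-wide default.
_SET_TO_NONE_DEFAULT = Version(torch.__version__).major >= 2

class Optimizer:
    """Illustrative stand-in for hivemind's optimizer."""

    def __init__(self, params):
        self.param_groups = [{"params": list(params)}]

    def zero_grad(self, set_to_none: bool = _SET_TO_NONE_DEFAULT):
        for group in self.param_groups:
            for param in group["params"]:
                if set_to_none:
                    param.grad = None  # grads become None, as in torch>=2.0
                elif param.grad is not None:
                    param.grad.zero_()  # zero in place, the torch<2.0 default
```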
LGTM
tests/test_optimizer.py
Outdated
assert (model1.weight.grad is None or torch.all(model1.weight.grad == 0)) and (
    model2.weight.grad is None or torch.all(model2.weight.grad == 0)
), "zero grad did not trigger"
Can we also use `Version(torch.__version__).major >= 2` here (or at least import `_SET_TO_NONE_DEFAULT`)?
This way, the test won't mask a bug where the grads are always set to None regardless of the PyTorch version.
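A sketch of the version-aware check being suggested, assuming `_SET_TO_NONE_DEFAULT` is importable from `hivemind.optim.optimizer` (this PR's code); the `nn.Linear` models stand in for the test's actual modules:

```python
import torch
from torch import nn
from hivemind.optim.optimizer import _SET_TO_NONE_DEFAULT

model1, model2 = nn.Linear(2, 2), nn.Linear(2, 2)  # stand-ins for the test's models
# ... run a training step and zero_grad() as in the test ...

for model in (model1, model2):
    if _SET_TO_NONE_DEFAULT:
        # torch>=2.0: zero_grad(set_to_none=True) must remove the grads entirely
        assert model.weight.grad is None, "zero grad did not trigger"
    else:
        # torch<2.0: grads are kept but must be zeroed in place
        assert model.weight.grad is not None and torch.all(
            model.weight.grad == 0
        ), "zero grad did not trigger"
```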
Codecov Report
@@            Coverage Diff             @@
##           master     #559      +/-   ##
==========================================
- Coverage   76.01%   75.92%     -0.09%
==========================================
  Files          81       81
  Lines        7995     8008        +13
==========================================
+ Hits         6077     6080         +3
- Misses       1918     1928        +10
- Fix LRSchedulerBase
- Handle None after .zero_grad() in torch 2.0.0
- Use set_to_none=True by default in torch>=2.0
- Add set_to_none param to TrainingStateAverager.step()

Co-authored-by: Aleksandr Borzunov <[email protected]>
(cherry picked from commit 98531ce)
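To illustrate the "Handle None after .zero_grad()" item above: in torch>=2.0, `.grad` is `None` after `zero_grad()`, so code that reads gradients needs a fallback. A hypothetical helper (not hivemind's actual code):

```python
import torch

def grad_or_zeros(param: torch.nn.Parameter) -> torch.Tensor:
    """Return the param's gradient, or an explicit zero tensor when
    torch>=2.0's zero_grad(set_to_none=True) has left it as None."""
    return param.grad if param.grad is not None else torch.zeros_like(param)
```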