Feature/eval intervals #323

KaleabTessera · 2021-10-18T12:21:05Z

What?

Added eval interval so that evaluation loops can run at certain intervals.
Minor changes for new version of mypy.

Why?

To be able to schedule evaluator runs.

How?

Added an extra condition to the environment loop that checks whether an eval interval is set.
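
A minimal sketch of what such an interval check could look like. The function name, the `{count_key: step}` shape of the interval, and the `last_eval_count` bookkeeping are assumptions for illustration and are not taken from the PR's code.

```python
from typing import Dict, Optional


def should_run_evaluation(
    counts: Dict[str, int],
    interval: Optional[Dict[str, int]],
    last_eval_count: int,
) -> bool:
    """Return True when the evaluator should run its next episode.

    Hypothetical helper: `interval` is assumed to be a single
    {count_key: step} pair, e.g. {"executor_episodes": 10}.
    """
    # No interval configured: evaluate on every loop iteration.
    if not interval:
        return True

    # Read the one tracked count and the step size from the interval.
    key, step = next(iter(interval.items()))
    current = counts.get(key, 0)

    # Evaluate once the count has advanced by at least `step`
    # since the last evaluation.
    return current - last_eval_count >= step
```

With an interval of `{"executor_episodes": 10}`, the evaluator in this sketch would skip evaluation until the executors have completed ten more episodes than at the previous evaluation.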

Extra

@DriesSmit @arnupretorius Most of the file changes were mypy changes (for the new version of mypy), so you can focus on reviewing the environment loop code.
Also note this is a PR into mava-scaling and not into develop.
I ran all tests and checks locally and they pass 👍

arnupretorius

Thanks @KaleabTessera! Going to be super useful for comparisons. 🔥

Just left a few minor suggestions.

arnupretorius · 2021-11-18T12:28:11Z

mava/environment_loop.py

+            counts = self._counter.get_counts()
+        return counts
+
+    def record_counts(self, episode_steps: int) -> Any:


I think we should make the function return type int.

So I updated this. It returns a counting.Counter object.

arnupretorius · 2021-11-18T12:28:27Z

mava/environment_loop.py

@@ -338,6 +348,36 @@ def _compute_episode_statistics(
    ) -> None:
        pass

+    def get_counts(self) -> Any:


Same as comment below.

Updated to counting.Counter.

arnupretorius · 2021-11-18T14:10:29Z

mava/environment_loop.py

@@ -338,6 +348,36 @@ def _compute_episode_statistics(
    ) -> None:
        pass

+    def get_counts(self) -> Any:
+        if hasattr(self._executor, "_counts"):


I didn't see this being used anywhere? I.e. executor examples that have this attribute. What is the use case for this?

So this is for systems that haven't been moved to mava scaling yet. These systems use _counts.
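
A small sketch of the fallback being discussed, assuming a legacy executor keeps its counts in a `_counts` attribute while newer systems go through a counter object. The stand-in classes and the free-function form are hypothetical and only serve to make the example runnable.

```python
from typing import Any, Dict


class LegacyExecutor:
    """Stand-in for a system not yet moved to mava scaling."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {"executor_steps": 7}


class ScaledExecutor:
    """Stand-in for a system without a `_counts` attribute."""


class DictCounter:
    """Hypothetical counter exposing a `get_counts` method."""

    def __init__(self, counts: Dict[str, int]) -> None:
        self._stored = dict(counts)

    def get_counts(self) -> Dict[str, int]:
        return dict(self._stored)


def get_counts(executor: Any, counter: DictCounter) -> Dict[str, int]:
    # Legacy systems: read the counts stored on the executor itself.
    if hasattr(executor, "_counts"):
        return executor._counts
    # Otherwise fall back to the loop's own counter.
    return counter.get_counts()
```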

DriesSmit · 2021-11-19T10:48:04Z

mava/environment_loop.py

+
+            # We need to get the latest counts if we are using eval intervals.
+            if environment_loop_schedule:
+                self._executor.update()


Is this update not already performed in the run_episode call? Why are we doing it again here? If we want to force an update (for environment_loop_schedule == True) we need to change the variable client update rate for the evaluator, right?

So this update is important to get the latest counts.

Since these loops run in different processes, the evaluator doesn't run the loop while it is waiting, so its count variables don't get updated (this part doesn't run).

So while the evaluator is waiting, it still needs to call self._executor.update() to get the latest counts from the executor runs.

Not sure if that makes sense?

I see. That makes sense yes. Should there not maybe be a time.sleep somewhere in here? This is so that the evaluator does not constantly speak to the variable_server while waiting. And also self._executor.update() only updates the executor every 1000 steps right? So we maybe need to change that setting to 1 in the case of using an evaluator that has waiting intervals?
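
The waiting behaviour discussed in this thread could be sketched roughly as follows: refresh the counts on every poll, and sleep between polls so the evaluator does not hammer the variable server. All names here (`wait_for_interval`, `poll_seconds`, the injected callables) are assumptions for illustration, not the PR's actual code.

```python
import time
from typing import Callable, Dict


def wait_for_interval(
    update: Callable[[], None],
    get_counts: Callable[[], Dict[str, int]],
    key: str,
    target: int,
    poll_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until counts[key] reaches `target`; return the count seen."""
    while True:
        # Force a refresh so the counts are current even though the
        # evaluator is not running episodes while it waits.
        update()
        current = get_counts().get(key, 0)
        if current >= target:
            return current
        # Pause between polls to limit traffic to the variable source.
        sleep(poll_seconds)
```

For this to work as intended, `update` has to fetch fresh values on every call, which is why the update period matters when eval intervals are used.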

DriesSmit

Thanks @KaleabTessera, the changes look great 🚀 Just see my few comments.

KaleabTessera · 2021-11-22T17:27:35Z

From my side this can be merged in. The issues with the GitHub Actions / package versions are resolved in this PR - #310.

DriesSmit · 2021-11-30T09:23:58Z

mava/utils/training_utils.py

 from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple, Union

 import sonnet as snt
 import tensorflow as tf
 import trfl


+def non_blocking_sleep(time_in_seconds: int) -> None:


Nice. I like this.
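
Only the signature of `non_blocking_sleep` is visible in this diff. One plausible body, shown purely as an assumption, sleeps in one-second slices so the process is never stuck inside a single long `time.sleep` call.

```python
import time


def non_blocking_sleep(time_in_seconds: int) -> None:
    """Sleep for `time_in_seconds` in short slices (assumed behaviour)."""
    for _ in range(time_in_seconds):
        # Each slice is short, so the process gets a chance to react
        # (e.g. to termination) between slices.
        time.sleep(1)
```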

Co-authored-by: Dries <[email protected]>

DriesSmit

Great work @KaleabTessera 🚀

KaleabTessera added 10 commits October 12, 2021 13:32

feat: Added eval scheduling for single executors.

6f1bb89

Merge remote-tracking branch 'origin/feature/mava-scaling' into featu…

f4887bb

…re/eval-intervals

Fix: Updated imports.

b6745aa

feat: Eval schedule for multiple executors.

942a546

feat: Added eval scheduling to all systems and updated typing for new…

37d04a8

… mypy and numpy versions.

fix: Updated PZ wrapper.

1a100ce

fix: Updated tests.

58e2795

fix: Updated systems to pass eval var.

6bf637f

feat: Added eval interval example.

362f1dd

fix: Passed interval and evaluator vars to recurrent madqn.

91931e6

KaleabTessera requested review from arnupretorius and DriesSmit as code owners October 18, 2021 12:21

fix: Passed interval and evaluator vars to diff executors.

568b439

DriesSmit assigned DriesSmit and unassigned DriesSmit Oct 21, 2021

Base automatically changed from feature/mava-scaling to develop October 25, 2021 09:43

DriesSmit assigned DriesSmit and KaleabTessera and unassigned DriesSmit Oct 25, 2021

DriesSmit added the enhancement New feature or request label Oct 25, 2021

KaleabTessera and others added 4 commits November 3, 2021 09:28

Merge branch 'develop' into feature/eval-intervals

d85fe4b

fix: Minor mypy fixes.

76a9095

chore: Updated example.

c393002

Merge branch 'develop' into feature/eval-intervals

52cd450

arnupretorius previously approved these changes Nov 18, 2021

DriesSmit reviewed Nov 19, 2021

DriesSmit reviewed Nov 19, 2021

DriesSmit reviewed Nov 19, 2021

chore: Updated return types of vars.

40fb8ae

KaleabTessera dismissed arnupretorius’s stale review via 40fb8ae November 22, 2021 15:45

KaleabTessera added 9 commits November 22, 2021 17:54

ci: Switched to github runtime.

1f27905

ci: Added flatland fix to ci.

06797b2

ci: Tmp pyparsing fix.

c779429

ci: Tmp flatland fix.

67e0ce9

fix: Updated package versions.

f86e09a

fix: Specify lp verison.

dbec516

ci: Removed build for py3.6.

c07391b

chore: Meltingpot wrapper mypy fixes.

db0950e

chore: More meltingpot wrapper mypy fixes.

ef42512

KaleabTessera added 3 commits November 22, 2021 19:40

ci: Updated ci mypy.

9b9ca90

feat: Set executor update period to 0, when using eval intervals.

643b9c6

Merge remote-tracking branch 'origin/feature/eval-intervals' into fea…

eba5cde

…ture/eval-intervals