Feature/mava reproducibility and PZ wrapper fix #296
Thanks, @KaleabTessera 🔥 The changes look great. Just see my few comments. I really like all the small code cleanups you did 😄
"critics": critic_networks,
"observations": observation_networks,
}
return make_default_networks_maddpg(
Did we merge MADDPG and MAD4PG's networks? It does seem like a great idea as much of the code is shared 👍 Thanks @KaleabTessera 😄 It might be worth it to just check if both still work as expected.
I will run some benchmarks before merging this in 👍
mava/systems/tf/maddpg/networks.py
Outdated
# The multiplexer concatenates the observations/actions.
networks.CriticMultiplexer(),
networks.LayerNormMLP(
list(critic_networks_layer_sizes[key]) + [1],
Should the + [1] be here in the case of MAD4PG? Probably not, right?
Combining the networks for MADDPG and MAD4PG looks great thanks @KaleabTessera 🙌
Ah yes, that is a bug, thanks @DriesSmit !
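For context on the bug noted above: appending `[1]` gives the critic MLP a single scalar output, which suits a MADDPG critic that predicts one Q-value. A D4PG-style critic is distributional, so the scalar layer would presumably be dropped in favour of a separate distributional head. The following is only a minimal sketch of that branching, not the fix applied in this PR; the helper `critic_mlp_sizes` and the `distributional` flag are hypothetical names introduced for illustration.

```python
from typing import List, Sequence


def critic_mlp_sizes(hidden_sizes: Sequence[int], distributional: bool) -> List[int]:
    """Return layer sizes for the critic MLP (hypothetical helper).

    Assumption: a scalar critic (MADDPG) ends in a size-1 layer, while a
    distributional critic (MAD4PG) leaves the output to a separate head.
    """
    sizes = list(hidden_sizes)
    if not distributional:
        # Scalar Q-value output for the non-distributional critic.
        sizes.append(1)
    return sizes


maddpg_sizes = critic_mlp_sizes((512, 512, 256), distributional=False)
mad4pg_sizes = critic_mlp_sizes((512, 512, 256), distributional=True)
print(maddpg_sizes)
print(mad4pg_sizes)
```

The layer sizes used here are arbitrary example values.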
from acme.tf.networks.continuous import ResidualLayernormWrapper


def get_initialization(seed: Optional[int] = None) -> Any:
What makes all these new components different from Acme? Is it to be able to use a seed?
Yep. For most of these it is the seed. The networks are also more configurable, i.e. layer norm and activations can be configured.
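To illustrate why threading a seed through the network components helps reproducibility, here is a minimal, library-free sketch. It does not reproduce the PR's `get_initialization`; the function `init_weights` is a hypothetical stand-in, and a uniform range of `[-scale, scale]` is an assumption made only for the example.

```python
import random
from typing import List, Optional


def init_weights(n: int, scale: float = 1e-4, seed: Optional[int] = None) -> List[float]:
    """Draw n initial weights uniformly from [-scale, scale] (hypothetical helper).

    A dedicated generator keeps the draw independent of global random state,
    so the same seed always yields the same weights.
    """
    rng = random.Random(seed)
    return [rng.uniform(-scale, scale) for _ in range(n)]


a = init_weights(4, seed=42)
b = init_weights(4, seed=42)
c = init_weights(4, seed=7)
print(a == b)  # same seed, same weights
print(a == c)  # different seed, different weights
```

With `seed=None` the generator is seeded from system entropy, so repeated runs would generally differ, which is the behaviour the seed argument is meant to avoid.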
Thanks @KaleabTessera! 💪 Looks great!
Happy to merge once we confirm the reproducibility of runs. 😄
def __init__(
    self, output_size: int, scale: float = 1e-4, seed: Optional[int] = None
):
    """[summary]
Summary here :)
Lol, fixed.
Thanks for the tests and confirmation runs @KaleabTessera 🔥 This is great! 🥳
What?
Why?
How?
Extra