Feature/recurrent and multiple trainer MAPPO #326
Conversation
Thanks so much @DriesSmit 👐 This is really great! Just see my minor comments.
Thanks @DriesSmit! 👍 Did a quick smoke review and left minor comments. Happy with the benchmarking on this; it will be useful for comparing with the Jax systems.
Thanks @DriesSmit, great work on getting this in 🔥 👐 🔥
Just a minor comment on adding docstrings for fix_sampler.
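For illustration only, a Google-style docstring skeleton of the kind the comment asks for; the parameter names below are placeholders, not fix_sampler's real signature (see the PR diff for that):

```python
# Placeholder signature for illustration; the real fix_sampler is in the PR diff.
def fix_sampler(sampler, samples_per_insert):
    """Fix the sampler used when reading experience from the replay table.

    Args:
        sampler: placeholder name for the sampler configuration to adjust.
        samples_per_insert: placeholder name for the sample-to-insert ratio.

    Returns:
        The adjusted sampler configuration.
    """
    return sampler
```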
Commit 60eb054
Thanks Dries! 🔥
What?
Add recurrent training capabilities to MAPPO. Migrate MAPPO to use the centralised variable source, which enables training with multiple trainers.
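To make the multi-trainer point concrete, here is a minimal sketch of the centralised variable source pattern, not Mava's actual API; all names are illustrative. A single thread-safe store holds the network variables, each trainer pushes its updates to it, and executors pull the latest values from it:

```python
import threading


class VariableSource:
    """Central, thread-safe store for network variables (illustrative only)."""

    def __init__(self, variables):
        self._variables = dict(variables)
        self._lock = threading.Lock()

    def get_variables(self, names):
        # Executors (and trainers) pull the latest values from here.
        with self._lock:
            return {name: self._variables[name] for name in names}

    def set_variables(self, updates):
        # Each trainer pushes its updated variables back to the source,
        # so multiple trainers can contribute to the same set of networks.
        with self._lock:
            self._variables.update(updates)
```

Because all trainers read from and write to the same source, adding trainers is just a matter of launching more trainer processes against it.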
Why?
This update makes MAPPO usable in the recurrent training setting and with multiple trainers.
How?
Adapted the MAPPO trainer to work in the recurrent setting, added a recurrent executor to MAPPO, and migrated the old MAPPO code to the new scaling version of Mava.
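As a rough sketch of the recurrent-executor idea (class and method names here are illustrative, not the classes added in this PR): the executor threads the policy's RNN state through each episode, resetting it at episode boundaries, so the trainer can later unroll the policy over full sequences rather than single transitions:

```python
class RecurrentExecutor:
    """Carries the policy's recurrent state across steps (illustrative only)."""

    def __init__(self, policy, initial_state):
        # `policy` maps (observation, state) -> (action, new_state).
        self._policy = policy
        self._initial_state = initial_state
        self._state = initial_state

    def observe_first(self, observation):
        # Reset the recurrent state at the start of every episode.
        self._state = self._initial_state

    def select_action(self, observation):
        # Step the recurrent policy and keep the new state for the next step.
        action, self._state = self._policy(observation, self._state)
        return action
```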
Extra
.