Fix/madqn #362
Merge develop.
Amazing work @jcformanek!! 🥇 🔥 Let's get this in 🙂
Approved again 👌
Good stuff @jcformanek!
Thanks guys. Sorry that this was a "Big Bang PR". I will make sure PRs are smaller in future 👍
Thanks @jcformanek, 🔥 🔥 🔥
Great work @jcformanek! 🔥
What?
Did major refactoring to the MADQN code.
Implemented a working system for value-decomposition-based algorithms such as VDN and QMIX.
Created a SMAC wrapper so that we can compare with pymarl since the PZ SMAC wrapper is functionally different.
Created environment wrappers that concat prev actions and agent ids to the observations.
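The observation wrappers listed above can be illustrated with a minimal sketch. It is not the code from this PR: the function name `augment_observations`, the dict-of-lists observation format, the one-hot encoding, and the concatenation order (observation, then previous action, then agent id) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def one_hot(index: int, size: int) -> List[float]:
    """Return a one-hot vector of length `size` with a 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def augment_observations(
    observations: Dict[str, List[float]],
    prev_actions: Dict[str, Optional[int]],
    num_actions: int,
) -> Dict[str, List[float]]:
    """Append a one-hot previous action and a one-hot agent id to each observation.

    Hypothetical helper: the ordering and encoding are assumptions,
    not the behaviour of the wrappers added in this PR.
    """
    agent_names = sorted(observations)
    augmented = {}
    for agent_index, name in enumerate(agent_names):
        prev = prev_actions.get(name)
        # On the first step there is no previous action, so use all zeros.
        prev_vec = one_hot(prev, num_actions) if prev is not None else [0.0] * num_actions
        id_vec = one_hot(agent_index, len(agent_names))
        augmented[name] = list(observations[name]) + prev_vec + id_vec
    return augmented


obs = {"agent_0": [0.5], "agent_1": [0.7]}
result = augment_observations(obs, {"agent_0": 1}, num_actions=2)
```

Appending the agent id lets agents that share network weights behave differently, and appending the previous action gives a recurrent policy access to what it last did, which is the kind of input setup commonly used when comparing against pymarl.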
Why?
Improved reliability and performance in the MADQN system.
VDN and QMIX were not working before.
SMAC wrapper, prev action wrapper and agent id wrapper let us compare better with pymarl.
How?
MADQN refactor.
New Value Decomposition system inherits from MADQN with an additional, interchangeable mixer component.
Extra
[Any extra information.]
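As a rough illustration of the design described under "How?", the sketch below shows a value decomposition system that subclasses a MADQN-like base and takes a swappable mixer. All class and method names (`MADQNSketch`, `ValueDecompositionSketch`, `AdditiveMixer`, `MonotonicMixer`, `training_values`) are hypothetical and do not come from this PR. The monotonic mixer is a deliberately simplified, single-layer stand-in for QMIX's state-conditioned mixing network, not a faithful implementation.

```python
from typing import List, Protocol, Sequence


class Mixer(Protocol):
    """Anything that maps per-agent Q-values and a global state to one joint Q-value."""

    def __call__(self, agent_qs: Sequence[float], state: Sequence[float]) -> float: ...


class AdditiveMixer:
    """VDN-style mixing: the joint Q-value is the sum of the agents' Q-values."""

    def __call__(self, agent_qs: Sequence[float], state: Sequence[float]) -> float:
        return float(sum(agent_qs))


class MonotonicMixer:
    """Simplified QMIX-style mixing with state-conditioned, non-negative weights.

    Taking the absolute value of each weight keeps the joint Q-value
    monotonically non-decreasing in every agent's Q-value.
    """

    def __init__(self, weight_matrix: List[List[float]], bias_vector: List[float]):
        self.weight_matrix = weight_matrix  # shape: [num_agents][state_dim]
        self.bias_vector = bias_vector      # shape: [state_dim]

    def __call__(self, agent_qs: Sequence[float], state: Sequence[float]) -> float:
        weights = [abs(sum(w * s for w, s in zip(row, state))) for row in self.weight_matrix]
        bias = sum(b * s for b, s in zip(self.bias_vector, state))
        return sum(w * q for w, q in zip(weights, agent_qs)) + bias


class MADQNSketch:
    """Hypothetical base: each agent is trained on its own Q-value."""

    def training_values(self, agent_qs: Sequence[float], state: Sequence[float]) -> List[float]:
        return list(agent_qs)


class ValueDecompositionSketch(MADQNSketch):
    """Hypothetical subclass: agents are trained through a single mixed joint Q-value."""

    def __init__(self, mixer: Mixer):
        self.mixer = mixer

    def training_values(self, agent_qs: Sequence[float], state: Sequence[float]) -> List[float]:
        return [self.mixer(agent_qs, state)]


vdn = ValueDecompositionSketch(AdditiveMixer())
qmix_like = ValueDecompositionSketch(MonotonicMixer([[1.0, -2.0], [0.5, 0.5]], [0.1, 0.2]))
```

Keeping the mixer behind a small callable interface means that switching between a VDN-style and a QMIX-style system only changes the object passed to the constructor, while the rest of the training path inherited from the base class stays the same.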