Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: working version of importance sampling on feedforward madqn. #275

Merged
merged 3 commits into from
Jul 28, 2021

Conversation

jcformanek
Copy link
Contributor

@jcformanek jcformanek commented Jul 19, 2021

What?

Implemented importance sampling (IS) / prioritized experience replay for feedforward madqn.

Why?

Hopefully importance sampling will assist performance.

How?

After each learning step we compute new priorities for the samples we drew from the replay buffer. This is done using Q-value errors. The mutate_priorities() function is used to update the priorities in the reverb table.

Extra

For now IS only works in feedforward madqn. I have also tested to make sure all of the other systems that inherit from feedforward MADQN still work. Recurrent MADQN, MADQN with comms, Dial, VDN and Qmix all still work after I made these changes.

Because so many systems inherit from feedforward MADQN, it is quite hard to make changes without breaking the other systems. So my strategy mocing forward is going to be to make very incremental upgrades to MADQN and ensure at each step that nothing breaks the other systems.

@jcformanek jcformanek added the enhancement New feature or request label Jul 19, 2021
@jcformanek jcformanek requested a review from arnupretorius July 19, 2021 13:46
@jcformanek jcformanek self-assigned this Jul 19, 2021
Copy link
Contributor

@DriesSmit DriesSmit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great thanks @jcformanek 🔥 So this PR should go in after the new adder PR?

Copy link
Contributor

@KaleabTessera KaleabTessera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really great @jcformanek ! Thanks so much for this. The only thing left from my side is to benchmark this. After that, I am happy to approve.

@jcformanek
Copy link
Contributor Author

jcformanek commented Jul 26, 2021

@KaleabTessera: I am going to benchmark this on Flatland now.

Copy link
Collaborator

@arnupretorius arnupretorius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @jcformanek! 🔥

@arnupretorius arnupretorius merged commit 0608255 into develop Jul 28, 2021
@arnupretorius arnupretorius deleted the feature/feedforward-madqn-importance-sampling branch July 28, 2021 10:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants