feature: working version of importance sampling on feedforward madqn. #275

jcformanek · 2021-07-19T13:46:27Z

What?

Implemented importance sampling (IS) / prioritized experience replay for feedforward MADQN.

Why?

Importance sampling will hopefully improve learning performance.

How?

After each learning step, we compute new priorities for the samples we drew from the replay buffer, based on their Q-value errors. The mutate_priorities() function then updates those priorities in the Reverb table.
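A minimal sketch of the priority update described above, not the code from this PR. It assumes the priority of each sample is the absolute Q-value error plus a small constant, which the PR does not spell out. The names `td_error_priorities` and `InMemoryTable` are hypothetical, and the table is an in-memory stand-in rather than a real Reverb table.

```python
from typing import Dict, List, Sequence


def td_error_priorities(
    q_values: Sequence[float],
    targets: Sequence[float],
    epsilon: float = 1e-6,
) -> List[float]:
    """Return one priority per sample: |target - Q| + epsilon.

    Assumption: priorities are absolute Q-value errors. The small
    epsilon keeps zero-error samples from never being replayed.
    """
    return [abs(t - q) + epsilon for q, t in zip(q_values, targets)]


class InMemoryTable:
    """Hypothetical stand-in for a replay table keyed by sample id."""

    def __init__(self) -> None:
        self.priorities: Dict[int, float] = {}

    def mutate_priorities(self, updates: Dict[int, float]) -> None:
        # Overwrite the stored priority of every sampled key.
        self.priorities.update(updates)


# After a learning step: keys of the sampled items, their Q-values and targets.
keys = [11, 22, 33]
new_priorities = td_error_priorities([1.0, 2.0, 0.5], [1.5, 2.0, -0.5])

table = InMemoryTable()
table.mutate_priorities(dict(zip(keys, new_priorities)))
```

In the actual system the update would go through the Reverb client rather than this stub. In a multi-agent setting the per-agent errors would also need to be combined into a single priority per sample, and how the PR does that is not stated here.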

Extra

For now IS only works in feedforward madqn. I have also tested to make sure all of the other systems that inherit from feedforward MADQN still work. Recurrent MADQN, MADQN with comms, Dial, VDN and Qmix all still work after I made these changes.

Because so many systems inherit from feedforward MADQN, it is quite hard to make changes without breaking the other systems. So my strategy moving forward is to make very incremental upgrades to MADQN and to ensure at each step that nothing breaks the other systems.

DriesSmit

This looks great, thanks @jcformanek 🔥 So this PR should go in after the new adder PR?

KaleabTessera

Looks really great @jcformanek ! Thanks so much for this. The only thing left from my side is to benchmark this. After that, I am happy to approve.

jcformanek · 2021-07-26T08:56:20Z

@KaleabTessera: I am going to benchmark this on Flatland now.

arnupretorius

Nice @jcformanek! 🔥

feature: working version of importance sampling on feedforward madqn.

c456210

jcformanek added the enhancement New feature or request label Jul 19, 2021

jcformanek requested a review from arnupretorius July 19, 2021 13:46

jcformanek self-assigned this Jul 19, 2021

jcformanek requested review from DriesSmit and KaleabTessera as code owners July 19, 2021 13:46

Merge branch 'develop' into feature/feedforward-madqn-importance-sampling

a4ebf72

DriesSmit approved these changes Jul 21, 2021

Merge branch 'develop' into feature/feedforward-madqn-importance-sampling

f637102

KaleabTessera reviewed Jul 26, 2021

arnupretorius approved these changes Jul 28, 2021

arnupretorius merged commit 0608255 into develop Jul 28, 2021

arnupretorius deleted the feature/feedforward-madqn-importance-sampling branch July 28, 2021 10:52
