[Feature Request] Support for distributional-DQN algorithms (C51, Rainbow) #2269

roger-creus · 2024-07-05T00:49:02Z

Is the Distributional Q-Value Actor currently fully supported? If so, are there any plans to add C51 and, more importantly, Rainbow to the list of sota-implementations?

vmoens · 2024-07-05T10:49:05Z

We have a version of this here
https://pytorch.org/rl/stable/reference/generated/torchrl.objectives.DistributionalDQNLoss.html#torchrl.objectives.DistributionalDQNLoss
but I don't think we have an official version of Rainbow yet (although this is the first thing we had in the lib - for some reason we never made a script that was high-quality enough to be made public!)
LMK if you need further help with it!

roger-creus · 2024-07-05T15:49:34Z

I have implemented a first version of Rainbow containing all tricks! (Dueling DQN, Distributional, Prioritized Experience, etc.) and I am now running some preliminary experiments to debug its performance and make sure it works well.

However, I had to change this line to `Tz = reward + (1 - terminated.to(reward.dtype)) * discount.unsqueeze(-1) * support.repeat(batch_size, 1)`.

Otherwise I would get shape errors. Let me know if this makes sense!
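
For readers following along, here is a minimal sketch of the shapes involved in that expression. It is not the library's code: the tensor shapes (a per-sample `discount` of shape `[batch_size]`, as a multi-step return might produce, and a 1-D `support` of shape `[n_atoms]`) and all the values are assumptions chosen only to illustrate the broadcasting.

```python
import torch

batch_size, n_atoms = 4, 51

# Hypothetical inputs; shapes and values are assumptions for illustration.
reward = torch.randn(batch_size, 1)                         # [batch_size, 1]
terminated = torch.zeros(batch_size, 1, dtype=torch.bool)   # [batch_size, 1]
terminated[0] = True                                        # first sample is terminal
discount = torch.full((batch_size,), 0.99)                  # [batch_size], assumed per-sample
support = torch.linspace(-10.0, 10.0, n_atoms)              # [n_atoms], atom locations

# Multiplying [batch_size] by [n_atoms] directly cannot broadcast.
try:
    discount * support
    naive_ok = True
except RuntimeError:
    naive_ok = False

# The expression from the comment above: add a trailing dim to discount
# and tile the support once per batch element.
Tz = reward + (1 - terminated.to(reward.dtype)) * discount.unsqueeze(-1) * support.repeat(batch_size, 1)

# Under these assumed shapes, plain broadcasting gives the same result
# without materializing the repeated support.
Tz_broadcast = reward + (1 - terminated.to(reward.dtype)) * discount.unsqueeze(-1) * support

print(Tz.shape, naive_ok)
```

With these shapes, `Tz` comes out as `[batch_size, n_atoms]`, and the row for a terminated transition collapses to its reward on every atom. Whether `discount` actually arrives with shape `[batch_size]` in the loss depends on the transform pipeline, so treat that as an assumption to check.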

roger-creus added the enhancement New feature or request label Jul 5, 2024

roger-creus assigned vmoens Jul 5, 2024

roger-creus mentioned this issue Jul 5, 2024

[BugFix] Fixed shape for MultiStep returns + Distributional loss #2270

Closed

6 tasks
