Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ATARI environments - part1 #277

Merged
merged 88 commits into from
Apr 12, 2023
Merged

ATARI environments - part1 #277

merged 88 commits into from
Apr 12, 2023

Conversation

JulienT01
Copy link
Collaborator

@JulienT01 JulienT01 commented Feb 15, 2023

PART1.
Add and update code to train on atari games (DQN and PPO on ALE/Breakout and ALE/Freewy):

  • Udate 'agents/torch/utils/models.py' to have better management of the size to connect the CNN and the head (MLP). And to manage the "multi-batch" dimensions. Atari wrapper give input by batch (for the dynamics), and DQN use chunk, so we have to merge this 2 kind of batch in 1, make the forward, then split the result in the previous batches
  • update 'agents/torch/utils/training.py' to manage the 'transpose_obs' parameters in automatic model_config (and don't overwritte previous settings)

TODO part 2 (Other PR) : #285

  • update atari_make (in gym_make.py) and 'DQN agent' to generate and manage vectorized environments.
  • update A2C and PPO agent to manage the vectorized environments

@JulienT01 JulienT01 added documentation Improvements or additions to documentation enhancement New feature or request dependencies Pull requests that update a dependency file ready for review labels Feb 15, 2023
Copy link
Collaborator

@mmcenta mmcenta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one note: I need help understanding what the ScalarizeEnvWrapper is doing here. It feels like it really only should be used when n_envs = 1; otherwise, you're giving the same action to potentially different environments, right?

Otherwise, LGTM.

def convolutions(self, x):
x = x.float()
if len(x.shape) == 3:
if (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move this comment to the line above so black doesn't split this into three lines

scalarize = True

if "atari_wrappers_dict" in kwargs.keys():
atari_wrappers_dict = kwargs["atari_wrappers_dict"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just do atari_wrappers_dict = kwargs.pop('atari_wrappers_dict') for the same effect.

@@ -32,14 +33,59 @@ def gym_make(id, wrap_spaces=False, **kwargs):
return Wrapper(env, wrap_spaces=wrap_spaces)


def atari_make(id, scalarize=True, **kwargs):
def atari_make(id, scalarize=None, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why True to None? Shouldn't it be False?

Copy link
Collaborator

@TimotheeMathieu TimotheeMathieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small comment, otherwise LGTM.

@@ -32,14 +33,58 @@ def gym_make(id, wrap_spaces=False, **kwargs):
return Wrapper(env, wrap_spaces=wrap_spaces)


def atari_make(id, scalarize=True, **kwargs):
def atari_make(id, scalarize=False, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small docstring?

@JulienT01 JulienT01 merged commit 5031054 into rlberry-py:main Apr 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation enhancement New feature or request ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants