[RLlib] Flatten dict-typed observations before comparing them. #49740

Closed
wants to merge 0 commits into from

@0Pinky0 0Pinky0 commented Jan 9, 2025

When a new episode is added to a replay buffer, the concat_episode method compares its first observation with the last observation of the previous episode in the buffer to make sure the two are the same:

assert np.all(other.observations[0] == self.observations[-1])

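Currently, these observations are assumed to be plain ndarrays. With dict-typed observations, the == comparison has to evaluate the truth value of each element-wise array result, which raises "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()".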
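The failure can be reproduced without RLlib. The snippet below is a minimal sketch using only NumPy; the observation keys and the `compare_like_assert` helper are hypothetical and exist only to mimic the expression inside the assert.

```python
import numpy as np

# Two dict observations with identical contents (hypothetical keys).
last_obs = {"position": np.array([0.0, 1.0]), "velocity": np.array([0.5, -0.5])}
first_obs = {"position": np.array([0.0, 1.0]), "velocity": np.array([0.5, -0.5])}


def compare_like_assert(a, b):
    """Evaluate `np.all(a == b)` and return (result, error message or None)."""
    try:
        return bool(np.all(a == b)), None
    except ValueError as exc:
        return None, str(exc)


# Plain ndarrays compare element-wise, so np.all() works as intended.
array_result, array_error = compare_like_assert(
    last_obs["position"], first_obs["position"]
)

# Dicts compare value by value; each value comparison yields an array whose
# truth value is ambiguous, so the dict comparison itself raises ValueError.
dict_result, dict_error = compare_like_assert(last_obs, first_obs)

print(array_result, array_error)
print(dict_result, dict_error)
```

Note that the error comes from the `==` between the two dicts, before `np.all` is ever called.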

This PR uses the flatten_inputs_to_1d_tensor function from rllib.utils.numpy to flatten dict-typed observations before the comparison.
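The flatten-then-compare idea can be sketched with NumPy alone. The `flatten_obs` and `same_obs` helpers below are hypothetical, simplified stand-ins written for illustration; they are not RLlib's flatten_inputs_to_1d_tensor and do not reproduce its handling of observation spaces.

```python
import numpy as np


def flatten_obs(obs):
    """Recursively flatten a (possibly nested) dict/tuple observation to 1-D."""
    if isinstance(obs, dict):
        # Sort keys so both observations are flattened in the same order.
        parts = [flatten_obs(obs[key]) for key in sorted(obs)]
    elif isinstance(obs, (tuple, list)):
        parts = [flatten_obs(item) for item in obs]
    else:
        # Leaf: any array-like or scalar becomes a flat float array.
        return np.asarray(obs, dtype=np.float64).reshape(-1)
    return np.concatenate(parts) if parts else np.empty(0)


def same_obs(a, b):
    """Compare two observations after flattening both to 1-D arrays."""
    flat_a, flat_b = flatten_obs(a), flatten_obs(b)
    return flat_a.shape == flat_b.shape and bool(np.all(flat_a == flat_b))


last_obs = {"position": np.array([0.0, 1.0]), "velocity": np.array([0.5, -0.5])}
first_obs = {"position": np.array([0.0, 1.0]), "velocity": np.array([0.5, -0.5])}
other_obs = {"position": np.array([0.0, 2.0]), "velocity": np.array([0.5, -0.5])}

assert same_obs(last_obs, first_obs)
assert not same_obs(last_obs, other_obs)
```

One trade-off of this simplified version: flattening discards structure, so two observations whose leaves have different shapes but the same concatenated values would compare equal. A stricter check would compare the nested structures leaf by leaf.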

@jcotant1 jcotant1 added the rllib RLlib related issues label Jan 9, 2025
@0Pinky0 0Pinky0 closed this Jan 10, 2025