[RLlib] Flatten dict-typed observations before comparing them. #49740

0Pinky0 · 2025-01-09T10:34:25Z

When a new episode is added to a replay buffer, its first observation is compared with the last observation of the previous episode in the buffer, inside the concat_episode method, to make sure the two are the same observation:

assert np.all(other.observations[0] == self.observations[-1])

Currently, these observations are assumed to be plain ndarrays, so dict-typed observations raise "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()".
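For illustration, the failure can be reproduced with NumPy alone. This is a minimal sketch, not the RLlib code path: the observation keys `pos` and `vel` and the helper `compare_raw` are made up for this example.

```python
import numpy as np

# Two dict-typed observations with equal contents but distinct array objects
# (the keys "pos" and "vel" are illustrative, not taken from RLlib).
last_obs = {"pos": np.array([0.0, 1.0]), "vel": np.array([2.0, 3.0])}
first_obs = {"pos": np.array([0.0, 1.0]), "vel": np.array([2.0, 3.0])}


def compare_raw(a, b):
    """Mimic the assertion's comparison; return the error message if it raises."""
    try:
        return bool(np.all(a == b))
    except ValueError as exc:
        return str(exc)


# Plain ndarrays compare element-wise and reduce cleanly.
ndarray_result = compare_raw(np.array([1, 2]), np.array([1, 2]))

# Dict equality needs a single truth value per entry, which a
# multi-element array comparison cannot provide.
dict_result = compare_raw(first_obs, last_obs)
print(ndarray_result)
print(dict_result)
```

The dict case fails inside `a == b` itself, before `np.all` is ever reached, because Python's dict comparison asks each per-key array comparison for a single boolean.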

This PR uses the flatten_inputs_to_1d_tensor function from rllib.utils.numpy to flatten dict-typed observations before the comparison.
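The idea of flattening before comparing can be sketched as follows. This is only an illustration under stated assumptions: `flatten_obs` and `same_obs` are hypothetical NumPy-only stand-ins written for this example. They are not RLlib's flatten_inputs_to_1d_tensor and do not reproduce its signature or its exact behavior.

```python
import numpy as np


def flatten_obs(obs):
    """Hypothetical stand-in: flatten a (nested) dict/tuple observation to 1D."""
    if isinstance(obs, dict):
        # Sort keys so both observations are flattened in the same order.
        parts = [flatten_obs(obs[key]) for key in sorted(obs)]
    elif isinstance(obs, (tuple, list)):
        parts = [flatten_obs(item) for item in obs]
    else:
        # Leaf: any array-like or scalar becomes a 1D float array.
        return np.asarray(obs, dtype=np.float64).reshape(-1)
    return np.concatenate(parts) if parts else np.empty(0)


def same_obs(a, b):
    """Compare two observations after flattening them to 1D arrays."""
    flat_a, flat_b = flatten_obs(a), flatten_obs(b)
    return flat_a.shape == flat_b.shape and bool(np.all(flat_a == flat_b))


last_obs = {"pos": np.array([0.0, 1.0]), "vel": np.array([2.0, 3.0])}
first_obs = {"pos": np.array([0.0, 1.0]), "vel": np.array([2.0, 3.0])}
other_obs = {"pos": np.array([0.0, 1.0]), "vel": np.array([2.0, 9.0])}

assert same_obs(first_obs, last_obs)
assert not same_obs(other_obs, last_obs)
# Plain ndarray observations keep working unchanged.
assert same_obs(np.array([1, 2]), np.array([1, 2]))
```

One trade-off of this approach is that flattening discards structure, so two observations with different layouts but identical flattened values would compare equal. For a consistency check between consecutive episode chunks from the same environment, that is probably acceptable.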

0Pinky0 requested review from sven1977 and simonsays1980 as code owners January 9, 2025 10:34

jcotant1 added the rllib RLlib related issues label Jan 9, 2025

0Pinky0 closed this Jan 10, 2025

0Pinky0 force-pushed the master branch from 70d62f9 to 7c2a200 January 10, 2025 06:23
