
[bug?] sim data collection combines actions with next_observation instead of observation on which the action is based #636

Open
tlpss opened this issue Jan 14, 2025 · 0 comments

Comments

@tlpss (Contributor) commented Jan 14, 2025

In the data collection script for sim envs, the action (a_t) is determined based on the current observation (o_t). However, when the step method is called, the observation is overwritten with the new observation o_{t+1}, and the pair (a_t, o_{t+1}) is recorded as a demonstration step. I believe this is a mistake: the recorded step should pair a_t with the observation it was computed from, o_t. A simplified and corrected data collection loop is given below:

    for _ in range(n_episodes):
        obs, info = env.reset()
        done = False
        dataset_recorder.start_episode()
        while not done:
            # a_t is computed from the current observation o_t
            action = agent_callable(env)
            new_obs, reward, termination, truncation, info = env.step(action)
            done = termination or truncation
            # record the pair (o_t, a_t), not (o_{t+1}, a_t)
            dataset_recorder.record(obs, action, reward, done, info)
            obs = new_obs
        dataset_recorder.save_episode()

Note that I also call reset explicitly, to avoid storing the last observation together with an action that is never executed (the autoreset ignores the action if step is called on an environment that needs to be reset).
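
For contrast, the problematic pattern described above boils down to something like this (a simplified sketch, not the actual code from the repository):

    while not done:
        action = agent_callable(env)  # a_t is chosen based on the current observation o_t
        obs, reward, termination, truncation, info = env.step(action)  # obs is now o_{t+1}
        done = termination or truncation
        dataset_recorder.record(obs, action, reward, done, info)  # stores (a_t, o_{t+1})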

I have not run the script; I was merely looking for code that would allow me to collect demonstrations for my own gym Env and store them in the LeRobot dataset format.
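
For completeness, a minimal stub of the dataset_recorder interface assumed in the loop above could look like the sketch below (hypothetical names; the actual LeRobot dataset writer has its own API, which I have not checked here):

    class DatasetRecorder:
        """Hypothetical recorder matching the methods used in the corrected loop."""

        def __init__(self):
            self.episodes = []
            self._current_episode = []

        def start_episode(self):
            self._current_episode = []

        def record(self, obs, action, reward, done, info):
            # Store the observation the action was computed from, i.e. the pair (o_t, a_t).
            self._current_episode.append(
                {"observation": obs, "action": action, "reward": reward, "done": done, "info": info}
            )

        def save_episode(self):
            self.episodes.append(self._current_episode)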
