Gello dataset converter #575

Open
tlpss opened this issue Dec 13, 2024 · 6 comments

Comments

@tlpss
Contributor

tlpss commented Dec 13, 2024

I made a converter for the Gello dataset format (pickles containing dicts with all the observations).

If this is of interest, I am willing to contribute it back here.

The current code can be found here. It needs some cleanup and maybe a convenient way to specify the mapping of dict keys in case you have a different number of cameras or other sensors. Wanted to see if there is any interest in this, before I make the effort to clean it up.
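For readers unfamiliar with the idea, here is a minimal sketch of the kind of key mapping mentioned above. It is not the linked converter: it assumes one pickle file per time step inside an episode directory, and the key names in `KEY_MAP` as well as the helpers `load_episode` and `remap_frame` are hypothetical placeholders to adapt to your own cameras and sensors.

```python
import pickle
from pathlib import Path
from typing import Any

# Hypothetical mapping: keys found in the recorded dicts -> names wanted in the output.
# Adjust it to the number of cameras and sensors in your own setup.
KEY_MAP = {
    "joint_positions": "observation.state",
    "control": "action",
    "wrist_rgb": "observation.images.wrist",
}


def load_episode(episode_dir: Path) -> list[dict[str, Any]]:
    """Load every *.pkl file in a directory, in file-name order (assumed: one dict per step)."""
    frames = []
    for path in sorted(episode_dir.glob("*.pkl")):
        with path.open("rb") as f:
            frames.append(pickle.load(f))
    return frames


def remap_frame(frame: dict[str, Any], key_map: dict[str, str]) -> dict[str, Any]:
    """Keep only the mapped keys and rename them; fail loudly if a source key is absent."""
    missing = [src for src in key_map if src not in frame]
    if missing:
        raise KeyError(f"frame is missing keys: {missing}")
    return {dst: frame[src] for src, dst in key_map.items()}
```

Keeping the mapping in a plain dict means a setup with extra cameras only has to change `KEY_MAP`, not the loading code. Note that `pickle.load` should only be used on files you trust.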

@vmayoral

@tlpss I'd be interested in partnering up on this.

Before settling on a goal, though, I'd love to pick your brain on the following:

  • from my understanding, your current approach entails converting the teleop commands from Gello into something that lerobot can interpret, a posteriori
  • wouldn't it make more sense to integrate the Gello abstraction within lerobot so that it can be used directly as a teleoperator? This latter direction is the one I'm currently looking at.

@tlpss
Contributor Author

tlpss commented Dec 20, 2024

Hi @vmayoral

Thanks for your thoughts.

I was indeed thinking about how to use data collected with the Gello codebase for training policies in Lerobot (integrating gello with lerobot a posteriori, as you described it). This comes from our particular setup: we use GELLO arms to teleop robots, based on the original gello codebase. That codebase stores the interactions in a specific format, and I have written code to convert that dataset format into a Lerobot dataset. This dataset converter is my proposed contribution.

However, your idea of directly integrating GELLO arms into the Lerobot codebase so that it can be used as a teleop solution makes a lot of sense.

I think it is related to the question of what scope the Lerobot framework is supposed to have. We use it as a framework for policy training and keep it separate from the codebase we use to run our robots and collect demonstrations. For us (for now), that is the sweet spot in terms of flexibility. However, I can imagine there are use cases where it makes sense to also provide code for controlling the robot and for teleop within the lerobot framework! If that is indeed within the scope of the repo (@Cadene @aliberts, I think this is a question for you guys), I would certainly be interested in collaborating to add it to the framework! If the maintainers think this is a good idea, I propose we open a separate issue to think it through further and track progress.

@tlpss
Contributor Author

tlpss commented Jan 13, 2025

@aliberts, any thoughts on this?

  1. Are you interested in supporting multiple teleop setups? (either for sim or real robots).
  2. Is there interest in a dataset converter from the GELLO repo format (pickles) to lerobot?

@vmayoral

Regardless of upstream interest or integration, @tlpss, here's the port of gello on top of lerobot's repo: vmayoral@6c65402

We found this tremendously useful, so I thought I'd drop it here in case it helps.

@aliberts
Collaborator

aliberts commented Jan 26, 2025

Hi @tlpss, sorry for the late reply. I've been focusing on the current redesign.

> Are you interested in supporting multiple teleop setups? (either for sim or real robots).

Yes, definitely. We're still thinking about the best way to do that so that the resulting API is easy to use and intuitive but that's on the roadmap, although not at the top of the list for now. cc @Cadene @nepyope

> Is there interest in a dataset converter from the GELLO repo format (pickles) to lerobot?

Also yes. You can PR your conversion file into examples/port_datasets.
You already have an example in that folder and we plan to add more.

The idea is to use the new dataset methods introduced in #461: .add_frame(), .add_episode(), .consolidate() and .push_to_hub(). The push_dataset_to_hub scripts are deprecated, but there's probably a lot of code you already wrote that you can keep (especially for reading the Gello format).

We don't yet have a way to add multiple tasks per episode (that should be coming soon), but you can hack it if you need to. Feel free to ask if you need any help!
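As a rough illustration of the call order described above, here is a sketch of a porting loop. Only the four method names come from the comment; their signatures are assumptions, and `EpisodeWriter`, `RecordingWriter` and `port_episodes` are hypothetical names used so the example runs without lerobot installed. Check the actual dataset class for the real arguments.

```python
from typing import Any, Iterable, Protocol


class EpisodeWriter(Protocol):
    """Hypothetical stand-in for the dataset object; the signatures are assumed."""

    def add_frame(self, frame: dict[str, Any]) -> None: ...
    def add_episode(self) -> None: ...
    def consolidate(self) -> None: ...
    def push_to_hub(self) -> None: ...


def port_episodes(
    dataset: EpisodeWriter,
    episodes: Iterable[list[dict[str, Any]]],
    push: bool = False,
) -> int:
    """Write frames episode by episode, then finalize once; returns the episode count."""
    count = 0
    for frames in episodes:
        for frame in frames:
            dataset.add_frame(frame)  # buffer one time step
        dataset.add_episode()  # close the current episode
        count += 1
    dataset.consolidate()  # finalize after all episodes are written
    if push:
        dataset.push_to_hub()  # optional upload
    return count


class RecordingWriter:
    """In-memory fake that only logs which methods were called, for dry runs."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def add_frame(self, frame: dict[str, Any]) -> None:
        self.calls.append("add_frame")

    def add_episode(self) -> None:
        self.calls.append("add_episode")

    def consolidate(self) -> None:
        self.calls.append("consolidate")

    def push_to_hub(self) -> None:
        self.calls.append("push_to_hub")
```

Running `port_episodes(RecordingWriter(), episodes)` is a cheap way to check the reading side of a converter before touching a real dataset.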

@tlpss
Contributor Author

tlpss commented Jan 29, 2025

@aliberts

Thanks for the reply.

> Yes, definitely. We're still thinking about the best way to do that so that the resulting API is easy to use and intuitive but that's on the roadmap, although not at the top of the list for now. cc @Cadene @nepyope

Ok great, for now I'm collecting the data for our real robots using a different codebase and then converting it to a LeRobotDataset. It would be convenient to have an abstraction for plugging in any real robot setup and any teleop setup, while still being able to use the data collection logic that is already provided here. For now (afaik), the data collection script is pretty tightly coupled to your robot setup?

An abstraction that allows users to create their own 'gym interface' to a real robot setup, and similarly for their teleop system, would help to make things more generic, I believe. Let me know if that is on the roadmap; I'm certainly interested in trying it out and possibly also in helping to implement it!
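To make the suggestion concrete, here is one possible shape for such an abstraction, sketched with structural typing. None of this is lerobot's API: `Robot`, `Teleoperator`, `collect_episode` and the two fake classes are hypothetical names, and a real loop would also need timing, safety limits and camera handling.

```python
from typing import Any, Protocol


class Robot(Protocol):
    """Hypothetical minimal interface to a real (or simulated) robot."""

    def get_observation(self) -> dict[str, Any]: ...
    def send_action(self, action: list[float]) -> None: ...


class Teleoperator(Protocol):
    """Hypothetical minimal interface to a teleop device such as a leader arm."""

    def get_action(self) -> list[float]: ...


def collect_episode(robot: Robot, teleop: Teleoperator, num_steps: int) -> list[dict[str, Any]]:
    """Generic collection loop: read the observation, read the command, apply it, log both."""
    frames = []
    for _ in range(num_steps):
        observation = robot.get_observation()
        action = teleop.get_action()
        robot.send_action(action)
        frames.append({"observation": observation, "action": action})
    return frames


class EchoRobot:
    """Fake robot whose state is simply the last action it received."""

    def __init__(self, num_joints: int) -> None:
        self.state = [0.0] * num_joints

    def get_observation(self) -> dict[str, Any]:
        return {"joint_positions": list(self.state)}

    def send_action(self, action: list[float]) -> None:
        self.state = list(action)


class ScriptedTeleop:
    """Fake teleop device that replays a fixed list of actions."""

    def __init__(self, actions: list[list[float]]) -> None:
        self._actions = iter(actions)

    def get_action(self) -> list[float]:
        return next(self._actions)
```

Because the loop depends only on the two small interfaces, any robot or teleop device that provides these methods could be plugged in without changing the collection logic.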

> Also yes. You can PR your conversion file into examples/port_datasets.
> You already have an example in that folder and we plan to add more.
> The idea is to use the new dataset methods introduced in #461: .add_frame(), .add_episode(), .consolidate() and .push_to_hub(). The push_dataset_to_hub scripts are deprecated, but there's probably a lot of code you already wrote that you can keep (especially for reading the Gello format).

Great, will do! For now my conversion file is still on dataset v1, but I hope to find some time to update it to v2 soon.

PS. Is there a public roadmap? Curious to see what is planned in the (near) future.
