Simplify configs #550
Conversation
lerobot/configs/default.py (outdated diff)
    "takes precedence.",
)
# Use the checkpoint config instead of the provided config (but keep `resume` parameter).
self = checkpoint_cfg
pretty sure this doesn't do what you want?
Indeed, that doesn't work at all. I will handle that part once I'm done with the rest of the config (I'm on the policies right now, which is quite a big chunk).
Thanks for the heads up!
Hello @aliberts,
I noticed a few potential typos while reading examples/4_train_policy_with_script.md:
- equiped → equipped (line 2)
- dictionnaries → dictionaries (line 26)
- exemple → example (line 45)
I hope this is helpful!
Co-authored-by: HUANG TZU-CHUN <[email protected]>
A thorough shallow review
Co-authored-by: Remi <[email protected]>
…ggingface/lerobot into user/aliberts/2024_11_30_remove_hydra
Co-authored-by: Remi <[email protected]> Co-authored-by: HUANG TZU-CHUN <[email protected]>
Blockers
- `control_robot.py`, `visualize_image_transforms.py` -> Remove config yaml for robot devices #594
- `HubMixin`, `PreTrainedConfig` / `PreTrainedPolicy`
What this does
This PR removes Hydra in favor of Draccus.
This brings significant changes to the codebase regarding how the configurations are built, saved, loaded, and used. Most of the commands previously used won't work anymore, but hopefully you'll only need minimal changes to most of them to make them work again.
Overview
Configurations are now defined in the code through dataclasses rather than in yaml files. The two main configuration classes are `TrainPipelineConfig` and `EvalPipelineConfig`. As with the yaml files previously, their code is heavily commented and is meant to be read in order to understand the options and see the default values.

Reading the updated examples/4_train_policy_with_script.md is a great way to get an overview of how this new config system works.
We've updated the following scripts, for which commands used before won't work:
lerobot/scripts/train.py
lerobot/scripts/eval.py
lerobot/scripts/control_robot.py
lerobot/scripts/visualize_image_transforms.py
Here are a few examples of commands before/after the changes.
Training Diffusion Policy on PushT - before
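Roughly, the old Hydra-style command looked like this (a sketch; the exact override names from the previous yaml configs may differ):

```bash
python lerobot/scripts/train.py \
    policy=diffusion \
    env=pusht
```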
Training Diffusion Policy on PushT - after
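And a sketch of the equivalent command with the new parser (flag names such as `--policy.type` and `--env.type`, and the seed value, are assumptions consistent with the notes below):

```bash
python lerobot/scripts/train.py \
    --policy.type=diffusion \
    --env.type=pusht \
    --batch_size=64 \
    --seed=100000
```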
Few things to note:
- `--batch_size=64` now has to be specified. This is because, with the previous system, the `batch_size` value was included in `diffusion.yaml`, which was implicitly selected with `policy=diffusion`. Now, the default `batch_size` is `8` and is independent of policy selection. Same idea for `--seed` here.
- The `.type` suffix selects which policy/env config to use. Read more about this here.
- Arguments now use the `--` prefix.

Evaluating ACT on Aloha Transfer Cube - before
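A sketch of the old eval invocation, assuming the previous `-p` shorthand for the pretrained policy and Hydra-style overrides (exact names may differ):

```bash
python lerobot/scripts/eval.py \
    -p lerobot/act_aloha_sim_transfer_cube_human \
    eval.n_episodes=10
```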
Evaluating ACT on Aloha Transfer Cube - after
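And a sketch of the new equivalent; `--policy.path` is described above, while `--env.type`, `--env.task`, and `--eval.n_episodes` are assumptions:

```bash
python lerobot/scripts/eval.py \
    --policy.path=lerobot/act_aloha_sim_transfer_cube_human \
    --env.type=aloha \
    --env.task=AlohaTransferCube-v0 \
    --eval.n_episodes=10
```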
Running inference of a pretrained model on a SO-100 robot - before
Running inference of a pretrained model on a SO-100 robot - after
Note that for the following scripts, we didn't update the argument parsing, as we didn't feel they needed it. These are mostly unaffected by the changes, and the commands that worked before should still work with them:
lerobot/scripts/visualize_dataset.py
lerobot/scripts/visualize_dataset_html.py
lerobot/common/robot_devices/cameras/intelrealsense.py
lerobot/common/robot_devices/cameras/opencv.py
lerobot/scripts/configure_motor.py
Motivation
Our previous system for configurations had several limitations:
Changes
- Validation of the configs is done in the `__post_init__` of these classes.
- A `@parser.wrap()` decorator, similar to `@draccus.wrap()`, to preprocess command line arguments and enable `.path` arguments (for policies only for now, e.g. `--policy.path`).
- `HubMixin`: a custom implementation of `huggingface_hub.ModelHubMixin` to better fit our needs (mostly, being able to serialize/deserialize using Draccus).
- `PreTrainedConfig`, which policy config classes inherit from. Inspired by `transformers.PretrainedConfig`, this class will now manage a few common things amongst policy configs and will harmonize their interface.
- `PreTrainedPolicy`, which policy classes inherit from.
- Removed the `Policy` Protocol in favor of directly using `PreTrainedPolicy`.
- `parse_features_from_dataset` and `parse_features_from_env`.
- `OptimizerConfig` and `LRSchedulerConfig`, which create (through the `build` method) standard optimizers and schedulers from `torch.optim` or `diffusers.optimization` whenever possible, and custom implementations when needed.
- A `use_policy_training_preset` option (true by default) in the training config to allow for selecting an optimizer/scheduler preset that comes with each policy config. Additionally, each `PreTrainedPolicy` implements `get_optim_params()`, which returns a dict of parameters specific to that policy to be used by the optimizer (this is only used when `use_policy_training_preset` is true). This addresses the issues discussed in Move function `make_optimizer_and_scheduler` to policy #401.
- The `last` symlink in checkpoints now points to the last checkpoint with a relative path (the path was absolute before), which makes it easier to move things around.

TODO in future PR:
- `control_sim_robot.py` -> we won't do it in this PR

How it was tested
This PR allows re-enabling a number of tests, including integration tests. Some datasets are still pulled from the hub by the tests, but far fewer, since we can now select a single episode with datasets v2.
We will refactor the tests in a further PR to make them easier to write/maintain/scale. Notably, we can now remove most of `tests/data` (we'll still keep backwards-compatibility test artifacts, but that's okay since they're lightweight).

How to checkout & try? (for the reviewer)
Try out the new version of examples/4_train_policy_with_script.md
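If helpful, a generic way to fetch this PR and try it locally (the local branch name is arbitrary, and the editable install assumes the usual from-source setup):

```bash
# Fetch the head of PR #550 into a local branch and switch to it
git fetch origin pull/550/head:simplify-configs
git checkout simplify-configs

# Reinstall in editable mode so the new config code and entry points are picked up
pip install -e .
```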