
Additional fix while retraining policies #629

Conversation

@Cadene (Collaborator) commented Jan 11, 2025

What this does

  • Retrain policies

How it was tested

ACT aloha insertion

train

python lerobot/scripts/train.py \
--policy.type=act \
--dataset.repo_id=lerobot/aloha_sim_insertion_human \
--env.type=aloha \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/1mfzmkyg?nw=nwuserrcadene

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-09/17-59-06_aloha_act/checkpoints/last/pretrained_model \
--env.type=aloha \
--env.task=AlohaInsertion-v0 \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false

ACT aloha transfer cube

train

python lerobot/scripts/train.py \
--policy.type=act \
--dataset.repo_id=lerobot/aloha_sim_transfer_cube_human \
--env.type=aloha \
--env.task=AlohaTransferCube-v0 \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/neuu3olc?nw=nwuserrcadene

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-10/11-41-03_aloha_act/checkpoints/last/pretrained_model \
--output_dir=outputs/train/2025-01-10/11-41-03_aloha_act/full_eval/last \
--env.type=aloha \
--env.task=AlohaTransferCube-v0 \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 212.4, 'avg_max_reward': 3.38, 'pc_success': 76.0, 'eval_s': 86.73920726776123, 'eval_ep_s': 1.7347841548919678}
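The timing fields in these result dicts are related by a simple division; a quick sanity check in Python, using the numbers reported above (field names taken from the output, not from lerobot's internals):

```python
# Sanity check: eval_ep_s is the wall-clock eval time divided by the
# number of episodes (50, matching --eval.n_episodes=50 above).
result = {
    "avg_sum_reward": 212.4,
    "avg_max_reward": 3.38,
    "pc_success": 76.0,
    "eval_s": 86.73920726776123,
    "eval_ep_s": 1.7347841548919678,
}
n_episodes = 50

assert abs(result["eval_s"] / n_episodes - result["eval_ep_s"]) < 1e-6
# pc_success is a percentage: 76% of 50 episodes is 38 successes.
assert abs(result["pc_success"] / 100 * n_episodes - 38.0) < 1e-9
```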

Diffusion pusht

train

python lerobot/scripts/train.py \
--policy.type=diffusion \
--dataset.repo_id=lerobot/pusht \
--seed=100000 \
--env.type=pusht \
--batch_size=64 \
--offline.steps=200000 \
--eval_freq=25000 \
--save_freq=25000 \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/7yovun9s

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/15-12-08_pusht_diffusion/checkpoints/200000/pretrained_model \
--output_dir=outputs/train/2025-01-11/15-12-08_pusht_diffusion/full_eval/200000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 121.85938595512995, 'avg_max_reward': 0.9644504711735705, 'pc_success': 56.00000000000001, 'eval_s': 47.386802196502686, 'eval_ep_s': 0.9477360534667969}
python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/15-12-08_pusht_diffusion/checkpoints/100000/pretrained_model \
--output_dir=outputs/train/2025-01-11/15-12-08_pusht_diffusion/full_eval/100000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 113.42846335694817, 'avg_max_reward': 0.9828476584918505, 'pc_success': 78.0, 'eval_s': 47.40688681602478, 'eval_ep_s': 0.9481377410888672}
python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/15-12-08_pusht_diffusion/checkpoints/050000/pretrained_model \
--output_dir=outputs/train/2025-01-11/15-12-08_pusht_diffusion/full_eval/050000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false

TDMPC xarm

train

python lerobot/scripts/train.py \
--policy.type=tdmpc \
--dataset.repo_id=lerobot/xarm_lift_medium \
--seed=1 \
--env.type=xarm \
--batch_size=256 \
--offline.steps=200000 \
--online.steps=50000 \
--online.env_seed=10000 \
--online.buffer_capacity=80000 \
--online.steps_between_rollouts=50 \
--eval_freq=5000 \
--save_freq=10000 \
--log_freq=100 \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/65b0rxz7

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/15-12-08_pusht_diffusion/checkpoints/last/pretrained_model \
--output_dir=outputs/train/2025-01-11/15-12-08_pusht_diffusion/full_eval/last \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false

VQBeT pusht

train

python lerobot/scripts/train.py \
--policy.type=vqbet \
--dataset.repo_id=lerobot/pusht \
--seed=100000 \
--env.type=pusht \
--batch_size=64 \
--offline.steps=250000 \
--eval_freq=25000 \
--save_freq=25000 \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/sgkstbls

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/18-03-47_pusht_vqbet/checkpoints/250000/pretrained_model \
--output_dir=outputs/train/2025-01-11/18-03-47_pusht_vqbet/full_eval/250000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 96.32497890276665, 'avg_max_reward': 0.7956230464645369, 'pc_success': 46.0, 'eval_s': 27.269179582595825, 'eval_ep_s': 0.5453836011886597}
python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/18-03-47_pusht_vqbet/checkpoints/100000/pretrained_model \
--output_dir=outputs/train/2025-01-11/18-03-47_pusht_vqbet/full_eval/100000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 97.06195423096551, 'avg_max_reward': 0.8539270621245656, 'pc_success': 52.0, 'eval_s': 27.543201208114624, 'eval_ep_s': 0.5508640289306641}
python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/18-03-47_pusht_vqbet/checkpoints/150000/pretrained_model \
--output_dir=outputs/train/2025-01-11/18-03-47_pusht_vqbet/full_eval/150000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 113.66729212298688, 'avg_max_reward': 0.844645479041044, 'pc_success': 44.0, 'eval_s': 26.88631582260132, 'eval_ep_s': 0.5377263259887696}

@Cadene changed the title from "Additional fix" to "Additional fix while retraining policies" Jan 11, 2025
@@ -121,7 +121,7 @@ def __init__(self, cfg: TrainPipelineConfig):
         notes=cfg.wandb.notes,
         tags=cfg_to_group(cfg, return_list=True),
         dir=self.log_dir,
-        config=OmegaConf.to_container(cfg, resolve=True),
+        config=draccus.encode(cfg),
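The swap above serves one purpose: flattening the (now dataclass-based) config into plain dicts that wandb.init(config=...) accepts, the role OmegaConf.to_container played for the old Hydra configs. A minimal sketch of the idea, with illustrative stand-in dataclasses (not lerobot's actual config classes) and stdlib dataclasses.asdict standing in for draccus.encode, which additionally handles enums, paths, and choice types:

```python
from dataclasses import asdict, dataclass, field


@dataclass
class WandBConfig:  # stand-in, not lerobot's real config
    enable: bool = True
    notes: str = None


@dataclass
class TrainPipelineConfig:  # stand-in, not lerobot's real config
    seed: int = 1000
    wandb: WandBConfig = field(default_factory=WandBConfig)


cfg = TrainPipelineConfig(seed=100000)

# Like draccus.encode(cfg): recursively convert the dataclass tree
# into plain dicts, safe to pass to wandb.init(config=...).
config_dict = asdict(cfg)
assert config_dict == {"seed": 100000, "wandb": {"enable": True, "notes": None}}
```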
Review comment from @Cadene (author):

TODO: remove

@Cadene requested a review from aliberts January 11, 2025 17:11
@aliberts (Collaborator) left a comment:

LGTM, thanks!

"VISUAL": NormalizationMode.MEAN_STD,
"VISUAL": NormalizationMode.IDENTITY,

Note: although input_normalization_modes was mean_std in the vqbet.yaml config, that was a hack to avoid normalizing images by dataset statistics (the normalization values were hard-coded to 0.5).
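For context on the two modes in this diff, here is a sketch of their standard definitions (mode names from the diff; the arithmetic is the textbook definition, not copied from lerobot's implementation). With mean and std both hard-coded to 0.5, MEAN_STD reduces to the fixed affine map 2x - 1 rather than a dataset-statistics normalization, which is what IDENTITY now makes explicit:

```python
# Illustrative definitions of the two normalization modes named in
# the diff above (not lerobot's actual implementation).

def normalize_mean_std(x: float, mean: float, std: float) -> float:
    return (x - mean) / std


def normalize_identity(x: float) -> float:
    return x  # IDENTITY: leave pixel values untouched


# With the hard-coded mean=std=0.5, MEAN_STD is the fixed rescale
# [0, 1] -> [-1, 1] (i.e. 2x - 1), not a per-dataset normalization.
assert normalize_mean_std(0.75, 0.5, 0.5) == 2 * 0.75 - 1
assert normalize_identity(0.75) == 0.75
```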

@aliberts marked this pull request as ready for review January 24, 2025 13:11
@aliberts merged commit 4e7c4dd into user/aliberts/2024_11_30_remove_hydra Jan 24, 2025
1 check passed
@aliberts deleted the user/rcadene/2025_01_11_remove_hydra_rl branch January 24, 2025 13:12
aliberts pushed a commit that referenced this pull request Feb 9, 2025
[Fix] Move back to manual calibration (#488)

feat: enable to use multiple rgb encoders per camera in diffusion policy (#484)

Co-authored-by: Alexander Soare <[email protected]>

Fix config file (#495)

fix: broken images and a few minor typos in README (#499)

Signed-off-by: ivelin <[email protected]>

Add support for Windows (#494)

bug causes error uploading to huggingface, unicode issue on windows. (#450)

Add distinction between two unallowed cases in name check "eval_" (#489)

WIP

Fix autocalib moss (#486)

Rename deprecated argument (temporal_ensemble_momentum) (#490)

Dataset v2.0 (#461)

Co-authored-by: Remi <[email protected]>

Refactor OpenX (#505)

Fix missing local_files_only in record/replay (#540)

Co-authored-by: Simon Alibert <[email protected]>

Control simulated robot with real leader (#514)

Co-authored-by: Remi <[email protected]>

Update 7_get_started_with_real_robot.md (#559)

LerobotDataset pushable to HF from any folder (#563)

Fix example 6 (#572)

fixing typo from 'teloperation' to 'teleoperation' (#566)

[vizualizer] for LeRobodDataset V2 (#576)

Fix broken `create_lerobot_dataset_card`  (#590)

Update README.md (#612)

Add draccus, create MainConfig

WIP refactor train.py and ACT

Add policies training presets

Update diffusion policy

Add pusht and xarm env configs

Update tdmpc

Update vqbet

Fix poetry relax

Add feature types to envs

Add EvalPipelineConfig, parse features from envs

Add custom parser

Update pretrained loading mechanisms

Add dependency fixes & lock update

Fix pretrained_path

Refactor envs, remove RealEnv

Fix typo

Enable end-to-end tests

Fix Makefile

Log eval config

Fix end-to-end tests

Fix Quality workflow (#622)

Remove amp & add resume test

Speed-up tests

Fix poetry relax

Remove config yaml for robot devices (#594)

Co-authored-by: Simon Alibert <[email protected]>

fix(docs): typos in benchmark readme.md (#614)

Co-authored-by: Simon Alibert <[email protected]>

fix(visualise): use correct language description for each episode id (#604)

Co-authored-by: Simon Alibert <[email protected]>

typo fix: batch_convert_dataset_v1_to_v2.py (#615)

Co-authored-by: Simon Alibert <[email protected]>

[viz] Fixes & updates to html visualizer (#617)

Fix logger

Remove hydra-core

Add aggregate_stats

Add estimate_num_samples for images, Add test image

Remove NoneSchedulerConfig

Add push_pretrained

Remove eval.episode_length

Fix wandb_video

Fix typo

Add features back into policy configs (#643)

fixes to SO-100 readme (#600)

Co-authored-by: Philip Fung <no@one>
Co-authored-by: Simon Alibert <[email protected]>

Fix for the issue #638 (#639)

Fix env_to_policy_features call

Fix wandb init

remove omegaconf

Add branch arg

Move deprecated

Move training config

Remove pathable_args

Implement custom HubMixin

Fixes

Implement PreTrainedPolicy base class

Add HubMixin to TrainPipelineConfig

Udpate example 2 & 3

Update push_pretrained

Bump`rerun-sdk` dependency to `0.21.0` (#618)

Co-authored-by: Simon Alibert <[email protected]>

Fix config_class

Fix from_pretrained kwargs

Remove policy_protocol

Camelize PretrainedConfig

Additional fix while retraining policies (#629)

Co-authored-by: Simon Alibert <[email protected]>

Actually reactivate tdmpc online test

Update example 4

Remove advanced example 1

Remove example 5

Move example 6 to advanced

Use HubMixin.save_pretrained

Enable config_path to be a repo_id

Dry has_method

Update example 4

Update README

Cleanup pyproject.toml

Update eval docstring

Update README

Clean example 4

Update README

Make 'last' checkpoint symlink relative

Fix cluster image (#653)

Simplify example 4

fix stats per episodes and aggregate stats and casting to tensor