
Additional fix while retraining policies #629

Conversation

@Cadene (Collaborator) commented Jan 11, 2025

What this does

  • Retrain policies

How it was tested

ACT aloha insertion

train

python lerobot/scripts/train.py \
--policy.type=act \
--dataset.repo_id=lerobot/aloha_sim_insertion_human \
--env.type=aloha \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/1mfzmkyg?nw=nwuserrcadene

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-09/17-59-06_aloha_act/checkpoints/last/pretrained_model \
--env.type=aloha \
--env.task=AlohaInsertion-v0 \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false

ACT aloha transfer cube

train

python lerobot/scripts/train.py \
--policy.type=act \
--dataset.repo_id=lerobot/aloha_sim_transfer_cube_human \
--env.type=aloha \
--env.task=AlohaTransferCube-v0 \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/neuu3olc?nw=nwuserrcadene

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-10/11-41-03_aloha_act/checkpoints/last/pretrained_model \
--output_dir=outputs/train/2025-01-10/11-41-03_aloha_act/full_eval/last \
--env.type=aloha \
--env.task=AlohaTransferCube-v0 \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 212.4, 'avg_max_reward': 3.38, 'pc_success': 76.0, 'eval_s': 86.73920726776123, 'eval_ep_s': 1.7347841548919678}
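The timing fields in these result dicts are related by a simple division; a quick sanity check in Python, using the numbers reported above (field names taken from the output, not from lerobot's internals):

```python
# Sanity check: eval_ep_s is the wall-clock eval time divided by the
# number of episodes (50, matching --eval.n_episodes=50 above).
result = {
    "avg_sum_reward": 212.4,
    "avg_max_reward": 3.38,
    "pc_success": 76.0,
    "eval_s": 86.73920726776123,
    "eval_ep_s": 1.7347841548919678,
}
n_episodes = 50

assert abs(result["eval_s"] / n_episodes - result["eval_ep_s"]) < 1e-6
# pc_success is a percentage: 76% of 50 episodes is 38 successes.
assert abs(result["pc_success"] / 100 * n_episodes - 38.0) < 1e-9
```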

Diffusion pusht

train

python lerobot/scripts/train.py \
--policy.type=diffusion \
--dataset.repo_id=lerobot/pusht \
--seed=100000 \
--env.type=pusht \
--batch_size=64 \
--offline.steps=200000 \
--eval_freq=25000 \
--save_freq=25000 \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/7yovun9s

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/15-12-08_pusht_diffusion/checkpoints/200000/pretrained_model \
--output_dir=outputs/train/2025-01-11/15-12-08_pusht_diffusion/full_eval/200000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 121.85938595512995, 'avg_max_reward': 0.9644504711735705, 'pc_success': 56.00000000000001, 'eval_s': 47.386802196502686, 'eval_ep_s': 0.9477360534667969}
python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/15-12-08_pusht_diffusion/checkpoints/100000/pretrained_model \
--output_dir=outputs/train/2025-01-11/15-12-08_pusht_diffusion/full_eval/100000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 113.42846335694817, 'avg_max_reward': 0.9828476584918505, 'pc_success': 78.0, 'eval_s': 47.40688681602478, 'eval_ep_s': 0.9481377410888672}
python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/15-12-08_pusht_diffusion/checkpoints/050000/pretrained_model \
--output_dir=outputs/train/2025-01-11/15-12-08_pusht_diffusion/full_eval/050000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false

TDMPC xarm

train

python lerobot/scripts/train.py \
--policy.type=tdmpc \
--dataset.repo_id=lerobot/xarm_lift_medium \
--seed=1 \
--env.type=xarm \
--batch_size=256 \
--offline.steps=200000 \
--online.steps=50000 \
--online.env_seed=10000 \
--online.buffer_capacity=80000 \
--online.steps_between_rollouts=50 \
--eval_freq=5000 \
--save_freq=10000 \
--log_freq=100 \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/65b0rxz7

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/15-12-08_pusht_diffusion/checkpoints/last/pretrained_model \
--output_dir=outputs/train/2025-01-11/15-12-08_pusht_diffusion/full_eval/last \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false

VQBeT pusht

train

python lerobot/scripts/train.py \
--policy.type=vqbet \
--dataset.repo_id=lerobot/pusht \
--seed=100000 \
--env.type=pusht \
--batch_size=64 \
--offline.steps=250000 \
--eval_freq=25000 \
--save_freq=25000 \
--wandb.enable=true

https://wandb.ai/rcadene/lerobot/runs/sgkstbls

eval

python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/18-03-47_pusht_vqbet/checkpoints/250000/pretrained_model \
--output_dir=outputs/train/2025-01-11/18-03-47_pusht_vqbet/full_eval/250000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 96.32497890276665, 'avg_max_reward': 0.7956230464645369, 'pc_success': 46.0, 'eval_s': 27.269179582595825, 'eval_ep_s': 0.5453836011886597}
python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/18-03-47_pusht_vqbet/checkpoints/100000/pretrained_model \
--output_dir=outputs/train/2025-01-11/18-03-47_pusht_vqbet/full_eval/100000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 97.06195423096551, 'avg_max_reward': 0.8539270621245656, 'pc_success': 52.0, 'eval_s': 27.543201208114624, 'eval_ep_s': 0.5508640289306641}
python lerobot/scripts/eval.py \
--policy.path=outputs/train/2025-01-11/18-03-47_pusht_vqbet/checkpoints/150000/pretrained_model \
--output_dir=outputs/train/2025-01-11/18-03-47_pusht_vqbet/full_eval/150000 \
--env.type=pusht \
--eval.n_episodes=50 \
--eval.batch_size=50 \
--device=cuda \
--use_amp=false
{'avg_sum_reward': 113.66729212298688, 'avg_max_reward': 0.844645479041044, 'pc_success': 44.0, 'eval_s': 26.88631582260132, 'eval_ep_s': 0.5377263259887696}

@Cadene changed the title from "Additional fix" to "Additional fix while retraining policies" Jan 11, 2025
@@ -121,7 +121,7 @@ def __init__(self, cfg: TrainPipelineConfig):
         notes=cfg.wandb.notes,
         tags=cfg_to_group(cfg, return_list=True),
         dir=self.log_dir,
-        config=OmegaConf.to_container(cfg, resolve=True),
+        config=draccus.encode(cfg),
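The swap above serves one purpose: flattening the (now dataclass-based) config into plain dicts that wandb.init(config=...) accepts, the role OmegaConf.to_container played for the old Hydra configs. A minimal sketch of the idea, with illustrative stand-in dataclasses (not lerobot's actual config classes) and stdlib dataclasses.asdict standing in for draccus.encode, which additionally handles enums, paths, and choice types:

```python
from dataclasses import asdict, dataclass, field


@dataclass
class WandBConfig:  # stand-in, not lerobot's real config
    enable: bool = True
    notes: str = None


@dataclass
class TrainPipelineConfig:  # stand-in, not lerobot's real config
    seed: int = 1000
    wandb: WandBConfig = field(default_factory=WandBConfig)


cfg = TrainPipelineConfig(seed=100000)

# Like draccus.encode(cfg): recursively convert the dataclass tree
# into plain dicts, safe to pass to wandb.init(config=...).
config_dict = asdict(cfg)
assert config_dict == {"seed": 100000, "wandb": {"enable": True, "notes": None}}
```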
Review comment from @Cadene (author):

TODO: remove

@Cadene requested a review from aliberts January 11, 2025 17:11
@aliberts (Collaborator) left a comment:

LGTM, thanks!

"VISUAL": NormalizationMode.MEAN_STD,
"VISUAL": NormalizationMode.IDENTITY,

Note: although input_normalization_modes was mean_std in the vqbet.yaml config, that was a hack to avoid normalizing images by dataset statistics (the normalization values were hard-coded to 0.5).
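For context on the two modes in this diff, here is a sketch of their standard definitions (mode names from the diff; the arithmetic is the textbook definition, not copied from lerobot's implementation). With mean and std both hard-coded to 0.5, MEAN_STD reduces to the fixed affine map 2x - 1 rather than a dataset-statistics normalization, which is what IDENTITY now makes explicit:

```python
# Illustrative definitions of the two normalization modes named in
# the diff above (not lerobot's actual implementation).

def normalize_mean_std(x: float, mean: float, std: float) -> float:
    return (x - mean) / std


def normalize_identity(x: float) -> float:
    return x  # IDENTITY: leave pixel values untouched


# With the hard-coded mean=std=0.5, MEAN_STD is the fixed rescale
# [0, 1] -> [-1, 1] (i.e. 2x - 1), not a per-dataset normalization.
assert normalize_mean_std(0.75, 0.5, 0.5) == 2 * 0.75 - 1
assert normalize_identity(0.75) == 0.75
```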

@aliberts marked this pull request as ready for review January 24, 2025 13:11
@aliberts merged commit 4e7c4dd into user/aliberts/2024_11_30_remove_hydra Jan 24, 2025
1 check passed
@aliberts deleted the user/rcadene/2025_01_11_remove_hydra_rl branch January 24, 2025 13:12
aliberts pushed a commit that referenced this pull request Feb 9, 2025
[Fix] Move back to manual calibration (#488)

feat: enable to use multiple rgb encoders per camera in diffusion policy (#484)

Co-authored-by: Alexander Soare <[email protected]>

Fix config file (#495)

fix: broken images and a few minor typos in README (#499)

Signed-off-by: ivelin <[email protected]>

Add support for Windows (#494)

bug causes error uploading to huggingface, unicode issue on windows. (#450)

Add distinction between two unallowed cases in name check "eval_" (#489)

WIP

Fix autocalib moss (#486)

Rename deprecated argument (temporal_ensemble_momentum) (#490)

Dataset v2.0 (#461)

Co-authored-by: Remi <[email protected]>

Refactor OpenX (#505)

Fix missing local_files_only in record/replay (#540)

Co-authored-by: Simon Alibert <[email protected]>

Control simulated robot with real leader (#514)

Co-authored-by: Remi <[email protected]>

Update 7_get_started_with_real_robot.md (#559)

LerobotDataset pushable to HF from any folder (#563)

Fix example 6 (#572)

fixing typo from 'teloperation' to 'teleoperation' (#566)

[vizualizer] for LeRobodDataset V2 (#576)

Fix broken `create_lerobot_dataset_card`  (#590)

Update README.md (#612)

Add draccus, create MainConfig

WIP refactor train.py and ACT

Add policies training presets

Update diffusion policy

Add pusht and xarm env configs

Update tdmpc

Update vqbet

Fix poetry relax

Add feature types to envs

Add EvalPipelineConfig, parse features from envs

Add custom parser

Update pretrained loading mechanisms

Add dependency fixes & lock update

Fix pretrained_path

Refactor envs, remove RealEnv

Fix typo

Enable end-to-end tests

Fix Makefile

Log eval config

Fix end-to-end tests

Fix Quality workflow (#622)

Remove amp & add resume test

Speed-up tests

Fix poetry relax

Remove config yaml for robot devices (#594)

Co-authored-by: Simon Alibert <[email protected]>

fix(docs): typos in benchmark readme.md (#614)

Co-authored-by: Simon Alibert <[email protected]>

fix(visualise): use correct language description for each episode id (#604)

Co-authored-by: Simon Alibert <[email protected]>

typo fix: batch_convert_dataset_v1_to_v2.py (#615)

Co-authored-by: Simon Alibert <[email protected]>

[viz] Fixes & updates to html visualizer (#617)

Fix logger

Remove hydra-core

Add aggregate_stats

Add estimate_num_samples for images, Add test image

Remove NoneSchedulerConfig

Add push_pretrained

Remove eval.episode_length

Fix wandb_video

Fix typo

Add features back into policy configs (#643)

fixes to SO-100 readme (#600)

Co-authored-by: Philip Fung <no@one>
Co-authored-by: Simon Alibert <[email protected]>

Fix for the issue #638 (#639)

Fix env_to_policy_features call

Fix wandb init

remove omegaconf

Add branch arg

Move deprecated

Move training config

Remove pathable_args

Implement custom HubMixin

Fixes

Implement PreTrainedPolicy base class

Add HubMixin to TrainPipelineConfig

Udpate example 2 & 3

Update push_pretrained

Bump`rerun-sdk` dependency to `0.21.0` (#618)

Co-authored-by: Simon Alibert <[email protected]>

Fix config_class

Fix from_pretrained kwargs

Remove policy_protocol

Camelize PretrainedConfig

Additional fix while retraining policies (#629)

Co-authored-by: Simon Alibert <[email protected]>

Actually reactivate tdmpc online test

Update example 4

Remove advanced example 1

Remove example 5

Move example 6 to advanced

Use HubMixin.save_pretrained

Enable config_path to be a repo_id

Dry has_method

Update example 4

Update README

Cleanup pyproject.toml

Update eval docstring

Update README

Clean example 4

Update README

Make 'last' checkpoint symlink relative

Fix cluster image (#653)

Simplify example 4

fix stats per episodes and aggregate stats and casting to tensor