Policy Transfer across Visual and Dynamics Domain Gaps via Iterative Grounding (RSS 2021)


Grace Zhang, Linghan Zhong, Youngwoon Lee, Joseph J. Lim at USC CLVR lab
[Project website] [Paper] [arXiv]

This project is a PyTorch implementation of Policy Transfer across Visual and Dynamics Domain Gaps via Iterative Grounding (RSS 2021).

The ability to transfer a policy from one environment to another is a promising avenue for efficient robot learning in realistic settings where task supervision is not available. To succeed, such policy transfer must overcome both the visual and dynamics domain gaps between the source and target environments. We propose IDAPT, a novel policy transfer method with iterative environment grounding that alternates between (1) directly minimizing both visual and dynamics domain gaps by grounding the source environment in the target environment, and (2) training a policy on the grounded source environment. Empirical results on locomotion and robotic manipulation tasks demonstrate that our method can effectively transfer a policy across large domain gaps with minimal interaction with the target environment.
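The alternation between steps (1) and (2) described above can be sketched as a simple loop. This is a toy illustration under stated assumptions, not the training code in this repository: `iterative_grounding`, `collect_target_data`, `fit_grounding`, and `train_policy` are hypothetical names, the "policy" is reduced to a single scalar action, and the sketch assumes target data is gathered by running the current policy.

```python
from statistics import mean
from typing import Callable, List, Tuple

# Every name here is a hypothetical placeholder for illustration;
# none of them are functions from this repository.
Grounding = Callable[[float], float]


def iterative_grounding(
    collect_target_data: Callable[[float], List[float]],
    fit_grounding: Callable[[List[float]], Grounding],
    train_policy: Callable[[Grounding, float], float],
    policy: float,
    num_iterations: int,
) -> Tuple[float, Grounding]:
    """Alternate between grounding the source environment and training the policy."""
    data: List[float] = []
    grounding: Grounding = lambda action: action  # identity: source not yet grounded
    for _ in range(num_iterations):
        # (1) Grounding step: gather a little target data, then refit the grounding.
        data.extend(collect_target_data(policy))
        grounding = fit_grounding(data)
        # (2) Policy step: train on the grounded source environment.
        policy = train_policy(grounding, policy)
    return policy, grounding


# Toy 1-D stand-in: the "policy" is a single action, and the target
# environment adds a constant bias to the action's effect.
TARGET_BIAS = -0.5
GOAL = 1.0


def collect_target_data(action: float) -> List[float]:
    # Observed gap between the target effect and the ungrounded source effect.
    return [(action + TARGET_BIAS) - action]


def fit_grounding(gaps: List[float]) -> Grounding:
    # Assumed grounding model: a constant offset estimated from observed gaps.
    offset = mean(gaps)
    return lambda action: action + offset


def train_policy(grounding: Grounding, action: float) -> float:
    # Closed-form "training": choose the action whose grounded effect reaches GOAL.
    return GOAL - grounding(0.0)


policy, grounding = iterative_grounding(
    collect_target_data, fit_grounding, train_policy, policy=0.0, num_iterations=3
)
```

In this toy setting the policy trained on the grounded source compensates for the target bias, so the same action also reaches the goal in the target environment.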

Prerequisites

  • Ubuntu 18.04 or above
  • Python 3.6 or above
  • MuJoCo 2.0

Directories

  • run.py: sets up experiment and runs training
  • training/: our method and baseline implementations
    • See domain randomization configuration instructions here.
  • config/: hyper-parameters
  • environments/: registers environments used for the paper (SawyerPush, FetchReach)

Installation

  1. Clone this repo.

    git clone https://github.com/clvrai/idapt.git
  2. Install python dependencies.

    pip install -r requirements.txt
  3. Set the environment variable for headless rendering.

    export PYOPENGL_PLATFORM="EGL"
  4. Download the demonstration files for one or more tasks by passing their task names (e.g. InvertedPendulum, HalfCheetah, Walker2d, FetchReach, SawyerPush).

    python download_data.py [TASK_NAME]
    # example: python download_data.py Walker2d SawyerPush
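Before launching training on a headless machine, the Python version from the prerequisites and the rendering variable from step 3 can be checked from Python. This is a minimal sketch; `check_setup` is a hypothetical helper and is not part of this repository.

```python
import os
import sys
from typing import List, Mapping, Sequence


def check_setup(
    environ: Mapping[str, str] = os.environ,
    version: Sequence[int] = sys.version_info,
) -> List[str]:
    """Return a list of setup problems; an empty list means the checks passed."""
    problems = []
    # Prerequisite from this README: Python 3.6 or above.
    if tuple(version[:2]) < (3, 6):
        problems.append("Python 3.6 or above is required")
    # Step 3 of the installation: headless rendering variable.
    if environ.get("PYOPENGL_PLATFORM") != "EGL":
        problems.append('PYOPENGL_PLATFORM is not set to "EGL"')
    return problems


for problem in check_setup():
    print("setup problem:", problem)
```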

Unity app installation

Adding --unity True to the command will automatically download the Unity app.

On headless servers, start a virtual display first (e.g., sudo /usr/bin/X :1) and pass its display id to the command (e.g., --virtual_display :1).

On macOS, if the app does not launch because the developer is not verified, go to the ./binary directory, right-click Furniture.app, and click Open once. After that, our environment can launch the app without error.

Example Commands

InvertedPendulum

  • Train ours

    python -m run --source_env=InvertedPendulum-v2 --target_env=GymInvertedPendulumDM-v0
  • Train DR-Narrow

    python -m run --source_env=GymInvertedPendulumDM-v2 --target_env=GymInvertedPendulumDM-v0 --dr=True --dr_params_set=IP_min
  • Train DR-Wide

    python -m run --source_env=GymInvertedPendulumDM-v2 --target_env=GymInvertedPendulumDM-v0 --dr=True --dr_params_set=IP_max

HalfCheetah

  • Train ours

    python -m run --source_env=HalfCheetah-v3 --target_env=GymHalfCheetahDM-v0 --data=backwards
  • Train DR-Narrow

    python -m run --source_env=HalfCheetah-v3 --target_env=GymHalfCheetahDM-v0 --dr=True --dr_params_set=HC_min
  • Train DR-Wide

    python -m run --source_env=HalfCheetah-v3 --target_env=GymHalfCheetahDM-v0 --dr=True --dr_params_set=HC_max

Walker2D

  • Train ours

    python -m run --source_env=GymWalker-v0 --target_env=GymWalkerDM-v0 --data=backwards
  • Train DR-Narrow

    python -m run --source_env=GymWalker-v0 --target_env=GymWalkerDM-v0 --dr=True --dr_params_set=WK_min
  • Train DR-Wide

    python -m run --source_env=GymWalker-v0 --target_env=GymWalkerDM-v0 --dr=True --dr_params_set=WK_max

Fetch Reach

  • Train ours

    python -m run --source_env=FetchReach-v1 --target_env=GymFetchReach-v0 --unity=True --action_rotation_degrees=45 --action_z_bias=-0.5
  • Train DR-Narrow

    python -m run --source_env=FetchReach-v1 --target_env=GymFetchReach-v0 --dr=True --dr_params_set=FR_min --action_rotation_degrees=45 --action_z_bias=-0.5
  • Train DR-Wide

    python -m run --source_env=FetchReach-v1 --target_env=GymFetchReach-v0 --dr=True --dr_params_set=FR_max --action_rotation_degrees=45 --action_z_bias=-0.5

Sawyer Push

  • Train ours

    python -m run --source_env=SawyerPushZoom-v0 --target_env=SawyerPushShiftViewZoomBackground-v0 --unity=True --target_env_puck_friction=2.0 --target_env_puck_mass=0.05
  • Train DR-Narrow

    python -m run --source_env=SawyerPushZoom-v0 --target_env=SawyerPushShiftViewZoomBackground-v0 --dr=True --dr_params_set=FR_min --action_rotation_degrees=45 --action_z_bias=-0.5
  • Train DR-Wide

    python -m run --source_env=SawyerPushZoom-v0 --target_env=SawyerPushShiftViewZoomBackground-v0 --dr=True --dr_params_set=FR_max --action_rotation_degrees=45 --action_z_bias=-0.5
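When running several of these commands in a sweep, it can help to assemble them programmatically. The sketch below only builds the command strings; `ENV_PAIRS` and `build_command` are hypothetical names introduced for illustration, and the environment ids and flags are copied from the locomotion commands above.

```python
from typing import Dict, List, Optional

# (source_env, target_env) pairs copied from the "Train ours" commands
# for three of the tasks listed above.
ENV_PAIRS = {
    "InvertedPendulum": ("InvertedPendulum-v2", "GymInvertedPendulumDM-v0"),
    "HalfCheetah": ("HalfCheetah-v3", "GymHalfCheetahDM-v0"),
    "Walker2D": ("GymWalker-v0", "GymWalkerDM-v0"),
}


def build_command(task: str, extra_flags: Optional[Dict[str, object]] = None) -> List[str]:
    """Build the argument list for one training run (hypothetical helper)."""
    source_env, target_env = ENV_PAIRS[task]
    args = [
        "python", "-m", "run",
        f"--source_env={source_env}",
        f"--target_env={target_env}",
    ]
    # Append any additional flags in the --name=value form used in this README.
    for name, value in (extra_flags or {}).items():
        args.append(f"--{name}={value}")
    return args


# Reconstructs the HalfCheetah DR-Narrow command shown above.
command = build_command("HalfCheetah", {"dr": True, "dr_params_set": "HC_min"})
print(" ".join(command))
```

The returned list can be passed directly to `subprocess.run` from the repository root if the runs should be launched from the same script.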

Citation

@inproceedings{zhang2021policy,
  title={Policy Transfer across Visual and Dynamics Domain Gaps via Iterative Grounding},
  author={Grace Zhang and Linghan Zhong and Youngwoon Lee and Joseph J. Lim},
  booktitle={Robotics: Science and Systems},
  year={2021},
  address={Virtual},
  month={July},
  DOI={10.15607/RSS.2021.XVII.006}
}
