Policy Stitching: Learning Transferable Robot Policies

Pingcheng Jian, Easop Lee, Zachary Bell, Michael M. Zavlanos, Boyuan Chen
Duke University

Project Website | Video | Paper

Overview

This repo contains the implementation for the paper Policy Stitching: Learning Transferable Robot Policies.

Citation

If you find our paper or codebase helpful, please consider citing:

@inproceedings{jian2023policy,
  title={Policy Stitching: Learning Transferable Robot Policies},
  author={Jian, Pingcheng and Lee, Easop and Bell, Zachary and Zavlanos, Michael M and Chen, Boyuan},
  booktitle={7th Annual Conference on Robot Learning},
  year={2023}
}

Installation

The development tools of this project can be installed with conda:

$ conda env create -f environment.yml

Training

Train the modular policy for the Panda robot arm in simulation from scratch

mpirun -np 7 python -u module_sac_train_panda_PS.py --env-name='PandaPush-v2' --n-epochs=200 --device cuda:0 --seed 101 --save_data --save_model
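To repeat the run above with several random seeds, the command can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper name `build_train_command` and the extra seeds 102 and 103 are hypothetical, and only flags that appear in the command above are used.

```python
import shlex


def build_train_command(script, env_name, seed, n_procs=7, n_epochs=200,
                        device="cuda:0", extra_args=()):
    """Assemble the mpirun training invocation as an argument list.

    Hypothetical helper: it only mirrors the flags shown in this README.
    """
    return [
        "mpirun", "-np", str(n_procs),
        "python", "-u", script,
        f"--env-name={env_name}",
        f"--n-epochs={n_epochs}",
        "--device", device,
        "--seed", str(seed),
        *extra_args,  # e.g. ("--control_type=joint",) for the UR5 scripts
        "--save_data", "--save_model",
    ]


# Print one command per seed; the seeds other than 101 are assumptions.
for seed in (101, 102, 103):
    cmd = build_train_command("module_sac_train_panda_PS.py", "PandaPush-v2", seed)
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument list can be passed to `subprocess.run(cmd, check=True)` to launch the runs one after another.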

Few-shot fine-tune the stitched policy for the Panda robot arm in simulation

mpirun -np 7 python -u module_sac_train_panda_PS_few_shot.py --env-name='PandaL3Push-v2' --ro-env-name PandaL3Push-v3 --ta-env-name PandaPush-v2 --n-epochs=200 --device cuda:0 --seed 101 --save_data --save_model

Train the modular policy for the UR5 robot arm in simulation from scratch

mpirun -np 7 python -u module_sac_train_ur5_PS.py --env-name='Ur5Push1' --n-epochs=200 --device cuda:0 --seed 101 --control_type='joint' --save_data --save_model

Few-shot fine-tune the stitched policy for the UR5 robot arm in simulation

mpirun -np 7 python -u module_sac_train_ur5_PS_few_shot.py --env-name='Ur5Push1' --ro-env-name Ur5Push4 --ta-env-name Ur5L5Push1 --n-epochs=200 --device cuda:0 --seed 101 --control_type='joint' --save_data --save_model

Few-shot fine-tune the stitched policy for the UR5 robot arm in the real world

mpirun -np 7 python -u module_sac_train_real_ur5_PS.py --env-name='Ur5Push1' --n-epochs=200 --device cuda:0 --seed 101 --control_type='joint' --save_data --save_model

Testing

Test the modular policy for the Panda robot arm in simulation

python module_sac_test_panda_PS.py --device cpu --env-name PandaL3Push-v2 --ro-env-name PandaL3Push-v3 --ta-env-name PandaPush-v2 --ta_seed 101 --ro_seed 101

Test the modular policy for the UR5 robot arm in simulation

python module_sac_test_ur5_PS.py --device cpu --env-name='Ur5Push1' --ro-env-name Ur5Push4 --ta-env-name Ur5L5Push1 --ta_seed 101 --ro_seed 101

Test the modular policy for the UR5 robot arm in the real world

python module_sac_test_real_ur5_PS.py --device cpu --env-name='Ur5Push1' --ro-env-name Ur5Push4 --ta-env-name Ur5L5Push1 --ta_seed 101 --ro_seed 101

Experiment setup for the Panda robot arm in simulation

Experiment setup for the UR5 robot arm in simulation and in the real world

Acknowledgement

This project draws on the GitHub repositories panda-gym, pybullet_ur5_robotiq, and hindsight-experience-replay.
