rstrudel / rlbc

Learning policies for robotic manipulation

Learning to manipulate

This repository contains the source code for the following papers:

Learning to combine primitive skills: A step towards versatile robotic manipulation

Learning to Augment Synthetic Images for Sim2Real Policy Transfer

To reproduce the paper experiments, follow the steps below.

Quickstart

Clone the repository and set the environment variables:

$ git clone https://github.com/rstrudel/rlbc.git rlbc
$ export RLBC_ROOT=$(pwd)/rlbc
$ export RLBC_MODELS=$RLBC_ROOT/models
$ export RLBC_DATA=$RLBC_ROOT/data
$ mkdir -p $RLBC_DATA $RLBC_MODELS
$ export PYTHONPATH=$PYTHONPATH:$RLBC_ROOT
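
An optional sanity check (not part of the repository) to confirm the variables and the Python path are set before moving on:

import importlib.util
import os

# The three directories were created and exported above.
for var in ("RLBC_ROOT", "RLBC_MODELS", "RLBC_DATA"):
    path = os.environ.get(var)
    assert path and os.path.isdir(path), f"{var} is unset or not a directory"

# The bc package ships with the repository; with PYTHONPATH set correctly
# it is found without installing anything (find_spec does not import it).
assert importlib.util.find_spec("bc") is not None, "bc not found on PYTHONPATH"
print("Environment looks good.")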

Install requirements:

$ virtualenv -p $(which python3.7) rlbc_env
$ source rlbc_env/bin/activate

$ cd $RLBC_ROOT
$ pip install --upgrade pip
$ pip install -r requirements.txt

Install Manipulation Imitation Environments (MImE):

$ git clone https://github.com/ikalevatykh/mime-release.git
$ cd mime-release
$ pip install -r requirements.txt
$ python setup.py develop

BC skills training

  1. Collect a dataset of expert trajectories:
python3 -m bc.collect_demos with $RLBC_ROOT/configs/rlbc/section5/1.pick_collect_demos.json
  2. Train a control policy:
python3 -m bc.train with $RLBC_ROOT/configs/rlbc/section5/2.pick_train_policy.json
  3. Evaluate the policy:
python3 -m bc.collect_demos with $RLBC_ROOT/configs/rlbc/section5/3.pick_evaluate_policy.json
  4. Render the policy

Edit $RLBC_ROOT/configs/rlbc/section5/4.pick_render_policy.json and put the best epoch number found during evaluation there; a helper sketch follows this list.

python3 -m bc.collect_demos with $RLBC_ROOT/configs/rlbc/section5/4.pick_render_policy.json
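
Editing the config by hand works; if you prefer scripting it, here is a minimal sketch (not part of the repository) that patches one value in a JSON config. The dotted key name below is an assumption borrowed from the section 6 instructions; inspect the config file for the actual field that holds the epoch.

import json

def set_config_value(path, dotted_key, value):
    """Patch a single value in a JSON experiment config."""
    with open(path) as f:
        config = json.load(f)
    node = config
    keys = dotted_key.split(".")
    for key in keys[:-1]:  # walk down nested sections
        node = node[key]
    node[keys[-1]] = value
    with open(path, "w") as f:
        json.dump(config, f, indent=2)

# Hypothetical usage: set the best epoch found in step 3.
set_config_value(
    "configs/rlbc/section5/4.pick_render_policy.json",
    "bc.bc_model_epoch",  # assumed key, mirrors the section 6 config
    42,                   # replace with your best epoch
)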

RLBC training

  1. Collect a dataset of expert skill trajectories:
python3 -m bc.collect_demos with $RLBC_ROOT/configs/rlbc/section6/1.bowl_collect_demos.json
  2. Train a skills policy:
python3 -m bc.train with $RLBC_ROOT/configs/rlbc/section6/2.bowl_train_skills.json
  3. Evaluate the skills policy:
python3 -m bc.collect_demos with $RLBC_ROOT/configs/rlbc/section6/3.bowl_evaluate_skills.json
  4. Train an RLBC policy

Edit $RLBC_ROOT/configs/rlbc/section6/4.bowl_train_rlbc.json and put the best epoch number found during evaluation there (the bc.bc_model_epoch argument). Note that to reproduce the paper results, you may need to train up to 8 policies with different random seeds (by changing the general.seed argument); a sweep sketch follows this list.

python3 -m ppo.train.run with $RLBC_ROOT/configs/rlbc/section6/4.bowl_train_rlbc.json
  5. Evaluate the RLBC policy:
python3 -m bc.collect_demos with $RLBC_ROOT/configs/rlbc/section6/5.bowl_evaluate_rlbc.json
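
A minimal sketch of the seed sweep, assuming the experiments accept Sacred-style key=value overrides after the config file (the "with <config>.json" launch syntax suggests Sacred; if overrides are not supported, edit general.seed in the config between runs instead):

import subprocess

CONFIG = "configs/rlbc/section6/4.bowl_train_rlbc.json"

# Launch one RLBC training per seed, sequentially. The general.seed
# override is an assumption about the config layout; the argument name
# matches the instructions above.
for seed in range(8):
    subprocess.run(
        ["python3", "-m", "ppo.train.run",
         "with", CONFIG, f"general.seed={seed}"],
        check=True,
    )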

Learning to Augment Synthetic Images for Sim2Real Policy Transfer

To train a policy for a real-world UR5 arm, follow the steps below.

  1. Collect 200 pairs of robot images and cube positions on a real robot

Save the dataset to $RLBC_DATA/pick_real.

  2. Collect 20000 pairs of robot images and cube positions in simulation:
python3 -m bc.collect_images with $RLBC_ROOT/configs/autoaug/1.collect_20k_images.json
  3. Pretrain a cube position estimation network on the large simulation dataset:
python3 -m bc.train with $RLBC_ROOT/configs/autoaug/2.pretrain_checkpoint.json
  4. Evaluate the epochs of the regression network:
python3 -m bc.eval_reg -n regression_checkpoint -d pick_20k

Edit $RLBC_ROOT/configs/autoaug/4.train_mcts.json and put the best epoch number found there.

  5. Train MCTS using a small simulation dataset:
python3 -m sim2real.train with $RLBC_ROOT/configs/autoaug/4.train_mcts.json

Edit sim2real.augmentation and register the path of the best augmentation found by MCTS under the name mcts_learned; a sketch of this edit follows the list.

  6. Collect expert trajectories of picking up a cube:
python3 -m bc.collect_demos with $RLBC_ROOT/configs/autoaug/5.collect_demos.json
  7. Train a control policy on the augmented expert trajectories:
python3 -m bc.train with $RLBC_ROOT/configs/autoaug/6.train_policy.json
  8. Evaluate the control policy in simulation:
python3 -m bc.collect_demos with $RLBC_ROOT/configs/autoaug/7.evaluate_policy.json
  9. Execute the best control policy epoch on a real robot
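
The sim2real.augmentation edit in step 5 might look like the following hypothetical excerpt, assuming the module keeps a name-to-path registry of augmentation pipelines (the actual structure may differ; adapt it to what you find in the file):

# sim2real/augmentation.py (hypothetical excerpt)
# Register the best augmentation found by MCTS under the name used by
# the later configs. The dictionary name and the path are assumptions;
# the path should point to the best augmentation produced by step 5.
AUGMENTATIONS = {
    "mcts_learned": "models/mcts/best_augmentation.json",
}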

Enjoy!

Citation

If you find this repository helpful, please cite our work:

@inproceedings{rlbc2020,
  author    = {Robin Strudel and Alexander Pashevich and Igor Kalevatykh and Ivan Laptev and Josef Sivic and Cordelia Schmid},
  title     = {Learning to combine primitive skills: A step towards versatile robotic manipulation},
  booktitle = {ICRA},
  year      = {2020},
}

@inproceedings{learningsim2real2019,
  author    = {Alexander Pashevich and Robin Strudel and Igor Kalevatykh and Ivan Laptev and Cordelia Schmid},
  title     = {Learning to Augment Synthetic Images for Sim2Real Policy Transfer},
  booktitle = {IROS},
  year      = {2019},
}


License: MIT

