HanzhiC / RLAfford

RLAfford: End-to-End Affordance Learning for Robotic Manipulation, ICRA 2023

Home Page: https://sites.google.com/view/rlafford/

RLAfford

Official Implementation of "RLAfford: End-to-end Affordance Learning with Reinforcement Learning" ICRA 2023

Introduction

Learning to manipulate 3D articulated objects in an interactive environment has been challenging in reinforcement learning (RL) studies. It is hard to train a policy that can generalize over different objects with vast semantic categories, diverse shape geometry, and versatile functionality.

Visual affordance provides object-centric information priors that offer actionable semantics for objects with movable parts. For example, an effective policy should know the pulling force on the handle to open a door.

Nevertheless, how to learn affordance in an end-to-end fashion within the RL process is unknown. In this study, we fill such a research gap by designing algorithms that can automatically learn affordance semantics through a contact prediction process.

The contact predictor allows the agent to learn the affordance information (i.e., where to act for the robotic arm on the object) from previous manipulation experience, and such affordance semantics then helps the agent learn effective policies through RL updates.
We apply our framework to several downstream tasks. The experimental results and analysis demonstrate the effectiveness of end-to-end affordance learning.
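To make the idea of learning "where to act" from past contacts more concrete, here is a minimal sketch of how contact positions could be turned into per-point targets for a contact predictor. It is an illustration only, not the implementation in this repository: the function name `contact_labels`, the Gaussian falloff, and the `radius` value are all assumptions chosen for the example.

```python
import math


def contact_labels(points, contacts, radius=0.05):
    """Return one score in [0, 1] per point: 1 at a contact, decaying with distance.

    points, contacts: lists of (x, y, z) tuples.
    radius: hypothetical length scale for the falloff, not a value from the paper.
    """
    labels = []
    for p in points:
        if not contacts:
            # No contact was recorded, so no point is marked as actionable.
            labels.append(0.0)
            continue
        # Distance from this point to the closest recorded contact.
        nearest = min(math.dist(p, c) for c in contacts)
        labels.append(math.exp(-((nearest / radius) ** 2)))
    return labels
```

A network trained to regress scores of this kind from the point cloud would output a map over the object surface, which is the sort of signal the policy can then consume during RL updates.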

Requirements

We tested our code with NVIDIA driver version $\geq$ 515, CUDA version $\geq$ 11.7, and Python $\geq$ 3.8. Other versions may lead to errors such as segmentation faults.
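The version requirements above can be checked before installing anything. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, and the driver and CUDA version strings are assumed to be copied by hand from the output of `nvidia-smi`.

```python
import re
import sys

# Minimum versions taken from the requirements stated above.
MIN_DRIVER = (515,)
MIN_CUDA = (11, 7)
MIN_PYTHON = (3, 8)


def parse_version(text):
    """Turn a string such as '515.65.01' into a tuple of ints: (515, 65, 1)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def meets_minimum(found, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    found = tuple(found) + (0,) * (width - len(found))
    minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return found >= minimum


def check_environment(driver, cuda, python=None):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    if python is None:
        python = tuple(sys.version_info[:3])
    problems = []
    if not meets_minimum(parse_version(driver), MIN_DRIVER):
        problems.append(f"NVIDIA driver {driver} is older than 515")
    if not meets_minimum(parse_version(cuda), MIN_CUDA):
        problems.append(f"CUDA {cuda} is older than 11.7")
    if not meets_minimum(python, MIN_PYTHON):
        problems.append("Python is older than 3.8")
    return problems
```

For example, `check_environment("515.65.01", "11.7")` returns an empty list on a Python 3.8+ interpreter, while an older driver or CUDA version adds one message per failed check.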

Some dependencies can be installed with:

pip install -r ./requirements.txt

Our framework is implemented on the Isaac Gym simulator; we used Preview Release 4. If you encounter errors while installing packages, most solutions can be found in the official docs.

Install PointNet++ manually.

cd {the dir for packages}
git clone --recursive https://github.com/erikwijmans/Pointnet2_PyTorch
cd Pointnet2_PyTorch
# [IMPORTANT] comment out these two lines of code before installing:
#   https://github.com/erikwijmans/Pointnet2_PyTorch/blob/master/pointnet2_ops_lib/pointnet2_ops/_ext-src/src/sampling_gpu.cu#L100-L101
pip install -r requirements.txt
pip install -e .
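The manual edit flagged in the snippet above has to happen before `pip install -e .` is run. One way to script it is sketched below. This helper is hypothetical and not part of either repository; it simply prefixes a range of lines with `//`. The line numbers 100-101 come from the link above and refer to the upstream master branch, so check that they still point at the intended lines before applying it.

```python
from pathlib import Path


def comment_out_lines(path, first, last, prefix="// "):
    """Prefix lines first..last (1-indexed, inclusive) with a C-style comment marker.

    Lines that already start with the marker are left alone, so the
    function can be run twice without double-commenting.
    """
    file = Path(path)
    lines = file.read_text().splitlines(keepends=True)
    if not (1 <= first <= last <= len(lines)):
        raise ValueError(f"line range {first}-{last} is outside the file")
    for i in range(first - 1, last):
        if not lines[i].lstrip().startswith(prefix.strip()):
            lines[i] = prefix + lines[i]
    file.write_text("".join(lines))


# Example call, run from inside the Pointnet2_PyTorch checkout:
# comment_out_lines(
#     "pointnet2_ops_lib/pointnet2_ops/_ext-src/src/sampling_gpu.cu", 100, 101
# )
```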

Finally, run the following to install other packages.

# make sure you are at the repository root directory
pip install -r requirements.txt
cd {the dir for packages}
git clone https://github.com/haosulab/ManiSkill-Learn.git
cd ManiSkill-Learn/
pip install -e .

Dataset Preparation

Download the dataset from Google Drive and extract it. Move the asset folder to the root of this project. The dataset includes objects from the SAPIEN dataset along with additional information that we processed. The code used to prepare the dataset is available in this GitHub repo.

The Pick and Place portion of the dataset consists of objects from different sources, so it took us some time to obtain all the licenses and approvals needed to release the data. You can now find it on another Google Drive.

Reproduce the Results

Once the dataset is ready, you can run the whole training and testing process using the commands in Experiments.md.

Draw the Pointcloud

We used Mitsuba3 to render the point clouds and affordance maps; it produces beautiful visualizations. The scripts are available in the Visualization repo.

Cite

@inproceedings{geng2023rlafford,
  title={RLAfford: End-to-End Affordance Learning for Robotic Manipulation},
  author={Geng, Yiran and An, Boshi and Geng, Haoran and Chen, Yuanpei and Yang, Yaodong and Dong, Hao},
  booktitle={2023 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={5880--5886},
  year={2023},
  organization={IEEE}
}

Feel free to contact hao.dong@pku.edu.cn for collaboration.
