imitation-learning reinforcement-learning robotics icra2021 inverse-reinforcement-learning self-attention transformer

Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model

[ 📺 Website | 🏗 Github Repo | 🎓 Paper ]

Dependencies

Gym >= 0.8.1
Mujoco-py >= 0.5.7
Tensorflow >= 1.0.1

Mujoco

Add the following lines to ~/.bashrc:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco200/bin
export LD_LIBRARY_PATH=$HOME/.mujoco/mjpro150/bin:$LD_LIBRARY_PATH

Follow the instructions here to install mujoco_py v1.5.
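To confirm the exports took effect in a new shell, a small check like the one below can help. It is only a sketch: the two directories are copied from the export lines above, and the helper names `missing_mujoco_paths` and `report` are made up for illustration and are not part of this repository.

```python
import os

# Directories copied from the export lines above; adjust to your install.
MUJOCO_BIN_DIRS = (
    os.path.join(".mujoco", "mujoco200", "bin"),
    os.path.join(".mujoco", "mjpro150", "bin"),
)


def missing_mujoco_paths(ld_library_path, home):
    """Return the expected MuJoCo bin directories absent from ld_library_path."""
    entries = {
        os.path.normpath(p) for p in ld_library_path.split(os.pathsep) if p
    }
    expected = [os.path.normpath(os.path.join(home, d)) for d in MUJOCO_BIN_DIRS]
    return [p for p in expected if p not in entries]


def report():
    """Print every expected directory that the current shell does not export."""
    missing = missing_mujoco_paths(
        os.environ.get("LD_LIBRARY_PATH", ""), os.path.expanduser("~")
    )
    for path in missing:
        print("not on LD_LIBRARY_PATH:", path)
    return missing
```

Calling `report()` prints nothing when both directories are already on `LD_LIBRARY_PATH`. If only one MuJoCo version is installed, the other directory can be dropped from `MUJOCO_BIN_DIRS`.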

SenseAct (Optional)

SenseAct requires Python 3 (>= 3.5); all other requirements are installed automatically via pip.

On Linux and Mac OS X, run the following:
```
git clone https://github.com/kindredresearch/SenseAct.git
cd SenseAct
pip install -e .
```

How to Run

Collect demonstration data and save it to the expert_data directory.

The expert data should be a Python pickle file with a .bin suffix (not .pkl). It must contain batch_size, action, and states, which set_er_stats() requires. See expert_data/hopper_er.bin for an example.
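As a rough illustration of the file format described above, the sketch below pickles demonstrations to a .bin file and reads them back. The dict layout is an assumption: the README names the fields but not the container, so inspect expert_data/hopper_er.bin to see what set_er_stats() actually expects. The helpers `save_expert_data` and `load_expert_data` are hypothetical, and plain lists stand in for the arrays a real run would likely use.

```python
import os
import pickle
import tempfile


def save_expert_data(path, states, actions, batch_size):
    """Pickle expert demonstrations to a .bin file."""
    if not path.endswith(".bin"):
        raise ValueError("expert data file must use the .bin suffix")
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    # Assumed layout: a dict keyed by the field names the README lists.
    data = {"batch_size": batch_size, "action": actions, "states": states}
    with open(path, "wb") as f:
        pickle.dump(data, f)


def load_expert_data(path):
    """Load a pickle file written by save_expert_data."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip in a temporary directory so nothing in the repo is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_er.bin")
    save_expert_data(
        demo_path,
        states=[[0.0, 0.1], [0.2, 0.3]],
        actions=[[1.0], [-1.0]],
        batch_size=2,
    )
    loaded = load_expert_data(demo_path)
```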

Training

# Launch one background training run per environment, cycling over 4 GPUs.
COUNTER=1
ENVS=('Ant-v2')
for ENV_ID in "${ENVS[@]}"
do
  CUDA_VISIBLE_DEVICES=$((COUNTER % 4)) python main.py --env_name "$ENV_ID" --alg mairlImit --obs_mode state &
  COUNTER=$((COUNTER+1))
done
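The same round-robin GPU assignment can be sketched in Python, which may be easier to extend when sweeping several environments. The flags are taken from the command above. The helper names `build_jobs` and `launch`, and the default of 4 GPUs, are assumptions made for illustration and are not part of this repository.

```python
import os
import subprocess


def build_jobs(env_ids, num_gpus=4, first_counter=1):
    """Pair each environment with a GPU index, cycling as the shell loop does."""
    jobs = []
    for counter, env_id in enumerate(env_ids, start=first_counter):
        gpu = counter % num_gpus
        argv = [
            "python", "main.py",
            "--env_name", env_id,
            "--alg", "mairlImit",
            "--obs_mode", "state",
        ]
        jobs.append((gpu, argv))
    return jobs


def launch(jobs):
    """Start every job in the background, each pinned to its assigned GPU."""
    procs = []
    for gpu, argv in jobs:
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        procs.append(subprocess.Popen(argv, env=env))
    return procs
```

Run from the repository root, `launch(build_jobs(['Ant-v2']))` would start a single run on GPU 1, matching the loop above. Calling `wait()` on each returned process blocks until training finishes.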

Evaluation

CUDA_VISIBLE_DEVICES=0 python main.py --env_name Ant-v2 --train_mode False

Acknowledgement

Our code is based on itaicaspi/mgail, HumanCompatibleAI/imitation, and huggingface/transformers.

Reference

Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model
Jiankai Sun, Lantao Yu, Pinqian Dong, Bo Lu, and Bolei Zhou
In IEEE Robotics and Automation Letters (RA-L) 2021
[Paper] [Project Page]

@ARTICLE{sun2021adversarial,
     author={J. {Sun} and L. {Yu} and P. {Dong} and B. {Lu} and B. {Zhou}},
     journal={IEEE Robotics and Automation Letters},
     title={Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model},
     year={2021},
}

About

[RA-L & ICRA 2021] Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model

MIT License
