decisionforce / MAIRL

[RA-L & ICRA 2021] Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model

Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model

[ 📺 Website | 🏗 Github Repo | 🎓 Paper ]

Dependencies

  • Gym >= 0.8.1

  • Mujoco-py >= 0.5.7

  • Tensorflow >= 1.0.1

  • Mujoco

    Add the following paths to ~/.bashrc:

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco200/bin
    export LD_LIBRARY_PATH=$HOME/.mujoco/mjpro150/bin:$LD_LIBRARY_PATH
    

    Follow the mujoco_py installation instructions to install mujoco_py v1.5.

  • SenseAct (Optional)

    SenseAct uses Python3 (>=3.5), and all other requirements are automatically installed via pip.

    On Linux and Mac OS X, run the following:

    git clone https://github.com/kindredresearch/SenseAct.git
    cd SenseAct
    pip install -e .
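
The minimum versions listed above can be checked against the current environment before running anything. The following is a minimal sketch, not part of the repository: it assumes the pip distribution names are `gym`, `mujoco-py`, and `tensorflow`, and it compares only the leading numeric components of each version string. The helper names (`parse_version`, `meets_minimum`, `check_dependencies`) are hypothetical.

```python
import re
from importlib import metadata

# Minimum versions from the Dependencies list; the pip names are assumptions.
MINIMUM_VERSIONS = {"gym": "0.8.1", "mujoco-py": "0.5.7", "tensorflow": "1.0.1"}


def parse_version(text):
    """Turn '1.15.0rc1' into (1, 15, 0), keeping only leading numeric parts."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings as tuples of integers."""
    return parse_version(installed) >= parse_version(minimum)


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Return a {package: status} report for the given minimum versions."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_dependencies().items():
        print(f"{package}: {status}")
```

This check covers only the Python packages; the MuJoCo binaries and the `LD_LIBRARY_PATH` entries still have to be verified by hand.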
    

How to Run

  1. Collect demonstration data and save it to the expert_data directory.

The expert data should be a Python pickle file saved with a .bin suffix (not .pkl). It must contain batch_size, action, and states (required by set_er_stats()); see expert_data/hopper_er.bin for an example.
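
As a rough illustration of the file requirements described here, the sketch below pickles some data to a `.bin` file and reports which of the three named fields are missing after loading it back. It is not the repository's own loader: the text does not say whether the fields are dictionary keys or object attributes, so the hypothetical helper `missing_fields` accepts either, and the demo values are made up.

```python
import os
import pickle
import tempfile

# Field names taken from the description above; the container type is an assumption.
REQUIRED_FIELDS = ("batch_size", "action", "states")


def missing_fields(data):
    """Return the required names that `data` lacks (as dict keys or attributes)."""
    if isinstance(data, dict):
        return [name for name in REQUIRED_FIELDS if name not in data]
    return [name for name in REQUIRED_FIELDS if not hasattr(data, name)]


def save_expert_data(data, path):
    """Pickle `data` to `path`, insisting on the .bin suffix."""
    if not path.endswith(".bin"):
        raise ValueError("expert data files are expected to use a .bin suffix")
    with open(path, "wb") as fh:
        pickle.dump(data, fh)


def load_expert_data(path):
    """Unpickle expert data. Only load files from a trusted source."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


if __name__ == "__main__":
    # Made-up demonstration data, used only to exercise the round trip.
    demo = {"batch_size": 2, "action": [[0.0], [0.1]], "states": [[1.0], [1.1]]}
    path = os.path.join(tempfile.mkdtemp(), "demo_er.bin")
    save_expert_data(demo, path)
    print("missing fields:", missing_fields(load_expert_data(path)))
```

If the real expert file stores a pickled class instance, that class must be importable when the file is loaded, so a check like this would have to run from the repository root.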

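  2. Training

    # Launch one background training job per environment,
    # assigning GPUs round-robin across 4 devices.
    COUNTER=1
    ENVS=('Ant-v2')
    for ENV_ID in "${ENVS[@]}"
    do
      CUDA_VISIBLE_DEVICES=$((COUNTER % 4)) python main.py --env_name "$ENV_ID" --alg mairlImit --obs_mode state &
      COUNTER=$((COUNTER+1))
    done
    wait  # block until all background jobs finish
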
  3. Evaluation

    CUDA_VISIBLE_DEVICES=0 python main.py --env_name Ant-v2 --train_mode False
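
The Training step assigns each environment to a GPU by taking a counter modulo 4. For readers who prefer to launch jobs from Python, the same scheme might look like the sketch below. It is an illustration rather than a script from the repository: `build_jobs` and `launch` are hypothetical names, the count of four GPUs mirrors the shell loop, and only the `main.py` flags shown above are used.

```python
import os
import subprocess


def build_jobs(env_ids, num_gpus=4, start=1):
    """Pair each environment with a GPU index, round-robin, as the shell loop does."""
    jobs = []
    for counter, env_id in enumerate(env_ids, start=start):
        gpu = counter % num_gpus
        cmd = ["python", "main.py", "--env_name", env_id,
               "--alg", "mairlImit", "--obs_mode", "state"]
        jobs.append((gpu, cmd))
    return jobs


def launch(jobs):
    """Start every job with its own CUDA_VISIBLE_DEVICES and wait for all of them."""
    procs = []
    for gpu, cmd in jobs:
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        procs.append(subprocess.Popen(cmd, env=env))
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Print the plan instead of launching, so this is safe to run anywhere.
    for gpu, cmd in build_jobs(["Ant-v2"]):
        print(f"CUDA_VISIBLE_DEVICES={gpu}", " ".join(cmd))
```

Because the counter starts at 1, the first environment lands on GPU 1 and the fourth on GPU 0. `launch` must be called from the repository root so that `main.py` resolves.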

Acknowledgement

Our code is based on itaicaspi/mgail, HumanCompatibleAI/imitation, and huggingface/transformers.

Reference

Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model
Jiankai Sun, Lantao Yu, Pinqian Dong, Bo Lu, and Bolei Zhou
In IEEE Robotics and Automation Letters (RA-L) 2021
[Paper] [Project Page]

@ARTICLE{sun2021adversarial,
     author={J. {Sun} and L. {Yu} and P. {Dong} and B. {Lu} and B. {Zhou}},
     journal={IEEE Robotics and Automation Letters},
     title={Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model},
     year={2021},
}

License: MIT License

