konwook / mticl

Learning Shared Safety Constraints from Multi-Task Demonstrations

This project provides the implementation of Learning Shared Safety Constraints from Multi-Task Demonstrations.

If you find this repository useful in your research, please consider citing our paper:

@misc{kim2023learning,
      title={Learning Shared Safety Constraints from Multi-task Demonstrations}, 
      author={Konwoo Kim and Gokul Swamy and Zuxin Liu and Ding Zhao and Sanjiban Choudhury and Zhiwei Steven Wu},
      year={2023},
      eprint={2309.00711},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Table of Contents

The high-level structure of this repository is as follows:

├── mticl  # package folder
│   ├── saferl # safe RL library
│   ├── script # training, testing, plotting scripts
│   │   ├── train/test_cpo.py # CRL
│   │   ├── train_baseline.py # Chou et al. baseline
│   │   ├── train/test_icl.py # Single-task ICL
│   │   ├── planner_icl.py # Multi-task ICL
│   ├── utils # utility functions
│   ├── demos # generated expert demos
│   ├── plots # plots for the paper
│   ├── experiments # experiments for the paper
├── experts # expert policies trained with CRL
├── learners # learner policies trained with ICL

Note

Please see here for a detailed overview of the codebase.

Setup

Installation

conda create -n mticl python=3.10.8
conda activate mticl
pip install -r requirements.txt
export PYTHONPATH=mticl:$PYTHONPATH
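
As a quick sanity check after installation, the sketch below reports whether the running interpreter matches the Python 3.10 version pinned above and whether some `PYTHONPATH` entry points at `mticl`. The helper name `check_environment` and its parameters are hypothetical and not part of this repository.

```python
import os
import sys


def check_environment(expected=(3, 10), version=None, pythonpath=None):
    """Return a list of human-readable problems (empty if none were found).

    Hypothetical helper for illustration; not part of the mticl codebase.
    """
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")

    problems = []
    if version != tuple(expected):
        problems.append(
            f"expected Python {expected[0]}.{expected[1]}, "
            f"found {version[0]}.{version[1]}"
        )

    # PYTHONPATH entries are separated by os.pathsep (":" on Linux and macOS).
    entries = [os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p]
    if not any(os.path.basename(p) == "mticl" for p in entries):
        problems.append("no PYTHONPATH entry ends in 'mticl'")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result means both checks passed. It does not verify that the packages in requirements.txt were installed.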

Important

Run all scripts from inside the mticl/ directory.

Experiments

Scripts for replicating results from the paper are provided under the experiments/ directory.

AntBulletEnv-v0 (Velocity / Position Constraint)

./experiments/pybullet_baseline.sh
./experiments/pybullet_icl.sh
./experiments/pybullet_noisy.sh

AntMaze_UMazeDense-v3 (Maze Constraint)

./experiments/ant_maze.sh
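
The experiment scripts listed in this section can also be launched in sequence from Python. The sketch below is a hypothetical driver, not part of the repository: `run_experiments`, its `dry_run` flag, and the group names `pybullet` and `ant_maze` are assumed names, and it assumes the scripts can be executed with `bash`. The script paths are taken from the commands in this section. It uses `mticl/` as the working directory so the relative paths resolve.

```python
import subprocess
from pathlib import Path

# Script paths as listed in this README, relative to the mticl/ directory.
# The group names are labels chosen for this sketch.
EXPERIMENTS = {
    "pybullet": [
        "experiments/pybullet_baseline.sh",
        "experiments/pybullet_icl.sh",
        "experiments/pybullet_noisy.sh",
    ],
    "ant_maze": ["experiments/ant_maze.sh"],
}


def run_experiments(group, mticl_dir="mticl", dry_run=True):
    """Run (or, with dry_run=True, only list) the scripts in one group.

    Hypothetical helper; returns the commands in execution order.
    """
    if group not in EXPERIMENTS:
        raise KeyError(f"unknown experiment group: {group!r}")

    # Assumption: the scripts are bash scripts.
    commands = [["bash", script] for script in EXPERIMENTS[group]]

    if not dry_run:
        workdir = Path(mticl_dir)
        for command in commands:
            # check=True stops at the first script that exits non-zero.
            subprocess.run(command, cwd=workdir, check=True)
    return commands
```

Calling `run_experiments("ant_maze")` only returns the command list. Pass `dry_run=False` to execute the scripts.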

Acknowledgements

The core saferl/ code was developed from an early version of the FSRL repository, which provides fast, high-quality implementations of safe RL algorithms.

Warning

This code is not fully compatible with the latest version of FSRL. See here for more information.
