This is a PyTorch implementation of Constrained Policy Optimization (CPO) [ArXiv].
If you use this code in your research project, please cite us as:
@misc{Sikchi_pytorchCPO,
author = {Sikchi, Harshit},
title = {{pytorch-CPO}},
url = {https://github.com/hari-sikchi/pytorch_CPO}
}
Requirements:
- pytorch
- safety_gym
- mpi4py
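A quick way to confirm that the packages listed above are importable is sketched below. The helper name `missing_modules` is hypothetical and not part of this repository, and the sketch assumes the packages are imported as `torch`, `safety_gym`, and `mpi4py`.

```python
import importlib.util

# Assumed import names for the packages listed above.
REQUIRED = ["torch", "safety_gym", "mpi4py"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

This only checks that each module can be located; it does not import the packages or verify their versions.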
To train a CPO agent on the PointGoal1 task, run:
python cpo.py --env=Safexp-PointGoal1-v0 --cost_lim=<cost threshold> --exp_name=<exp path>
This creates the <exp path> folder, where all outputs are stored.
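To compare several cost thresholds, the command above can be generated from a small script. The following is a minimal sketch, not part of this repository: the helper `build_cpo_command`, the threshold values, and the experiment names are illustrative assumptions, and only the `--env`, `--cost_lim`, and `--exp_name` flags shown above are used.

```python
import shlex


def build_cpo_command(env, cost_lim, exp_name):
    """Build the argument list for one cpo.py training run."""
    return [
        "python", "cpo.py",
        f"--env={env}",
        f"--cost_lim={cost_lim}",
        f"--exp_name={exp_name}",
    ]


if __name__ == "__main__":
    # Illustrative thresholds; choose values suited to your experiment.
    for cost_lim in (10, 25, 50):
        cmd = build_cpo_command(
            "Safexp-PointGoal1-v0", cost_lim, f"cpo_pointgoal1_c{cost_lim}"
        )
        # Print the command; to launch it, pass cmd to subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```

The script only prints the commands, so they can be inspected before any training run is started.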