EvoliaProtocol / imitation

Code for the paper "Generative Adversarial Imitation Learning"

Home Page: https://arxiv.org/abs/1606.03476

Status: Archive (code is provided as-is, no updates expected)

Generative Adversarial Imitation Learning

Jonathan Ho and Stefano Ermon

Contains an implementation of Trust Region Policy Optimization (Schulman et al., 2015).
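
The method alternates between two steps: fit a discriminator that separates expert state-action pairs from policy-generated ones, then update the policy with TRPO against a surrogate cost derived from that discriminator. The snippet below is a minimal NumPy sketch of the discriminator step on random placeholder data; it is illustrative only, not the Theano implementation in this repository, and the label convention (D → 1 on policy samples, surrogate cost = log D) is one common reading of the paper.

```python
import numpy as np

np.random.seed(0)

# Placeholder data standing in for (state, action) pairs; in the real code these
# come from expert_policies/* rollouts and samples from the current policy.
obs_dim, act_dim, n = 4, 2, 256
expert_sa = np.random.normal(1.0, 1.0, size=(n, obs_dim + act_dim))  # expert (s, a)
policy_sa = np.random.normal(0.0, 1.0, size=(n, obs_dim + act_dim))  # policy (s, a)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic-regression discriminator D(s, a) = sigmoid(w . [s, a] + b).
# Convention assumed here: D -> 1 on policy data, D -> 0 on expert data.
w = np.zeros(obs_dim + act_dim)
b = 0.0
lr = 0.5

for _ in range(200):
    # Gradient of the average binary cross-entropy loss, labels 1 = policy, 0 = expert.
    d_pol = sigmoid(policy_sa.dot(w) + b)
    d_exp = sigmoid(expert_sa.dot(w) + b)
    grad_w = (policy_sa.T.dot(d_pol - 1.0) + expert_sa.T.dot(d_exp)) / (2.0 * n)
    grad_b = ((d_pol - 1.0).sum() + d_exp.sum()) / (2.0 * n)
    w -= lr * grad_w
    b -= lr * grad_b

# Surrogate cost handed to the policy optimizer (TRPO in the paper): log D(s, a).
# Minimizing it pushes the policy toward state-action pairs the discriminator
# considers expert-like.
cost = np.log(sigmoid(policy_sa.dot(w) + b) + 1e-8)
print("mean surrogate cost on policy samples: {:.3f}".format(cost.mean()))
```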

Dependencies (a quick import check is sketched after this list):

  • OpenAI Gym >= 0.1.0, mujoco_py >= 0.4.0
  • numpy >= 1.10.4, scipy >= 0.17.0, theano >= 0.8.2
  • h5py, pytables, pandas, matplotlib
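
Before running the pipeline it can help to confirm that these packages import. The following is a small convenience sketch, not part of the repository, that prints the installed versions for comparison against the minimums above.

```python
# Sanity check: try to import each required package and print its version.
import importlib

packages = [
    "gym", "mujoco_py", "numpy", "scipy", "theano",
    "h5py", "tables",  # pytables imports as "tables"
    "pandas", "matplotlib",
]

for name in packages:
    try:
        module = importlib.import_module(name)
    except ImportError:
        print("MISSING  {}".format(name))
        continue
    print("ok       {} {}".format(name, getattr(module, "__version__", "unknown")))
```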

Provided files:

  • expert_policies/* contains the expert policies, trained with TRPO (scripts/run_rl_mj.py) on the true costs
  • scripts/im_pipeline.py is the main training and evaluation pipeline. This script samples data from the experts to generate training data, runs the training code (scripts/imitate_mj.py), and evaluates the resulting policies (a sketch for inspecting sampled trajectory files follows this list).
  • pipelines/* are the experiment specifications provided to scripts/im_pipeline.py
  • results/* contain evaluation data for the learned policies
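
Sampled trajectories and evaluation data appear to be stored as HDF5/pandas files (hence the h5py and pytables dependencies). The sketch below shows how such a file could be inspected with h5py; the path and the "obs"/"act" dataset names are hypothetical placeholders, not the layout actually produced by scripts/im_pipeline.py, so list the keys first to see what a given file contains.

```python
# Sketch of inspecting a sampled-trajectory HDF5 file with h5py.
import h5py
import numpy as np

path = "trajs/expert_trajectories.h5"  # hypothetical output of the sampling phase

with h5py.File(path, "r") as f:
    # Print whatever datasets the file actually contains.
    for key in f.keys():
        print("{}: shape={}, dtype={}".format(key, f[key].shape, f[key].dtype))

    # Assuming observation/action arrays shaped (num_trajs, horizon, dim):
    obs = np.asarray(f["obs"])
    act = np.asarray(f["act"])

print("{} trajectories of horizon {}".format(obs.shape[0], obs.shape[1]))
print("mean |action| = {:.3f}".format(np.abs(act).mean()))
```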

License: MIT

