keirp / glamor

Code for "Planning from Pixels using Inverse Dynamics Models"

Goal-Conditioned Latent Action MOdels for RL (GLAMOR)

Setup

Create the Conda environment:

conda env create -f environment.yaml

Install additional dependencies:

pip install -e . dependencies/rlpyt git+https://github.com/mila-iqia/atari-representation-learning.git

Example

To train GLAMOR on Atari with the default hyperparameters (the same as those used in the paper), run:

python main.py train_glamor_atari --use_wandb=False --run_path='runs/'

The repository also includes a notebook for training GLAMOR on a GridWorld task.

Structure

  • glamor
    • algos
      • batch_supervised.py (nn training loop)
      • batch_train_glamor.py (main algo logic)
    • datasets
      • frame_buffer.py (replay buffer that only stores each frame once in memory)
      • k_dist.py (code for sampling sequence lengths during training)
      • replay_buffer.py (uniform replay buffer)
    • envs (contains Atari, DM Control Suite, and GridWorld envs)
    • eval
      • label_compare_eval.py (evaluates policy in an env and returns statistics about achieved goals based on labels)
      • policy_video_eval.py (records videos of policies)
    • models
      • atari (pre-processing for Atari models)
      • basic (basic nn blocks)
      • encoder_lstm_model.py (main model class)
    • planner (contains the planning code)
    • policies (random, eps-greedy, and open-loop and closed-loop policies that follow a plan)
    • samplers (code for sampling trajectories from the environment using a policy)
    • tasks (code for generating and sampling from task distributions)
    • train
      • scripts.py (main entry point, contains argument definitions)
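
To illustrate the idea behind a replay buffer that "only stores each frame once in memory", here is a minimal sketch. It is not the code in frame_buffer.py: the class name FrameStore, its methods, and the padding rule (repeating the first frame of an episode) are all assumptions made for illustration. Each frame is kept exactly once, and a stacked observation is rebuilt from indices when requested instead of being copied into the buffer.

```python
class FrameStore:
    """Hypothetical sketch: store each frame once, rebuild frame stacks on demand."""

    def __init__(self, stack_size):
        self.stack_size = stack_size
        self.frames = []         # every environment frame is stored exactly once
        self.episode_start = []  # episode_start[i]: index of the first frame of frame i's episode

    def append(self, frame, first_in_episode=False):
        """Add one frame and return its index."""
        idx = len(self.frames)
        if first_in_episode or idx == 0:
            start = idx
        else:
            start = self.episode_start[-1]
        self.frames.append(frame)
        self.episode_start.append(start)
        return idx

    def observation(self, idx):
        """Return the stack of `stack_size` frames ending at `idx`.

        Near the start of an episode the stack is padded by repeating the
        episode's first frame (an assumed convention), so a stack never
        mixes frames from two different episodes.
        """
        start = self.episode_start[idx]
        indices = [max(start, idx - offset)
                   for offset in range(self.stack_size - 1, -1, -1)]
        return [self.frames[i] for i in indices]


store = FrameStore(stack_size=3)
for name in ["a0", "a1", "a2", "b0", "b1"]:
    store.append(name, first_in_episode=name.endswith("0"))

print(store.observation(2))  # ['a0', 'a1', 'a2']
print(store.observation(4))  # ['b0', 'b0', 'b1']
```

With a stack size of k, storing full stacked observations would hold roughly k copies of every frame, whereas this layout holds one copy plus one integer per frame. A practical buffer would also need a fixed capacity and eviction, which this sketch leaves out.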

Todo

  • Remove dependency on rlpyt and support normal gym environments.
  • Rewrite replay buffers to support non-visual goals.
  • Multi-processing for trajectory collection.

Bibtex

@article{paster2020planning,
title={Planning from Pixels using Inverse Dynamics Models}, 
author={Keiran Paster and Sheila A. McIlraith and Jimmy Ba},
year={2020},
eprint={2012.02419},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
