sprkrd / pddlgym

Convert a PDDL domain into an OpenAI Gym environment.

PDDLGym: PDDL → OpenAI Gym

Sokoban example

This library is under development by Tom Silver and Rohan Chitnis. Correspondence: tslvr@mit.edu and ronuchit@mit.edu.

Paper

Please see our paper describing the design decisions and implementation details behind PDDLGym.

Status

We support the following subset of PDDL1.2:

  • STRIPS
  • Typing (including hierarchical)

Notable features that we do not currently support include equality (blocked by: parsing and inference), conditional effects (blocked by: inference), and disjunction and quantification (blocked by: inference).

Several PDDL environments are included, such as:

  • Sokoban
  • Depot
  • Blocks
  • Keys and Doors
  • Towers of Hanoi
  • "Minecraft"
  • "Rearrangement"
  • "Travel"
  • "Baking"

(Environments in quotes indicate ones that we made up ourselves. Unquoted environments are standard ones whose PDDL files are available online, with light modifications to support our interface.)

We also support probabilistic effects, specified in the PPDDL syntax. Several PPDDL environments are included, such as:

  • River
  • Triangle Tireworld
  • Exploding Blocks

Please get in touch if you are interested in contributing!

Sister packages: pyperplan and rddlgym.

Installation

Installing via pip

pip install pddlgym

Installing from source (if you want to make changes to PDDLGym)

First, set up a virtual environment with Python 3. For instance, if you use virtualenvwrapper, you can simply run mkvirtualenv --python=`which python3` pddlgymenv. Next, clone this repository, and from inside it run pip install -e . (including the trailing dot). Now you should be able to run the random agent demos in pddlgym/demo.py. You should also be able to import pddlgym from any Python shell.

Planner dependencies (optional)

To be able to run the planning demos in pddlgym/demo.py, install Fast-Forward. Set the environment variable FF_PATH to point to the ff executable (note: the executable itself, not just the directory containing the executable), wherever you install it. Mac users: you may want to install Rohan's patch instead of the version linked above.
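
Because FF_PATH must name the executable rather than its directory, a quick check before running the planning demos can save some confusion. The following is a minimal sketch using only the Python standard library; the helper name ff_path_problem is hypothetical and is not part of PDDLGym.

```python
import os


def ff_path_problem(ff_path):
    """Describe what is wrong with ff_path, or return None if it looks usable.

    Hypothetical helper for illustration; PDDLGym does not provide it.
    """
    if not ff_path:
        return "FF_PATH is not set"
    if os.path.isdir(ff_path):
        return "FF_PATH points to a directory, not the ff executable"
    if not os.path.isfile(ff_path):
        return "FF_PATH does not point to an existing file"
    if not os.access(ff_path, os.X_OK):
        return "FF_PATH points to a file that is not executable"
    return None


if __name__ == "__main__":
    # Report the first problem found, if any.
    problem = ff_path_problem(os.environ.get("FF_PATH"))
    print(problem or "FF_PATH looks usable")
```

This only inspects the path; it does not confirm that the file is actually a working Fast-Forward binary.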

Usage examples

Hello, PDDLGym

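import gym
import pddlgym  # importing pddlgym registers the PDDLEnv* environments with gym
import imageio

env = gym.make("PDDLEnvSokoban-v0")
# reset() returns the initial observation together with debug information
obs, debug_info = env.reset()
img = env.render()
imageio.imsave("frame1.png", img)
# The action space is sampled with respect to the current observation
action = env.action_space.sample(obs)
obs, reward, done, debug_info = env.step(action)
img = env.render()
imageio.imsave("frame2.png", img)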

Plan with FastForward

import gym
import pddlgym
from pddlgym.utils import run_planning_demo

# See `pddl/sokoban.pddl` and `pddl/sokoban/problem3.pddl`.
env = gym.make("PDDLEnvSokoban-v0")
env.fix_problem_index(2)
run_planning_demo(env, 'ff', verbose=True)

Observation representation

As in OpenAI Gym, calling env.reset() or env.step() will return an observation of the environment. This observation is a namedtuple with 3 fields: obs.literals gives a frozenset of literals that hold true in the state, obs.objects gives a frozenset of objects in the state, and obs.goal gives a pddlgym.structs.Literal object representing the goal of the current problem instance.
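
To make the three fields concrete, here is a small sketch that summarizes an observation. It uses a stand-in namedtuple with plain strings in place of real literals and objects, so that it runs without PDDLGym installed; the names Observation and summarize are hypothetical, and the sample literals are made up for illustration.

```python
from collections import namedtuple

# Stand-in with the three fields described above. Real observations come from
# env.reset() / env.step(); this type is only an illustrative assumption.
Observation = namedtuple("Observation", ["literals", "objects", "goal"])


def summarize(obs):
    """Return a plain-dict summary of an observation with these three fields."""
    return {
        "num_objects": len(obs.objects),
        "literals": sorted(str(lit) for lit in obs.literals),
        "goal": str(obs.goal),
    }


# Made-up example data; strings stand in for literal and object instances.
obs = Observation(
    literals=frozenset({"at(player, loc1)", "clear(loc2)"}),
    objects=frozenset({"player", "loc1", "loc2"}),
    goal="at(player, loc2)",
)
summary = summarize(obs)
```

Since summarize only relies on the field names, iteration, and str(), it should apply equally to an observation returned by a real environment, though that is not verified here.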

Adding a new domain

Step 1: Adding PDDL files

Create a domain PDDL file and one or more problem PDDL files. (Note: Only a certain subset of PDDL is supported right now -- see "Status" above.) Put the domain file in pddl/ and the problem files in pddl/<domain name>. Make sure that the name of your new domain is consistent and is used for the domain pddl filename and the problem directory.
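
The naming convention above can be checked mechanically. The sketch below assumes the domain file is named <domain name>.pddl and that problem files end in .pddl, as in the sokoban and blocks examples mentioned in this README; check_domain_layout is a hypothetical helper, not a PDDLGym function.

```python
from pathlib import Path


def check_domain_layout(pddl_dir, domain_name):
    """Return a list of layout issues for a new domain (empty if none found).

    Hypothetical helper for illustration; PDDLGym does not provide it.
    """
    pddl_dir = Path(pddl_dir)
    domain_file = pddl_dir / f"{domain_name}.pddl"
    problem_dir = pddl_dir / domain_name
    issues = []
    if not domain_file.is_file():
        issues.append(f"missing domain file: {domain_file}")
    if not problem_dir.is_dir():
        issues.append(f"missing problem directory: {problem_dir}")
    elif not any(problem_dir.glob("*.pddl")):
        issues.append(f"no problem files in: {problem_dir}")
    return issues
```

For example, check_domain_layout("pddlgym/pddl", "mypddlgymenv") would return an empty list once both the domain file and at least one problem file are in place.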

Step 2 (optional): Implement rendering

  • Implement a render function in a new file in rendering/. For an example, see pddlgym/rendering/rearrangement.py. See the Observation representation section for a description of the representation of the argument obs passed into the render function. Update pddlgym/rendering/__init__.py to import your new function.

Step 3: Register Gym environment

  • Update the list in pddlgym/__init__.py to register your new environment. There are several methods for doing so:

Simple (recommended if you want to spin up quickly with off-the-shelf PDDL files)

Let's say your domain name is "mypddlgymenv" and your render function is mypddlgymenv_render. Then you would add to the list the following entry: ('mypddlgymenv', {'render': mypddlgymenv_render, 'operators_as_actions': True, 'dynamic_action_space': True}). You can leave out the "render" entry if you don't have a render function.

  • What these arguments mean: by default, PDDLGym requires modifying the PDDL files to make a distinction between "actions" and "operators", related to the boundary between agent and environment. The rationale is described in Section 2.2 of our paper. Setting "operators_as_actions" to True eliminates this distinction, and makes it so you can use off-the-shelf PDDL files without modification. Setting "dynamic_action_space" to True causes env.action_space to change on each iteration to include only valid actions (those that match the operator preconditions), which can be useful in, for example, policy learning.
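
Written out as Python, the entry described above looks roughly like this. The render function here is a do-nothing placeholder, and its signature (obs plus arbitrary extra arguments) is an assumption; see pddlgym/rendering/rearrangement.py for a real one.

```python
def mypddlgymenv_render(obs, *args, **kwargs):
    """Placeholder render function (assumed signature); a real one draws obs."""
    return None


# Entry to add to the list in pddlgym/__init__.py, as described above.
new_entry = (
    "mypddlgymenv",
    {
        "render": mypddlgymenv_render,
        "operators_as_actions": True,
        "dynamic_action_space": True,
    },
)

# The same entry for a domain without a render function.
entry_without_render = (
    "mypddlgymenv",
    {"operators_as_actions": True, "dynamic_action_space": True},
)
```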

More complex (recommended for more serious research)

If you plan to use PDDLGym for non-trivial domains, you will almost certainly need to make the distinction between operators and actions, by letting "operators_as_actions" be False (the default) for your new domain in pddlgym/__init__.py. Actions are the things passed from the agent to the environment, like motor commands on a robot. Operators describe the environmental consequences of the agent's actions. For instance, from the agent's perspective a moveto command may be parameterized only by a target pose, but inside the environment it must also be parameterized by the current pose, because a literal must be created specifying that the agent is no longer at the current pose. To handle this, you will need to update your PDDL files to include special predicates called "action predicates". Action predicates must be incorporated in four places:

  1. Alongside the typical predicate declarations in the domain file.
  2. In a space-separated list of format ; (:actions <action predicate name 1> <action predicate name 2> ...) in the domain file. (Note the semicolon at the beginning!)
  3. One variable-grounded action predicate should appear in the preconditions of every operator in the domain file.
  4. In each problem file, all possible ground actions should be listed alongside the other :init declarations.

See pddlgym/pddl/blocks.pddl and pddlgym/pddl/blocks/problem1.pddl for an example to follow, where there are four action predicates: pickup, putdown, stack, and unstack.
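
Since the action predicate list lives in a PDDL comment (item 2 above), it is easy to forget or mistype. The sketch below extracts that list from a domain file's text with a regular expression. The helper declared_action_predicates is hypothetical, this is not how PDDLGym itself parses the file, and the domain text is an abbreviated placeholder rather than the contents of blocks.pddl.

```python
import re

# Matches a comment line of the form: ; (:actions name1 name2 ...)
ACTIONS_LINE = re.compile(r"^\s*;\s*\(:actions\s+([^)]*)\)", re.MULTILINE)


def declared_action_predicates(domain_text):
    """Return the names listed in the '; (:actions ...)' comment, or []."""
    match = ACTIONS_LINE.search(domain_text)
    return match.group(1).split() if match else []


# Abbreviated placeholder text; only the action-list comment is spelled out.
example_domain = """(define (domain blocks)
  ; ... predicate declarations, including the action predicates ...
  ; (:actions pickup putdown stack unstack)
  ; ... operators ...
)"""

names = declared_action_predicates(example_domain)
```

A check like this could be extended to confirm that each listed name also appears among the predicate declarations and in every operator's preconditions.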

Citation

Please use this bibtex if you want to cite this repository in your publications:

@misc{silver2020pddlgym,
    title={PDDLGym: Gym Environments from PDDL Problems},
    author={Tom Silver and Rohan Chitnis},
    year={2020},
    eprint={2002.06432},
    archivePrefix={arXiv},
    primaryClass={cs.AI}
}
