davnn / deep_pommerman

Deep Learning Pommerman Project

For this project, we are attempting to create our own reinforcement learning agents that will be able to apply teamwork in a novel way.

Agent Design

Much of the code is based on the Skynet955 agent, which placed 5th in the NeurIPS 2018 competition: https://hub.docker.com/r/multiagentlearning/skynet955

https://www.borealisai.com/en/blog/pommerman-team-competition-or-how-we-learned-stop-worrying-and-love-battle/

https://github.com/BorealisAI/pommerman-baseline

Getting started

Installation

  1. Install CUDA and Nvidia Drivers (if you have an NVIDIA GPU)

  2. Clone the repo

  3. Create a virtual or conda environment

  4. Activate your environment. The command varies by environment type; for virtualenv on Linux:

    source venv/bin/activate 
  5. Install dependencies

    pip3 install -r requirements.txt
    pip3 install -r requirements_extra.txt
  6. Install stable baselines dependencies (non-pip requirements): https://github.com/hill-a/stable-baselines#installation

  7. Install the code (must be done every time the pommerman libraries are changed):

    python3 setup.py build
    python3 setup.py install
    python3 setup.py install_lib
  8. Test it:

    python3 examples/simple_ffa_run.py 
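
If the test run in step 8 fails on an import, it can help to check which packages are visible from the active environment. The following is a minimal sketch, not part of this repo: `check_install` is a hypothetical helper, and `pommerman` and `stable_baselines` are assumed to be the import names of the packages installed above. It only handles top-level module names.

```python
import importlib.util


def check_install(module_names):
    """Return {name: bool} telling whether each top-level module can be found."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    # Assumed import names; adjust to whatever your requirements files install.
    for name, found in check_install(["pommerman", "stable_baselines"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

A module reported as missing usually means the environment from step 4 is not active or step 7 was skipped.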

Training

If you want to train the skynet model, follow these directions:

  1. Modify the params file for your setup (be sure to set start_iteration to 0 if starting with an untrained network).

  2. Add CUDA devices to your environment variables if you have any (comma separated):

    echo export CUDA_VISIBLE_DEVICES=0 >> ~/.bashrc
  3. If you do not have CUDA, remove the --device_id argument from the train.sh script.

  4. Run training with set params:

    source train.sh params log.txt 
  5. Training will run for a while; its output is stored in nn_model_dir.

  6. Modify examples/simple_team_run_CNNskynet.py so that it loads your trained agent from the last stored checkpoint, then test it:

    python3 examples/simple_team_run_CNNskynet.py 
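
Steps 2 and 3 depend on what CUDA_VISIBLE_DEVICES holds in the shell that launches training. The following is a minimal sketch for inspecting it from Python; `visible_cuda_devices` is a hypothetical helper and not part of this repo. An empty result only means the variable is unset or empty. When it is unset, CUDA exposes all devices, so this is not a test for whether a GPU exists.

```python
import os


def visible_cuda_devices(environ=None):
    """Parse the comma-separated CUDA_VISIBLE_DEVICES value into a list of ids."""
    environ = os.environ if environ is None else environ
    raw = environ.get("CUDA_VISIBLE_DEVICES", "")
    # Drop empty entries so "" and "0," are handled gracefully.
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    devices = visible_cuda_devices()
    if devices:
        print("CUDA_VISIBLE_DEVICES restricts training to:", ", ".join(devices))
    else:
        print("CUDA_VISIBLE_DEVICES is unset or empty")
```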

Playground Info

First time? Check out our website for more information, join the community on our Discord, or read the documentation to get started.

Playground hosts Pommerman, a clone of Bomberman built for AI research. People from around the world submit agents that they've trained to play. We run regular competitions on our servers and report the results and replays.

There are three variants for which you can enter your agents to compete:

  • FFA: Free For All where four agents enter and one leaves. It tests planning, tactics, and cunning. The board is fully observable.
  • Team (The NIPS '18 Competition environment): 2v2 where two teams of agents enter and one team wins. It tests planning, tactics, and cooperation. The board is partially observable.
  • Team Radio: Like Team in that it's a 2v2 game. The difference is that each agent has a radio that it can use to convey 2 words from a dictionary of size 8 each step.
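
To give a sense of the Team Radio bandwidth: 2 words from a dictionary of size 8 allow 8 × 8 = 64 distinct messages per step, or 6 bits. The sketch below packs an integer into such a word pair and back. It is only an illustration: `encode` and `decode` are hypothetical helpers, and the zero-based word indices are an assumption, since the environment may number its words differently.

```python
VOCAB_SIZE = 8      # dictionary size stated above
WORDS_PER_STEP = 2  # words an agent may send each step
NUM_MESSAGES = VOCAB_SIZE ** WORDS_PER_STEP  # 64 distinct messages


def encode(value):
    """Map an integer in [0, 63] to a pair of word indices, each in [0, 7]."""
    if not 0 <= value < NUM_MESSAGES:
        raise ValueError(f"value must be in [0, {NUM_MESSAGES - 1}]")
    return divmod(value, VOCAB_SIZE)


def decode(words):
    """Inverse of encode: recover the integer from a pair of word indices."""
    first, second = words
    return first * VOCAB_SIZE + second


if __name__ == "__main__":
    message = encode(37)
    print(message, decode(message))
```

What the 64 values mean (for example, a coarse position or an intent) is up to the agents' designer; teammates only need to agree on the same mapping.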

How do I train agents?

Most open-source research tools in this domain have been designed with single agents in mind. We will be developing resources towards standardizing multi-agent learning. In the meantime, we have provided an example training script in train_with_tensorforce.py. It demonstrates how to wrap the Pommerman environments such that they can be trained with popular libraries like TensorForce.
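
The wrapping idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the contents of train_with_tensorforce.py: it assumes the environment offers `reset()`, `act(obs)` returning the actions of every agent except the one being trained, and `step(actions)` returning per-agent observations and rewards. `SingleAgentView` and `StubEnv` are hypothetical names; `StubEnv` only stands in for a real environment so that the example runs on its own.

```python
class SingleAgentView:
    """Expose a multi-agent env as a single-agent env for one training agent."""

    def __init__(self, env, training_index):
        self.env = env
        self.training_index = training_index
        self._last_obs = None

    def reset(self):
        self._last_obs = self.env.reset()
        return self._last_obs[self.training_index]

    def step(self, action):
        # Ask the env for the other agents' actions, then splice ours in.
        actions = list(self.env.act(self._last_obs))
        actions.insert(self.training_index, action)
        self._last_obs, rewards, done, info = self.env.step(actions)
        i = self.training_index
        return self._last_obs[i], rewards[i], done, info


class StubEnv:
    """Stand-in with the assumed interface: 4 agents, episode ends after 3 steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return [0, 0, 0, 0]

    def act(self, obs):
        return [0, 0, 0]  # actions for the three non-training agents

    def step(self, actions):
        self.t += 1
        obs = [self.t] * 4
        rewards = list(actions)  # echo actions so the splice is easy to verify
        return obs, rewards, self.t >= 3, {}


if __name__ == "__main__":
    view = SingleAgentView(StubEnv(), training_index=2)
    obs, done = view.reset(), False
    while not done:
        obs, reward, done, _ = view.step(5)
        print(obs, reward, done)
```

A single-agent library then only ever sees the training agent's observation, action, and reward, while the remaining agents are driven by the environment.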

How do I submit agents that I have trained?

The setup for submitting agents will be live shortly. It involves making a Docker container that runs your agent. We then read and upload your docker file via Github Deploy Keys. You retain the ownership and license of the agents. We will only look at your code to ensure that it is safe to run, doesn't execute anything malicious, and does not cheat. We are just going to run your agent in competitions on our servers. We have an example agent that already works and further instructions are in the games/a/docker directory.

Original codebase

Find the original codebase we forked from here.

Citation

Since we are using the Pommerman environment in our research, we cite it using the bibtex file in docs.

About

License:Apache License 2.0

