Solving the classical board game Battleships using a deep reinforcement learning agent. The goal is to minimize the number of steps needed to complete the game.
The best agent achieves a mean game length of around 61. Training took around 50 hours on an i7 4770K CPU, during which the agent played a total of 3.3 million games. The plot was generated by playing 250 games every 1000 training steps and averaging the recorded game lengths. The plot also displays the 25th and 75th percentiles of the game lengths.
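The evaluation statistics above can be sketched as follows. This is a hypothetical snippet, not code from the repository: it assumes the recorded game lengths are stored as a matrix with one row per evaluation point and 250 games per row (random data stands in for the real recordings).

```python
import numpy as np

# Hypothetical data: 10 evaluation points, 250 games each.
# In the real experiment these would be the recorded game lengths.
game_lengths = np.random.default_rng(0).integers(17, 100, size=(10, 250))

mean_length = game_lengths.mean(axis=1)        # curve shown in the plot
p25 = np.percentile(game_lengths, 25, axis=1)  # lower percentile band
p75 = np.percentile(game_lengths, 75, axis=1)  # upper percentile band
```

Plotting `mean_length` against the training step, with `p25`/`p75` as a shaded band, reproduces the style of plot described above.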
Below you can see an animation of the agent performing the task as well as the action probabilities. More animations can be found in the animations folder.
An earlier attempt used DQN, which did not learn anything. I suspect that the highly stochastic nature of the game prevents the agent from learning a good Q function.
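The policy-gradient approach that did work follows the REINFORCE idea: increase the log-probability of the actions taken, scaled by the return. A minimal NumPy sketch on a toy 3-armed bandit (standing in for the full Battleships board, which is an assumption for illustration only) looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the game: 3 actions with different expected rewards.
true_rewards = np.array([0.2, 0.5, 0.9])
logits = np.zeros(3)  # policy parameters
lr = 0.1

for _ in range(2000):
    # Softmax policy over the logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    a = rng.choice(3, p=probs)
    r = rng.normal(true_rewards[a], 0.1)
    # REINFORCE update: gradient of log pi(a) for a softmax policy
    # is (one_hot(a) - probs); scale it by the received reward.
    grad = -probs
    grad[a] += 1.0
    logits += lr * r * grad

best_action = int(np.argmax(logits))
```

The real agent replaces the logits with a neural network over the board state, but the update rule has the same shape.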
Clone the repository:

```
git clone https://github.com/anklinv/deep_reinforcement_learning_battleships
```

Install the dependencies:

```
cd deep_reinforcement_learning_battleships
pip install -e .
pip install -e ./coding_challenge/
```
To load the best model and save an animation of a game:

```
python policy_gradient_agent.py
```

To train from scratch:

```
python policy_gradient_agent.py --train
```

To load the latest checkpoint (located in the models folder) and continue training:

```
python policy_gradient_agent.py --train --load
```

To plot the recorded game lengths over all training timesteps:

```
python policy_gradient_agent.py --plot
```
To keep the size of the repository small, only the best recorded model is included. The models saved every 1000 timesteps can be found here.
- animations: contains all animations of games performed using the best model
- coding_challenge: contains the game environment
- data: contains a matrix with all game lengths during the experiment
- dqn_agent: a failed attempt at using DQN to learn the game
- models: contains saved models
- plots: contains all plots