anklinv / deep_reinforcement_learning_battleships

Using Deep Reinforcement Learning to play battleships for an ETH Seminar

Playing Battleships with Deep Reinforcement Learning

Solving the classic board game Battleships with a deep reinforcement learning agent. The goal is to minimize the number of steps (shots fired) needed to complete the game.
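To make the "game length" metric concrete, here is a minimal sketch of a random-firing baseline. It does not use the environment in coding_challenge; the 10x10 board, the fleet sizes, and the helper names `place_ships` and `play_random_game` are assumptions made purely for illustration.

```python
import random

# Assumptions for illustration only: a 10x10 board and the classic fleet sizes.
# The actual environment in coding_challenge may differ.
BOARD_SIZE = 10
SHIP_LENGTHS = (5, 4, 3, 3, 2)


def place_ships(rng, size=BOARD_SIZE, lengths=SHIP_LENGTHS):
    """Randomly place non-overlapping ships; return the set of occupied cells."""
    occupied = set()
    for length in lengths:
        while True:
            horizontal = rng.random() < 0.5
            # Restrict the start cell so the ship stays on the board.
            row = rng.randrange(size if horizontal else size - length + 1)
            col = rng.randrange(size - length + 1 if horizontal else size)
            cells = {
                (row, col + i) if horizontal else (row + i, col)
                for i in range(length)
            }
            if not cells & occupied:
                occupied |= cells
                break
    return occupied


def play_random_game(rng, size=BOARD_SIZE):
    """Fire at every cell in random order; return the shots needed to sink all ships."""
    remaining = place_ships(rng, size)
    cells = [(r, c) for r in range(size) for c in range(size)]
    rng.shuffle(cells)
    for shots, cell in enumerate(cells, start=1):
        remaining.discard(cell)
        if not remaining:
            return shots
    raise RuntimeError("unreachable: every ship cell is on the board")


if __name__ == "__main__":
    rng = random.Random(0)
    lengths = [play_random_game(rng) for _ in range(1000)]
    print("mean game length of a random agent:", sum(lengths) / len(lengths))
```

Under these assumptions a game can never be shorter than the total number of ship cells or longer than the number of board cells, which gives a rough frame of reference for a learned agent's game length.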

Results

The best agent achieves a mean game length of around 61. Training took around 50 hours on an i7 4770K CPU, during which the agent played a total of 3.3 million games. The plot was generated by playing 250 games every 1000 training steps and averaging the recorded game lengths. The plot also displays the 25th and 75th percentiles of the game lengths.

plot
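The statistics shown in the plot can be computed along these lines. This is a sketch, not the repository's plotting code: it assumes the recorded game lengths are arranged as one row per evaluation round and one column per game, and the function name `summarize_evaluations` is hypothetical.

```python
import numpy as np


def summarize_evaluations(game_lengths):
    """Summarize a (num_evaluations, games_per_evaluation) array of game lengths.

    Returns the per-evaluation mean, 25th percentile, and 75th percentile.
    """
    game_lengths = np.asarray(game_lengths, dtype=float)
    mean = game_lengths.mean(axis=1)
    q25, q75 = np.percentile(game_lengths, [25, 75], axis=1)
    return mean, q25, q75


if __name__ == "__main__":
    # Synthetic stand-in data: 5 evaluation rounds of 250 games each.
    rng = np.random.default_rng(0)
    fake_lengths = rng.integers(17, 101, size=(5, 250))
    mean, q25, q75 = summarize_evaluations(fake_lengths)
    for i, (m, lo, hi) in enumerate(zip(mean, q25, q75)):
        print(f"evaluation {i}: mean={m:.1f}, 25%={lo:.1f}, 75%={hi:.1f}")
```

The three returned arrays map directly onto a mean curve with a shaded band between the two percentiles.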

Below you can see an animation of the agent performing the task as well as the action probabilities. More animations can be found in the animations folder.

animation
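For readers unfamiliar with action probabilities, the sketch below shows one common way a policy turns per-cell scores into a probability distribution over board cells. Whether the agent in this repository masks cells that were already fired at is not stated here, so the masking step and the function name `action_probabilities` are assumptions.

```python
import numpy as np


def action_probabilities(logits, already_fired):
    """Softmax over board cells, giving zero probability to cells already fired at."""
    logits = np.asarray(logits, dtype=float).ravel()
    legal = ~np.asarray(already_fired, dtype=bool).ravel()
    if not legal.any():
        raise ValueError("no legal cells left")
    # Illegal cells get -inf so that their exponential is exactly zero.
    masked = np.where(legal, logits, -np.inf)
    masked = masked - masked[legal].max()  # stabilize the exponentials
    exp = np.exp(masked)
    return exp / exp.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(10, 10))      # stand-in for a network's output
    fired = np.zeros((10, 10), dtype=bool)
    fired[0, :3] = True                     # pretend three cells were already shot
    probs = action_probabilities(logits, fired)
    action = rng.choice(probs.size, p=probs)
    print("sampled cell:", divmod(action, 10))
```

Sampling from the distribution rather than always taking the most likely cell is what keeps a policy gradient agent exploring during training.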

Other attempts include DQN, which did not learn anything. I suspect the reason is that the highly stochastic nature of the game prevents the agent from learning a good Q function.

Installation

Clone repository

git clone https://github.com/anklinv/deep_reinforcement_learning_battleships

Install dependencies

cd deep_reinforcement_learning_battleships
pip install -e .
pip install -e ./coding_challenge/

Usage

To load the best model and save an animation of a game

python policy_gradient_agent.py

To train from scratch

python policy_gradient_agent.py --train

To load the latest checkpoint (located in the models folder) and continue training

python policy_gradient_agent.py --train --load

To plot all the timesteps

python policy_gradient_agent.py --plot

To keep the repository small, it contains only the best recorded model. The models saved every 1000 timesteps can be found here.

Folder structure

  • animations: contains all animations of games played using the best model
  • coding_challenge: contains the game environment
  • data: contains a matrix with all game lengths during the experiment
  • dqn_agent: failed attempt at using DQN to learn the game
  • models: contains saved models
  • plots: contains all plots

License:MIT License

