Solving the classical board game Battleships using a deep reinforcement learning agent. The goal is to minimize the number of steps needed to complete the game.
The best agent achieves a mean game length of around 61. Training took around 50 hours on an i7 4770K CPU, during which the agent played a total of 3.3 million games. The plot was generated by playing 250 games every 1000 training steps and averaging the recorded game lengths. The plot also displays the 25th and 75th percentiles of the game lengths.
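The evaluation statistics above can be sketched as follows. This is a hypothetical snippet, not code from the repository: it assumes the recorded game lengths are stored as a matrix with one row per evaluation point and 250 games per row (random data stands in for the real recordings).

```python
import numpy as np

# Hypothetical data: 10 evaluation points, 250 games each.
# In the real experiment these would be the recorded game lengths.
game_lengths = np.random.default_rng(0).integers(17, 100, size=(10, 250))

mean_length = game_lengths.mean(axis=1)        # curve shown in the plot
p25 = np.percentile(game_lengths, 25, axis=1)  # lower percentile band
p75 = np.percentile(game_lengths, 75, axis=1)  # upper percentile band
```

Plotting `mean_length` against the training step, with `p25`/`p75` as a shaded band, reproduces the style of plot described above.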
Below you can see an animation of the agent performing the task as well as the action probabilities. More animations can be found in the animations folder.
An earlier attempt used DQN, which did not learn anything. I suspect that the highly stochastic nature of the game prevents the agent from learning a good Q function.
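The policy-gradient approach that did work follows the REINFORCE idea: increase the log-probability of the actions taken, scaled by the return. A minimal NumPy sketch on a toy 3-armed bandit (standing in for the full Battleships board, which is an assumption for illustration only) looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the game: 3 actions with different expected rewards.
true_rewards = np.array([0.2, 0.5, 0.9])
logits = np.zeros(3)  # policy parameters
lr = 0.1

for _ in range(2000):
    # Softmax policy over the logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    a = rng.choice(3, p=probs)
    r = rng.normal(true_rewards[a], 0.1)
    # REINFORCE update: gradient of log pi(a) for a softmax policy
    # is (one_hot(a) - probs); scale it by the received reward.
    grad = -probs
    grad[a] += 1.0
    logits += lr * r * grad

best_action = int(np.argmax(logits))
```

The real agent replaces the logits with a neural network over the board state, but the update rule has the same shape.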
Clone the repository:

```
git clone https://github.com/anklinv/deep_reinforcement_learning_battleships
```

Install the dependencies:

```
cd deep_reinforcement_learning_battleships
pip install -e .
pip install -e ./coding_challenge/
```
To load the best model and save an animation of a game:

```
python policy_gradient_agent.py
```

To train from scratch:

```
python policy_gradient_agent.py --train
```

To load the latest checkpoint (located in the models folder) and continue training:

```
python policy_gradient_agent.py --train --load
```

To plot the recorded game lengths over all training timesteps:

```
python policy_gradient_agent.py --plot
```
To keep the size of the repository small, only the best recorded model is included. The models saved every 1000 timesteps can be found here.
- animations: contains all animations of games performed using the best model
- coding_challenge: contains the game environment
- data: contains a matrix with all game lengths during the experiment
- dqn_agent: a failed attempt at using DQN to learn the game
- models: contains saved models
- plots: contains all plots