AlphaDraughts-Zero

This repository is the final project for Reinforcement Learning - M2 MVA 2018. Inspired by AlphaGo Zero, we apply its method to English Checkers, a well-known strategy board game for two players.

Requirements

Linux or macOS
Python 3 (version 3.4 or later is preferred)
PyTorch 1.0
CPU or NVIDIA GPU + CUDA CuDNN

For pip users, run pip install -r requirements.txt to install dependencies.

Getting Started

Installation

Clone this repo:

git clone https://github.com/Tong-ZHAO/AlphaDraughts-Zero
cd AlphaDraughts-Zero
pip install -r requirements.txt

Train

The training parameters should be specified in ./src/config.py beforehand.
Some parameters can be passed to ./src/train.py as arguments:

usage: train.py [-h] [--iterations N] [--lr LR] [--seed S] [--env ENV]

Training of AlphaDraughts Zero

optional arguments:
  -h, --help      show this help message and exit
  --iterations N  number of iterations of pipeline training
  --lr LR         learning rate (default: 0.01)
  --seed S        random seed (default: 42)
  --env ENV       visdom environment
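
The help text above looks like standard argparse output. As a rough illustration only, a parser producing a similar interface could be sketched as follows. This is a hypothetical reconstruction, not the code in ./src/train.py: the `build_parser` name, the argument types, and the missing defaults for `--iterations` and `--env` are all assumptions.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction from the help text; not the repository's code.
    parser = argparse.ArgumentParser(
        prog="train.py", description="Training of AlphaDraughts Zero"
    )
    # The default for --iterations is not documented, so none is set here.
    parser.add_argument("--iterations", type=int, metavar="N",
                        help="number of iterations of pipeline training")
    parser.add_argument("--lr", type=float, default=0.01, metavar="LR",
                        help="learning rate (default: 0.01)")
    parser.add_argument("--seed", type=int, default=42, metavar="S",
                        help="random seed (default: 42)")
    # The default visdom environment is not documented either.
    parser.add_argument("--env", type=str, metavar="ENV",
                        help="visdom environment")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--iterations", "10", "--lr", "0.005"])
print(args.iterations, args.lr, args.seed, args.env)
```

Arguments that are not passed fall back to their documented defaults (`--seed` stays at 42 here), while undocumented ones are left as `None` in this sketch.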

To start the training:

cd src
python train.py

To view loss plots, run python -m visdom.server and open http://localhost:8097 in a browser.
To see more details of training, check the log file in ./logs/.

Test

To start the human-machine competition (qualitative evaluation):

cd src
python gui.py

Some arguments can be passed to ./src/gui.py:

usage: gui.py [-h] [--checkpoint C] [--human H] [--simulation S] [--ai A]

optional arguments:
  -h, --help      show this help message and exit
  --checkpoint C  which neural network model checkpoint to use.
  --human H       "white" or "black", which side human player plays, white
                  side always goes first.
  --simulation S  number of simulations for MCTS at each time step to choose
                  the action.
  --ai A          whether to use AI, 1 means using AI, 0 means not using AI.

Quantitative evaluation can be done using the functions provided in ./src/elo.py.
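
For readers unfamiliar with Elo ratings, the standard update rule can be sketched as below. This is a generic illustration of the Elo formula, not the contents of ./src/elo.py: the function names `expected_score` and `update_elo` and the K-factor of 32 are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    # Standard Elo expectation: probability-like score of A against B.
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    # score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    # k=32 is an assumed K-factor, not a value taken from the repository.
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated players; A wins the game.
new_a, new_b = update_elo(1500.0, 1500.0, 1.0)
print(new_a, new_b)  # 1516.0 1484.0
```

Playing successive checkpoints against each other and applying such an update after each game gives a relative strength estimate, which is one common way to track progress in AlphaGo Zero style training.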
