SuperGo

A student implementation of the AlphaGo Zero paper, with documentation.

Ongoing project.

TODO (in order of priority)

Statistics (branch statistics)
Games that are longer than the move threshold are now used
MCTS
- Tree search
- Dirichlet noise added to the prior probabilities in the root node
- Adaptive temperature (either take the max or sample proportionally)
- Sample random rotation or reflection in the dihedral group
- Multithreading of search
- Batch size evaluation to save computation
Dihedral group of board for more training samples
Learning without MCTS doesn't seem to work
Resume training
GTP on trained models (human.py, to plug with Sabaki)
Learning rate annealing (see this)
Better display for games (viewer.py, converting self-play games into GTP and then using Sabaki)
Make the 3 components (self-play, training, evaluation) asynchronous
Multiprocessing of games for self-play and evaluation
Models and training without MCTS
Evaluation
Tromp-Taylor scoring
Dataset ring buffer of self-play games
Loading saved models
Database for self-play games
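
As an illustration of three of the MCTS items listed above (Dirichlet noise at the root, temperature-based move selection, and the dihedral group of the board), here is a minimal sketch. It is not the code from this repository: the function names are hypothetical, it uses only the Python standard library so that it runs on its own, and the defaults `alpha=0.03` and `epsilon=0.25` are the values reported in the AlphaGo Zero paper, assumed here rather than taken from this project's configuration.

```python
import random


def add_dirichlet_noise(priors, alpha=0.03, epsilon=0.25, rng=random):
    """Mix Dirichlet noise into root priors: (1 - epsilon) * p + epsilon * noise."""
    # A Dirichlet sample is a vector of Gamma(alpha, 1) draws normalised to sum to 1.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in priors]
    total = sum(gammas)
    if total == 0.0:  # extremely unlikely underflow; leave priors unchanged
        return list(priors)
    return [(1 - epsilon) * p + epsilon * g / total
            for p, g in zip(priors, gammas)]


def select_move(visit_counts, temperature=1.0, rng=random):
    """Pick a move index from root visit counts.

    temperature == 0 takes the most visited move; otherwise the move is
    sampled with probability proportional to count ** (1 / temperature).
    """
    if temperature == 0:
        return max(range(len(visit_counts)), key=lambda i: visit_counts[i])
    weights = [n ** (1.0 / temperature) for n in visit_counts]
    return rng.choices(range(len(visit_counts)), weights=weights)[0]


def rotate90(board):
    """Rotate a square board (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*board[::-1])]


def dihedral_symmetries(board):
    """Return the 8 rotations and reflections of a square board."""
    result = []
    current = [list(row) for row in board]
    for _ in range(4):
        result.append(current)
        result.append([row[::-1] for row in current])  # left-right mirror
        current = rotate90(current)
    return result


if __name__ == "__main__":
    print(add_dirichlet_noise([0.5, 0.3, 0.2]))
    print(select_move([1, 10, 3], temperature=0))
    print(len(dihedral_symmetries([[1, 2], [3, 4]])))
```

When the symmetries are used to augment training samples, the policy target has to be transformed with the same rotation or reflection as the board, with the pass move left untouched. That step is omitted here.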

soon
