Tic Tac Toe

A tale about trying to train a machine to play Tic Tac Toe through Reinforcement Learning

To run the Jupyter notebooks in Binder, press the Binder badge.

The goal of this series is to implement and test several different approaches to training a computer to play Tic Tac Toe. We will create:

- A player that plays completely randomly.
- Two players that implement simple forms of the Min-Max algorithm.
- Several players that we will train through Reinforcement Learning:
  - a Tabular Q-Learning player,
  - a Simple Neural Network Q-Learning player,
  - a Deep Neural Network Q-Learning player,
  - a Policy Gradient Descent based player.
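
To make the first two kinds of player concrete, here is a minimal sketch in plain Python. It is not the code from the notebooks: the board encoding (a 9-tuple with `1` for X, `-1` for O, `0` for empty) and every function name below are assumptions made for illustration only.

```python
import random
from functools import lru_cache

# Hypothetical encoding: the board is a tuple of 9 cells, read row by row.
EMPTY, X, O = 0, 1, -1

WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return X or O if that side has three in a row, else EMPTY."""
    for a, b, c in WIN_LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return EMPTY


def legal_moves(board):
    """Indices of all empty cells."""
    return [i for i, cell in enumerate(board) if cell == EMPTY]


def play(board, move, side):
    """Return a new board with `side` placed at index `move`."""
    return board[:move] + (side,) + board[move + 1:]


def random_move(board, rng=random):
    """The random player: pick any legal move uniformly."""
    return rng.choice(legal_moves(board))


@lru_cache(maxsize=None)
def minmax_value(board, side):
    """Game value from X's point of view with `side` to move:
    +1 if X can force a win, -1 if O can, 0 for a draw."""
    won = winner(board)
    if won != EMPTY:
        return won
    moves = legal_moves(board)
    if not moves:
        return 0  # board full, nobody won
    values = [minmax_value(play(board, m, side), -side) for m in moves]
    return max(values) if side == X else min(values)


def minmax_move(board, side):
    """The Min-Max player: pick the first move with the best value for `side`."""
    pick = max if side == X else min
    return pick(
        legal_moves(board),
        key=lambda m: minmax_value(play(board, m, side), -side),
    )
```

Calling `minmax_value((0,) * 9, X)` on the empty board returns `0`, reflecting that Tic Tac Toe is a draw when both sides play perfectly. The `lru_cache` decorator is a design choice of this sketch: it keeps the search fast by evaluating each reachable position only once.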

I assume you are familiar with:

- The rules and basic strategy of playing Tic Tac Toe.
- Basic Python 3 programming and use of a Python IDE or Jupyter Notebooks.
- At least a rudimentary knowledge of TensorFlow and Neural Networks would be helpful, but you might be able to do without it. Give it a try, and if it's too overwhelming, do some of the beginner tutorials and then try again.

Apache License 2.0