Menace: The Machine Educable Noughts And Crosses Engine
Teaching a bunch of matchboxes how to play tic-tac-toe using Reinforcement Learning
Menace “learns” to play noughts and crosses by playing the game repeatedly against another player, each time refining its strategy until after having played a certain number of games it becomes almost perfect and its opponent is only able to draw or lose against it. The learning process involves being “punished” for losing and “rewarded” for drawing or winning.