Reinforcement learning solution for the N-armed bandit problem
Inspired by the book "Reinforcement Learning: An Introduction" by Sutton and Barto, this is a suggested solution for the simple stationary N-armed bandit problem.
The script can be run directly from IPython with %run.
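
For readers unfamiliar with the problem, the following is a minimal sketch of one common approach to the stationary N-armed bandit: an epsilon-greedy agent with sample-average value estimates. It is not the code from this repository. The function name run_bandit, its parameters, and the Gaussian reward setup (true means drawn from N(0, 1), rewards drawn with unit variance around each mean) are assumptions chosen for illustration.

```python
import random


def run_bandit(n_arms=10, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent with sample-average estimates (hypothetical helper)."""
    rng = random.Random(seed)
    # True mean reward of each arm; fixed for the whole run (stationary problem).
    # Assumption: means are drawn from a standard normal distribution.
    true_means = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    estimates = [0.0] * n_arms  # Q(a): current value estimate of each arm
    counts = [0] * n_arms       # N(a): number of times each arm was pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimate (first one on ties).
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # Assumption: rewards are Gaussian with unit variance around the true mean.
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental sample average: Q <- Q + (R - Q) / N
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return true_means, estimates, counts, total_reward / steps


if __name__ == "__main__":
    # Compare a few exploration rates on a 200-armed bandit.
    for eps in (0.0, 0.01, 0.1):
        *_, avg = run_bandit(n_arms=200, steps=2000, epsilon=eps)
        print(f"epsilon={eps}: average reward {avg:.3f}")
```

The incremental update avoids storing past rewards: each estimate is the running mean of the rewards observed for that arm. With epsilon set to 0 the agent is purely greedy and never explores at random.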
Example of running a 200-armed bandit with various configurations