https://en.wikipedia.org/wiki/Multi-armed_bandit
-
OOP base
-
Epsilon-greedy
-
Epsilon_n-greedy
-
UCB1
-
Softmax
-
Pursuit
-
Reinforcement Comparison
-
Thompson Sampling
-
Play the winner
-
One expert
-
Experts
-
Regret calculation
-
Conversion calculation
-
Many starts
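
A minimal sketch of how three of the topics above (OOP base, Epsilon-greedy, Regret calculation and Conversion calculation) might fit together. All names here (`Bandit`, `EpsilonGreedy`, `run`) are hypothetical and are not taken from any existing code. The sketch assumes Bernoulli arms with known success probabilities, so that expected regret can be computed directly.

```python
import random


class Bandit:
    """Hypothetical base class: tracks pull counts and mean reward per arm."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select_arm(self):
        raise NotImplementedError

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


class EpsilonGreedy(Bandit):
    """Explore a uniformly random arm with probability epsilon, else exploit."""

    def __init__(self, n_arms, epsilon, rng):
        super().__init__(n_arms)
        self.epsilon = epsilon
        self.rng = rng

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        # Greedy choice: arm with the highest estimated mean (first on ties).
        return max(range(len(self.values)), key=self.values.__getitem__)


def run(policy, probs, horizon, rng):
    """Play Bernoulli arms; return (expected regret, conversion rate)."""
    best = max(probs)
    regret = 0.0
    conversions = 0
    for _ in range(horizon):
        arm = policy.select_arm()
        reward = 1 if rng.random() < probs[arm] else 0
        policy.update(arm, reward)
        # Expected regret: gap between the best arm and the arm played.
        regret += best - probs[arm]
        conversions += reward
    return regret, conversions / horizon


if __name__ == "__main__":
    rng = random.Random(0)
    policy = EpsilonGreedy(n_arms=3, epsilon=0.1, rng=rng)
    regret, conversion = run(policy, [0.1, 0.5, 0.8], 2000, rng)
    print(f"regret={regret:.1f} conversion={conversion:.3f}")
```

Under the same assumptions, the other policies in the list could subclass `Bandit` and override only `select_arm`, and "many starts" could be approximated by calling `run` with several different seeds and averaging the results.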