We study the problem of best arm identification in multi-armed bandits, where the problem setting can be either stochastic or adversarial and is not revealed to the forecaster a priori. We propose a universal best arm identification algorithm, called S3-BA, that returns the best arm without knowing the underlying model. We evaluate the performance of various algorithms via numerical experiments on two synthetic datasets and one real-world dataset from the US stock market.
2018 IEEE Information Theory Workshop (ITW); Hantao Zhang, Cong Shen.
Link: https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?&filter=issueId%20EQ%20%228613300%22&searchWithin=Best%20Arm%20Identification%20for%20Both%20Stochastic%20and%20Adversarial%20Multi-armed%20Bandits&pageNumber=1&resultAction=REFINE