MABWiser (IJAIT 2021, ICTAI 2019) is a research library written in Python for rapid prototyping of multi-armed bandit algorithms. It supports context-free, parametric contextual, and non-parametric contextual bandit models, and provides built-in parallelization for both training and testing.
The library also provides a simulation utility for comparing different policies and performing hyper-parameter tuning. MABWiser follows a scikit-learn-style public interface, adheres to the PEP-8 standard, and is extensively tested.
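The simulation idea can be illustrated without the library itself: run two candidate policy configurations against the same synthetic bandit and compare cumulative reward. Below is a minimal, self-contained sketch of that comparison; the arm probabilities, epsilon values, and helper name are illustrative assumptions, not MABWiser's Simulator API.

```python
import random

def run_epsilon_greedy(epsilon, n_rounds=2000, probs=(0.7, 0.3), seed=7):
    # Play a two-armed Bernoulli bandit with an epsilon-greedy policy
    # and return the cumulative reward collected over n_rounds.
    rng = random.Random(seed)
    counts = [0, 0]
    totals = [0.0, 0.0]
    cumulative = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon or 0 in counts:
            arm = rng.randrange(2)          # explore
        else:
            means = [totals[i] / counts[i] for i in range(2)]
            arm = means.index(max(means))   # exploit the best mean so far
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
        cumulative += reward
    return cumulative

# Compare two hyper-parameter settings on the same environment
reward_low_eps = run_epsilon_greedy(0.1)
reward_high_eps = run_epsilon_greedy(0.5)
```

MABWiser's Simulator automates this kind of side-by-side evaluation over logged decision and reward data instead of a synthetic loop.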
MABWiser is developed by the Artificial Intelligence Center of Excellence at Fidelity Investments. Documentation is available at fidelity.github.io/mabwiser.
# An example that shows how to use the UCB1 learning policy
# to choose between two arms based on their expected rewards.
# Import MABWiser Library
from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy
# Data
arms = ['Arm1', 'Arm2']
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]
# Model
mab = MAB(arms, LearningPolicy.UCB1(alpha=1.25))
# Train
mab.fit(decisions, rewards)
# Test
mab.predict()
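For intuition about what the prediction above computes, UCB1 scores each arm by its empirical mean reward plus an exploration bonus that shrinks as the arm is pulled more often. The pure-Python sketch below reproduces that arithmetic on the same data; it uses the standard UCB1 formula with an alpha scaling, which may differ in detail from MABWiser's internal implementation.

```python
import math

decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]
alpha = 1.25

# Aggregate pull counts and total reward per arm
counts, totals = {}, {}
for arm, reward in zip(decisions, rewards):
    counts[arm] = counts.get(arm, 0) + 1
    totals[arm] = totals.get(arm, 0) + reward

n_total = len(decisions)

def ucb1_score(arm):
    # Empirical mean plus an alpha-scaled confidence bonus
    mean = totals[arm] / counts[arm]
    bonus = alpha * math.sqrt(2 * math.log(n_total) / counts[arm])
    return mean + bonus

best = max(counts, key=ucb1_score)
```

Here Arm2 wins despite a single pull: its mean (25) beats Arm1's (about 15.3), and its larger confidence bonus reflects how little has been observed about it.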
Available Learning Policies:
- Epsilon Greedy [1, 2]
- LinGreedy [1, 2]
- LinTS [3]
- LinUCB [4]
- Popularity [2]
- Random [2]
- Softmax [2]
- Thompson Sampling (TS) [5]
- Upper Confidence Bound (UCB1) [2]
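As one concrete example from the list, Thompson Sampling with binary rewards keeps a Beta posterior per arm and plays the arm whose sampled mean is highest. The following is a minimal sketch under that Beta-Bernoulli assumption; the variable names and logged data are illustrative, not MABWiser's internals.

```python
import random

rng = random.Random(42)

# Beta(1, 1) priors stored as [successes, failures] per arm
posterior = {'Arm1': [1, 1], 'Arm2': [1, 1]}

# Logged (arm, binary reward) observations
history = [('Arm1', 1), ('Arm1', 0), ('Arm2', 1), ('Arm1', 0), ('Arm2', 1)]
for arm, reward in history:
    posterior[arm][0] += reward        # success count
    posterior[arm][1] += 1 - reward    # failure count

def ts_choose():
    # Sample a plausible mean for each arm; play the best sample
    samples = {arm: rng.betavariate(a, b) for arm, (a, b) in posterior.items()}
    return max(samples, key=samples.get)

# Over many draws, the arm with the better observed rate is chosen more often
picks = [ts_choose() for _ in range(1000)]
```

The randomness in the draw is what balances exploration and exploitation: an under-observed arm has a wide posterior and still wins some rounds.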
Available Neighborhood Policies:
- Clusters [6]
- K-Nearest [7, 8]
- LSH Nearest [9]
- Radius [7, 8]
- TreeBandit [10]
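Neighborhood policies share one idea: given a new context, restrict the history to nearby observations and run the learning policy on that subset. The sketch below shows a Radius-style neighborhood with a greedy learner inside; the data, radius, and function name are illustrative assumptions, not the library's code path.

```python
import math

# Logged (context, decision, reward) triples; 2-d contexts for illustration
history = [
    ([0.0, 0.0], 'Arm1', 10.0),
    ([0.1, 0.2], 'Arm2', 3.0),
    ([5.0, 5.0], 'Arm1', 1.0),
    ([5.1, 4.9], 'Arm2', 20.0),
]

def predict_radius(context, radius=1.0):
    # Keep only observations whose context lies within `radius` of the query
    neighbors = [
        (arm, reward) for ctx, arm, reward in history
        if math.dist(context, ctx) <= radius
    ]
    # Greedy learner on the neighborhood: highest mean reward wins
    totals, counts = {}, {}
    for arm, reward in neighbors:
        totals[arm] = totals.get(arm, 0.0) + reward
        counts[arm] = counts.get(arm, 0) + 1
    return max(totals, key=lambda a: totals[a] / counts[a])
```

The same decomposition explains the other neighborhood policies: K-Nearest and LSH Nearest change how the neighborhood is selected, while the inner learning policy is interchangeable.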
MABWiser is available on PyPI and can be installed with pip install mabwiser. It can also be built from source by following the instructions in the documentation.
Please submit bug reports and feature requests as Issues.
If you use MABWiser in a publication, please cite it as:
- [IJAIT 2021] E. Strong, B. Kleynhans, and S. Kadioglu, "MABWiser: Parallelizable Contextual Multi-Armed Bandits"
- [ICTAI 2019] E. Strong, B. Kleynhans, and S. Kadioglu, "MABWiser: A Parallelizable Contextual Multi-Armed Bandit Library for Python"
@article{DBLP:journals/ijait/StrongKK21,
author = {Emily Strong and Bernard Kleynhans and Serdar Kadioglu},
title = {{MABWiser:} Parallelizable Contextual Multi-armed Bandits},
journal = {Int. J. Artif. Intell. Tools},
volume = {30},
number = {4},
pages = {2150021:1--2150021:19},
year = {2021},
url = {https://doi.org/10.1142/S0218213021500214},
doi = {10.1142/S0218213021500214},
}
@inproceedings{DBLP:conf/ictai/StrongKK19,
author = {Emily Strong and Bernard Kleynhans and Serdar Kadioglu},
title = {MABWiser: {A} Parallelizable Contextual Multi-Armed Bandit Library for Python},
booktitle = {31st {IEEE} International Conference on Tools with Artificial Intelligence, {ICTAI} 2019, Portland, OR, USA, November 4-6, 2019},
pages = {909--914},
publisher = {{IEEE}},
year = {2019},
url = {https://doi.org/10.1109/ICTAI.2019.00129},
doi = {10.1109/ICTAI.2019.00129},
}
MABWiser is licensed under the Apache License 2.0.
References:
- [1] John Langford and Tong Zhang. The epoch-greedy algorithm for contextual multi-armed bandits.
- [2] Volodymyr Kuleshov and Doina Precup. Algorithms for multi-armed bandit problems.
- [3] Shipra Agrawal and Navin Goyal. Thompson sampling for contextual bandits with linear payoffs.
- [4] Wei Chu, Lihong Li, Lev Reyzin, and Robert Schapire. Contextual bandits with linear payoff functions.
- [5] Ian Osband, Daniel Russo, and Benjamin Van Roy. More efficient reinforcement learning via posterior sampling.
- [6] Trong T. Nguyen and Hady W. Lauw. Dynamic clustering of contextual multi-armed bandits.
- [7] Melody Y. Guan and Heinrich Jiang. Nonparametric stochastic contextual bandits.
- [8] Philippe Rigollet and Assaf Zeevi. Nonparametric bandits with covariates.
- [9] Piotr Indyk, Rajeev Motwani, Prabhakar Raghavan, and Santosh Vempala. Locality-preserving hashing in multidimensional spaces.
- [10] Adam N. Elmachtoub, Ryan McNellis, Sechan Oh, and Marek Petrik. A practical method for solving contextual bandit problems using decision trees.