multi-armed-bandits

There are 6 repositories under the multi-armed-bandits topic.

tensorflow / agents
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
reinforcement-learning tensorflow contextual-bandits bandits multi-armed-bandits tf-agents rl-algorithms dqn
Language:Python 2744
st-tech / zr-obp
Open Bandit Pipeline: a python library for bandit algorithms and off-policy evaluation
datasets off-policy-evaluation contextual-bandits multi-armed-bandits research
Language:Python 621
fidelity / mabwiser
[IJAIT 2021] MABWiser: Contextual Multi-Armed Bandits Library
multi-armed-bandits contextual-bandits parametric-bandits non-parametric-bandits machine-learning recsys
Language:Python 198
rlberry-py / rlberry
An easy-to-use reinforcement learning library for research and education.
reinforcement-learning reinforcement-learning-algorithms reinforcement-learning-environments multi-armed-bandits
Language:Python 159
fidelity / mab2rec
[AAAI 2024] Mab2Rec: Multi-Armed Bandits Recommender
multi-armed-bandits recommendation recsys
Language:Jupyter Notebook 103
antonismand / Personalized-News-Recommendation
Multi Armed Bandits implementation using the Yahoo! Front Page Today Module User Click Log Dataset
linucb multi-armed-bandits python recommendation-system reinforcement-learning
Language:Jupyter Notebook 95
Nth-iteration-labs / contextual
Contextual Bandits in R - simulation and evaluation of Multi-Armed Bandit Policies
contextual bandit simulation statistics multi-armed cmab contextual-bandits bandit-learning bandit-experiments reinforcement-learning reinforcement exploitation exploration evaluation machine-learning multi-armed-bandit contextual-bandit-policies cran offline-bandit multi-armed-bandits
Language:R 79
bayesianbandits / bayesianbandits
A Pythonic microframework for multi-armed bandit problems
bayesian-statistics multi-armed-bandits reinforcement-learning
Language:Python 72
stitchfix / mab
Library for multi-armed bandit selection strategies, including efficient deterministic implementations of Thompson sampling and epsilon-greedy.
multiarmed-bandits golang go experimentation data-science reinforcement-learning thompson-sampling multi-armed-bandit multi-armed-bandits thompson
Language:Go 48
kulinshah98 / Multi-Armed-Bandit-Algorithms
Python implementation of UCB, EXP3 and Epsilon greedy algorithms
multi-armed-bandits bandit-algorithms stochastic-bandit-algorithms upper-confidence-bounds epsilon-greedy adversarial-bandit-algorithms exp3-algorithm
Language:Python 27
wbwang2020 / MP-MAB
This project was created for the simulations of the paper: [Wang2021] Wenbo Wang, Amir Leshem, Dusit Niyato and Zhu Han, "Decentralized Learning for Channel Allocation in IoT Networks over Unlicensed Bandwidth as a Contextual Multi-player Multi-armed Bandit Game", to appear in IEEE Transactions on Wireless Communications, 2021.
machine-learning-algorithms multi-armed-bandits multi-agent-reinforcement-learning
Language:Python 24
ir-uam / kNNBandit
Software for the experiments reported in the RecSys 2019 paper "A Simple Multi-Armed Nearest-Neighbor Bandit for Interactive Recommendation"
recommender-systems reinforcement-learning multi-armed-bandits knn recsys2019
Language:Java 21
cfoh / Multi-Armed-Bandit-Example
Learning Multi-Armed Bandits by Examples. Currently covering MAB, UCB, Boltzmann Exploration, Thompson Sampling, Contextual MAB.
reinforcement-learning machine-learning multi-armed-bandits recommendation-system
Language:Python 19
Kenza-AI / mab-ranking
Online Ranking with Multi-Armed-Bandits
mab-ranking multi-armed-bandits online-learning-algorithms online-learning learning-to-rank online-learning-to-rank
Language:Python 18
nphdang / Bandit-BO
Bayesian Optimization for Categorical and Continuous Inputs
bayesian-optimization categorical-variables continuous-variable thompson-sampling gaussian-processes automated-machine-learning hyperparameter-optimization batch-bayesian-optimization acquisition-functions multi-armed-bandits machine-learning hyperparameter-tuning gpyopt hyperopt smac automl optimization
Language:Python 18
RonyAbecidan / Neural-Thompson-Sampling
Study of the paper 'Neural Thompson Sampling' published in October 2020
neural-network neural-tangent-kernel thompson-sampling neural-thompson-sampling multi-armed-bandits contextual-bandits non-linear-optimization
Language:Jupyter Notebook 18
xuedong / machine-learning-summer-schools
Curated materials for different machine learning related summer schools
machine-learning reinforcement-learning deep-learning statistics variational-inference optimization deep-reinforcement-learning optimal-transport automatic-differentiation probabilistic-graphical-models causal-inference generative-adversarial-network natural-language-processing kernel-methods multi-armed-bandits multi-agent gaussian-processes mcmc transfer-learning interpretability
Language:Jupyter Notebook 18
akshaykhadse / reinforcement-learning
Implementations of basic concepts dealt with under the Reinforcement Learning umbrella. This project is a collection of assignments in CS747: Foundations of Intelligent and Learning Agents (Autumn 2017) at IIT Bombay.
reinforcement-learning reinforcement-learning-excercises reinforcement-learning-analysis multi-armed-bandits multiarm-bandit markovian-epidemic-processes mdps ucb ucb1 kl-divergence epsilon-greedy thompson-sampling linear-programming howards-pi policy-iteration policy-evaluation batch-switching randomised-algorithms randomized-policy-iteration
Language:Python 17
tuyenta / IoT-MAB
Decentralized Intelligent Resource Allocation for LoRaWAN Networks
lorawan multi-armed-bandits resource-allocation
Language:Python 16
adik993 / reinforcement-learning-sutton
reinforcement-learning sutton-book cliffwalking gridworld sarsa q-learning dyna-q td-lambda random-walk bandit-algorithm multi-armed-bandits racecar
Language:Python 14
jtcho / FairMachineLearning
Implementation of provably Rawlsian fair ML algorithms for contextual bandits.
machine-learning contextual-bandits multi-armed-bandits python jupyter numpy upenn
Language:Jupyter Notebook 14
paulozip / beer-recommender-mab
A beer recommendation system using multi-armed bandit approach to solve cold start problems
multiarmed-bandits python recommendation-system multi-armed-bandits
Language:Python 12
rotationalio / honu
Adaptive consistency replication with reinforcement learning for large scale globally distributed storage.
distributed-storage distributed-systems gossip-protocol multi-armed-bandits adaptive-consistency
Language:Go 12
releaunifreiburg / DyHPO
[NeurIPS 2022] Supervising the Multi-Fidelity Race of Hyperparameter Configurations
convolutional-neural-networks deep-kernel-learning deep-learning gaussian-processes grey-box hpo hyperparameter-optimization multi-armed-bandits multi-fidelity neural-networks deep-kernel-gps neurips-2022 state-of-the-art
Language:Python 10
ardaegeunlu / X-armed-Bandits
Implementation of the X-armed Bandits algorithm, as detailed in the paper, "X-armed Bandits", Bubeck et al., 2011.
reinforcement-learning machine-learning-algorithms multi-armed-bandit multi-armed-bandits reinforcement-learning-algorithms
Language:Python 9
ardaegeunlu / Non-Stochastic-Bandit-Slate-Algorithms
Implementations of the bandit algorithms with unordered and ordered slates that are described in the paper "Non-Stochastic Bandit Slate Problems", by Kale et al. 2010.
machine-learning reinforcement-learning multi-armed-bandits multi-armed-bandit machine-learning-algorithms machine-learning-models
Language:Python 8
kkm24132 / ReinforcementLearning
Focuses on Reinforcement Learning related concepts, use cases, and learning approaches
reinforcement-learning exploration-exploitation multi-armed-bandits montecarlo temporal-difference-algorithms sarsa q-learning policy-gradient linear-function-approximation
Language:Jupyter Notebook 8
nphdang / turbo_bbo_neurips_2020
An improved version of Turbo algorithm for the Black-box optimization competition organized by NeurIPS 2020
bayesian-optimization thompson-sampling gaussian-processes automated-machine-learning hyperparameter-optimization batch-bayesian-optimization decay acquisition-functions multi-armed-bandits machine-learning hyperparameter-tuning classification turbo
Language:Python 8
thetawom / mabby
A multi-armed bandit (MAB) simulation library in Python
multi-armed-bandits probability python reinforcement-learning simulation agent-based-simulation artificial-intelligence epsilon-greedy thompson-sampling
Language:Python 8
Shahul-Rahman / SPGD-Search-Party-Gradient-Descent-algorithm
SPGD: Search Party Gradient Descent algorithm, a Simple Gradient-Based Parallel Algorithm for Bound-Constrained Optimization. Link: https://www.mdpi.com/2227-7390/10/5/800
metaheuristics multi-armed-bandits optimization optimization-algorithms reinforcement-learning blackbox-optimization python python-optimization bio-inspired-optimization nature-inspired-algorithms optimization-methods
Language:Jupyter Notebook 7
ardaegeunlu / Explore-Exploit-in-Top-N-Recommender-Systems-via-Gaussian-Processes
MATLAB Implementation of the CGPRANK algorithm
reinforcement-learning machine-learning multi-armed-bandits recommender-system gaussian-processes ranking-algorithm web-scale
Language:MATLAB 6
FlynnOwen / multi-armed-bandits
Multi-Armed Bandit method of accurately estimating the largest parameter out of a set of candidates.
machine multi-armed-bandit multi-armed-bandits python reinforcement-learning
Language:Python 6
banyikun / LOCB
Code for "Local Clustering in Contextual Multi-Armed Bandits".
multi-armed-bandits online-clustering recommender-system
Language:Python 5
jzsherlock4869 / reinforcement-learning-sutton-code
Implementations of methods from the book "Reinforcement Learning: An Introduction" by Sutton and Barto, using Python.
markov-decision-process multi-armed-bandits reinforcement-learning
Language:Python 5
nicoleorzan / Multi-armed-bandit-RL
C++ implementation of Multi-Armed Bandits (Gaussian and Bernoulli)
multi-armed-bandits reinforcement-learning softmax-policy bernoulli-bandit gaussian-bandit softmax ucb bandit-algorithms
Language:C++ 5
yubin1219 / multi_armed_bandits_recommendation_system
Reinforcement learning project using multi-armed bandits for recommendation system
multi-armed-bandits recommendation-system reinforcement-learning
Language:Jupyter Notebook 5
