contextual-bandits

There are 8 repositories under contextual-bandits topic.

VowpalWabbit / vowpal_wabbit
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
active-learning c-plus-plus contextual-bandits cpp learning-to-search machine-learning online-learning reinforcement-learning
Language:C++ 8414
tensorflow / agents
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
reinforcement-learning tensorflow contextual-bandits bandits multi-armed-bandits tf-agents rl-algorithms dqn
Language:Python 2742
david-cortes / contextualbandits
Python implementations of contextual bandits algorithms
contextual-bandits exploration-exploitation multiarmed-bandits reinforcement-learning
Language:Python 713
st-tech / zr-obp
Open Bandit Pipeline: a python library for bandit algorithms and off-policy evaluation
contextual-bandits datasets multi-armed-bandits off-policy-evaluation research
Language:Python 618
fidelity / mabwiser
[IJAIT 2021] MABWiser: Contextual Multi-Armed Bandits Library
contextual-bandits machine-learning multi-armed-bandits non-parametric-bandits parametric-bandits recsys
Language:Python 196
alison-carrera / onn
Online Deep Learning: Learning Deep Neural Networks on the Fly / Non-linear Contextual Bandit Algorithm (ONN_THS)
neural-network neural-architecture-search pytorch-implementation machine-learning-library thompson-sampling thompson-algorithm mab multiarmed-bandits contextual-bandits reinforcement-learning-algorithms reinforcement-learning pytorch pytorch-implemention
Language:Python 171
alison-carrera / mabalgs
:bust_in_silhouette: Multi-Armed Bandit Algorithms Library (MAB) :cop:
multi-armed-bandit mab arm reward thompson-sampling simulation ucb algorithm ranking-algorithm rank ranked-mab monte-carlo montecarlo-simulation contextual-bandits reinforcement-learning reinforcement-learning-algorithms
Language:Python 126
Nth-iteration-labs / contextual
Contextual Bandits in R - simulation and evaluation of Multi-Armed Bandit Policies
contextual bandit simulation statistics multi-armed cmab contextual-bandits bandit-learning bandit-experiments reinforcement-learning reinforcement exploitation exploration evaluation machine-learning multi-armed-bandit contextual-bandit-policies cran offline-bandit multi-armed-bandits
Language:R 79
banditml / banditml
A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
contextual-bandits pytorch personalization neural-networks bandits reinforcement-learning
Language:Python 64
instadeepai / catx
🐈‍⬛ Contextual bandits library for continuous action trees with smoothing in JAX
contextual-bandits jax deep-learning python
Language:Python 61
lil-lab / blocks
Blocks World -- Simulator, Code, and Models (Misra et al. EMNLP 2017)
natural-language-processing machine-learning natural-language-understanding reinforcement-learning contextual-bandits
Language:Python 40
pemami4911 / sinkhorn-policy-gradient.pytorch
Code accompanying the paper "Learning Permutations with Sinkhorn Policy Gradient"
deep-learning combinatorial-optimization permutation-algorithms reinforcement-learning contextual-bandits
Language:Python 38
Heewon-Hailey / multi-armed-bandits-for-recommendation-systems
implement basic and contextual MAB algorithms for recommendation system
python scikit-learn numpy matplotlib multiarmed-bandits recommendation-system epsilon-greedy upper-confidence-bounds contextual-bandits
Language:Jupyter Notebook 28
thunfischtoast / LinUCB
Contextual bandit algorithm called LinUCB / Linear Upper Confidence Bounds as proposed by Li, Langford and Schapire
java bandit-algorithm bandit-learning contextual-bandits linucb
Language:Java 28
doerlbh / MiniVox
Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".
acml bandit-algorithms contextual-bandits interspeech interspeech2020 online-learning online-speaker-diarization paper self-supervised-learning speaker-diarization speaker-recognition
Language:Cuda 25
mmalekzadeh / privacy-preserving-bandits
Privacy-Preserving Bandits (MLSys'20)
bandit-algorithms differential-privacy machine-learning online-machine-learning reinforcement-learning contextual-bandits privacy-preserving-machine-learning privacy-preserving-bandits criteo-dataset federated-learning recommender-system recommendation bandit-learning bandit-algorithm differentially-private
Language:Jupyter Notebook 22
improve-ai / python-ranker
Contextual Multi-Armed Bandit Platform for Scoring, Ranking & Decisions
improve-ai ab-testing ai contextual-bandits machine-learning multi-armed-bandit personalization python recommender-system xgboost reinforcement-learning multivariate-testing
Language:Python 21
RonyAbecidan / Neural-Thompson-Sampling
Study of the paper 'Neural Thompson Sampling' published in October 2020
neural-network neural-tangent-kernel thompson-sampling neural-thompson-sampling multi-armed-bandits contextual-bandits non-linear-optimization
Language:Jupyter Notebook 17
jtcho / FairMachineLearning
Implementation of provably Rawlsian fair ML algorithms for contextual bandits.
machine-learning contextual-bandits multi-armed-bandits python jupyter numpy upenn
Language:Jupyter Notebook 14
thoughtworks / simplebandit
lightweight contextual bandit library for ts/js
bandits contextual-bandits personalization recommendation-system recommender recommender-systems
Language:TypeScript 12
improve-ai / swift-ranker
Easily Score & Rank Codable Objects with ML
improve-ai ai personalization contextual-bandits ios machine-learning objective-c swift reinforcement-learning recommender-system xgboost multi-armed-bandit ab-testing multivariate-testing
Language:Swift 11
travisbrady / ocaml-vw
OCaml bindings to vowpal wabbit
machine-learning ocaml classification vowpal-wabbit reinforcement-learning contextual-bandits
Language:OCaml 10
marlesson / meta-bandit-selector
The Contextual Meta-Bandit (CMB) can be used to select models using the context with online learning based on Reiforcement Learning problem. It's can be used for recommender system ensemble, A/B test, and other dynamic model selector problem.
recsys contextual-bandits machine-learning metalearning
Language:Jupyter Notebook 9
sparsh-ai / reco-bandit
Building recommender Systems using contextual bandit methods to address cold-start issue and online real-time learning
recommender-system bandit-algorithms contextual-bandits
Language:Jupyter Notebook 9
hsm207 / cb-trading
Code to trade the financial markets using Contextual Bandits
trading-bot trading-strategies contextual-bandits reinforcement-learning
Language:Jupyter Notebook 8
Nth-iteration-labs / streamingbandit-ui
Client that handles the administration of StreamingBandit online, or straight from your desktop. Setup and run streaming (contextual) bandit experiments in your browser.
streamingbandit-client react bandit-learning bandit-algorithm contextual-bandits multiarm-bandit javascript client webapp machine-learning
Language:JavaScript 8
improve-ai / tracker-trainer
Contextual Multi-Armed Bandit Item/Reward Tracker & Model Trainer
improve-ai ab-testing ai aws aws-lambda contextual-bandits decision-trees multi-armed-bandit parquet python recommender-system reinforcement-learning serverless serverless-framework xgboost machine-learning ml personalization
Language:Python 7
zaid-g / ccb_tutorial
Contextual multi-armed bandit recommender system using Vowpal Wabbit
contextual-bandits recommender recommender-system reinforcement-learning tutorial
Language:Python 7
hitl-ab-bpm
aaronkurz / hitl-ab-bpm
Business Process Improvement with Reinforcement Learning and Human-in-the-Loop.
bpm bpmn bpi business-process webapp reinforcement-learning ab-testing contextual-bandits ab-tests
Language:Python 6
doerlbh / dilemmaRL
Code for our PRICAI 2022 paper: "Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior".
prisoner-dilemma reinforcement-learning contextual-bandits bandits multiagent-systems multiplayer-game game-theory behavioral-cloning human-behavior machine-learning
Language:Python 6
aldente0630 / multi_armed_bandit
Experiment results using MAB algorithms in Yahoo! Front Page Today Module User Click Log dataset
mab contextual-bandits striatum
Language:Jupyter Notebook 5
jackgerrits / reductionml
Reduction-based machine learning framework with a focus on contextual bandits
data-science machine-learning rust contextual-bandits online-learning
Language:Rust 5
Murtazali05 / LinUCB
LinUCB with disjoint linear models
linucb contextual-bandits
Language:Python 5
ngutowski / algossim
This repository aims at learning most popular MAB and CMAB algorithms and watch how they run. It is interesting for those wishing to start learning these topics.
artificial-intelligence-algorithms bandit-algorithms contextual-bandits recommendation-system
Language:Python 5
saeedghoorchian / NCC-Bandits
Experiments for paper "Online Learning with Costly Features in Non-stationary Environments"
concept-drift contextual-bandits costly-features multi-armed-bandit non-stationary-environment
Language:Jupyter Notebook 5
TheAmazingElys / NeuralBandit
Code of the NeuralBandit paper
contextual-bandits neural-networks reinforcement-learning
Language:Python 5

contextual-bandits

VowpalWabbit / vowpal_wabbit

tensorflow / agents

david-cortes / contextualbandits

st-tech / zr-obp

fidelity / mabwiser

alison-carrera / onn

alison-carrera / mabalgs

Nth-iteration-labs / contextual

banditml / banditml

instadeepai / catx

lil-lab / blocks

pemami4911 / sinkhorn-policy-gradient.pytorch

Heewon-Hailey / multi-armed-bandits-for-recommendation-systems

thunfischtoast / LinUCB

doerlbh / MiniVox

mmalekzadeh / privacy-preserving-bandits

improve-ai / python-ranker

RonyAbecidan / Neural-Thompson-Sampling

jtcho / FairMachineLearning

thoughtworks / simplebandit

improve-ai / swift-ranker

travisbrady / ocaml-vw

marlesson / meta-bandit-selector

sparsh-ai / reco-bandit

hsm207 / cb-trading

Nth-iteration-labs / streamingbandit-ui

improve-ai / tracker-trainer

zaid-g / ccb_tutorial

aaronkurz / hitl-ab-bpm

doerlbh / dilemmaRL

aldente0630 / multi_armed_bandit

jackgerrits / reductionml

Murtazali05 / LinUCB

ngutowski / algossim

saeedghoorchian / NCC-Bandits

TheAmazingElys / NeuralBandit