bandits

There are 36 repositories listed under the bandits topic.

tensorflow / agents
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
bandits contextual-bandits dqn multi-armed-bandits reinforcement-learning rl-algorithms tensorflow tf-agents
Language:Python 2727
yfletberliac / rlss-2019
Materials for the Practical Sessions of the Reinforcement Learning Summer School 2019: Bandits, RL & Deep RL (PyTorch).
tutorial education school reinforcement-learning bandits ipynb materials notebooks google-colab
Language:Jupyter Notebook 85
banditml / banditml
A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
contextual-bandits pytorch personalization neural-networks bandits reinforcement-learning
Language:Python 64
iheartradio / thomas
Another A/B test library
bayesian ab-testing bandit bandits bandit-algorithm bayesian-analysis scala functional-programming functional-reactive-programming public
Language:Scala 23
YRussac / WeightedLinearBandits
Code associated with the NeurIPS19 paper "Weighted Linear Bandits in Non-Stationary Environments"
bandits non-stationary-environment neurips-2019
Language:Jupyter Notebook 17
thoughtworks / simplebandit
Lightweight contextual bandit library for TypeScript/JavaScript
bandits contextual-bandits recommender personalization recommendation-system recommender-systems
Language:TypeScript 12
annieyan / Bandits-using-UCB-algorithm
Thompson Sampling for Bandits using UCB policy
reinforcement-learning ucb bandits thompson-sampling
Language:Python 10
babaniyi / Deep-contextual-bandits
A benchmark to test decision-making algorithms for contextual bandits. The library implements a variety of algorithms (many of them based on approximate Bayesian neural networks and Thompson sampling) and a number of real and synthetic data problems exhibiting a diverse set of properties.
bandit-algorithms bandits multiarmed-bandits
Language:Python 9
jayeshk7 / RL-Algorithms
Python implementation of common RL algorithms using OpenAI gym environments
reinforcement-learning tabular-q-learning policy-iteration value-iteration sarsa bandits
Language:Python 8
doerlbh / BanditZoo
Python library of bandits and RL agents in different real-world environments
bandit bandit-algorithms bandits reinforcement-learning simulation
Language:Python 7
doerlbh / dilemmaRL
Code for our PRICAI 2022 paper: "Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior".
prisoner-dilemma reinforcement-learning contextual-bandits bandits multiagent-systems multiplayer-game game-theory behavioral-cloning human-behavior machine-learning
Language:Python 6
doerlbh / ABaCoDE
Code for our ICDMW 2018 paper: "Contextual Bandit with Adaptive Feature Extraction".
bandits contextual-bandits feature-extraction icdm icdm2018 nonstationary reinforcement-learning representation-learning
Language:MATLAB 5
DURUII / Replica-AUCB
🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"
aution bandit-algorithms bandits cmab mab multi-armed-bandit aucb
Language:Python 5
kfoofw / applied_learning_articles
Collaborative project for documenting ML/DS learnings.
bandits causal-inference uplift-modelling
Language:Jupyter Notebook 5
Nicolivain / RLD
Deep Reinforcement Learning Agents in PyTorch in a modular framework
reinforcement-learning deep-reinforcement-learning bandits gym-environment pytorch
Language:Jupyter Notebook 5
anishacharya / Bandits-Online-Learning
Simple implementations of bandit algorithms in Python
online-learning online-learning-python online-learning-algorithms bandit bandit-algorithms bandit-learning bandits multi-armed-bandits
Language:Jupyter Notebook 4
doerlbh / BerlinUCB
Code for our AJCAI 2020 paper: "Online Semi-Supervised Learning in Contextual Bandits with Episodic Reward".
bandit bandits contextual-bandit contextual-bandits nonstationary-environments paper reinforcement-learning self-supervised-learning semi-supervised-learning
Language:MATLAB 4
TanguyUrvoy / pmlib
A Python library for (finite) Partial Monitoring algorithms
machine-learning multi-armed-bandits dueling-bandits bandits partial-monitoring feedexp3 rex3
Language:Jupyter Notebook 4
manome / python-mab
This project provides a simulation of multi-armed bandit problems. The implementation is based on the paper at https://arxiv.org/abs/2308.14350.
bandits multi-armed-bandits reinforcement-learning stochastic-bandit-algorithms stochastic-multi-armed-bandits survival-multi-armed-bandits
Language:Python 3
alxthm / rld-project
Play Rock, Paper, Scissors (Kaggle competition) with Reinforcement Learning: bandits, tabular Q-learning and PPO with LSTM.
rl rps-game q-learning ppo bandits
Language:Python 2
gurbaaz27 / amazon-hackathon
amazon hackathon recommendation-system bandits machine-learning
Language:Jupyter Notebook 2
Nicolivain / trustful-bandits
A two-armed bandit simulation and comparison with theoretical convergence
bandits bandit-algorithms stochastic-optimization stochastic-algorithms stochastic-algorithm online-optimization trading-agent asset-allocation reinforcement-learning
Language:Jupyter Notebook 2
Ralyhu / CMAB-CC
Code and data for the paper "A Combinatorial Multi-Armed Bandit Approach to Correlation Clustering", DAMI 2023
bandits clustering correlation-clustering
Language:Python 2
sarthakmittal92 / multi-armed-bandits
Repository for the course project done as part of CS-747 (Foundations of Intelligent & Learning Agents) course at IIT Bombay in Autumn 2022.
bandits multi-armed-bandits ucb kl-ucb python reinforcement-learning-algorithms thompson-sampling
Language:Python 2
ElianBelot / bernoulli-bandits
An exploration of multi-armed Bernoulli bandits in reinforcement learning, complete with experiments and observations.
reinforcement-learning bandits
Language:Jupyter Notebook 1
krishnaw14 / CS747-assignments
Foundations of Intelligent and Learning Agents
reinforcement-learning-algorithms markov-decision-processes bandits
Language:Python 1
MehranTaghian / prophet-inequlity-implementation
Implementation of the prophet inequalities
prophet-inequality k-prophet multi-armed-bandits bandits
Language:Python 1
Zaidtech / OverTheWire
This repo contains all the stuff I encountered while playing OverTheWire games.
overthewire bandits cybersecurity
1
AlxBouras / NeuralRandUCB
Project for the RL course @ Université Laval
bandits neural-bandits
Language:Python 0
BrianHung / random
random python notebooks (hopefully useful in future)
random bandits jupyter
Language:Jupyter Notebook 0
pappar-delle / AI-Labs-2022-23
TJHSST Artificial Intelligence Labs from the 2022-23 School Year with Dr. Gabor
bandits blocks codingbat-solutions crossword-puzzle othello regular-expression sliderpuzzle
Language:Python 0
riccardodv / COOP-learning
Implementation of the experiments for "Cooperative Online Learning with Feedback Graphs" Cesa-Bianchi, Cesari, Della Vecchia (https://arxiv.org/abs/2106.04982)
online-learning-algorithms bandits cooperation
Language:Python 0
rohilrg / Online-Learning-Bandits-Reinforcement-Learning
An assignment for the implementation of Online Learning, Bandits and Reinforcement Learning
online-passive-aggresive-algorithm bandits reinforcement-learning
Language:Jupyter Notebook 0
JoelJa835 / MAB_Algorithms
Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.
bandits e-greedy mab reinforcement-learning-algorithms ucb
Language:Python
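The two algorithms named in the description above can be sketched in a few lines of Python. This is a minimal illustration and not code from the JoelJa835 / MAB_Algorithms repository: the names `run_bandit`, `epsilon_greedy`, and `ucb1` are hypothetical, the arms are assumed to pay Bernoulli rewards, and the UCB variant shown is the standard UCB1 index (empirical mean plus `sqrt(2 ln t / n)`).

```python
import math
import random


def run_bandit(arm_means, steps, choose, seed=0):
    """Simulate a Bernoulli bandit; `choose(counts, values, t, rng)` picks an arm."""
    rng = random.Random(seed)
    counts = [0] * len(arm_means)    # number of pulls per arm
    values = [0.0] * len(arm_means)  # running mean reward per arm
    total = 0.0
    for t in range(1, steps + 1):
        arm = choose(counts, values, t, rng)
        # The reward distribution is unknown to the agent; only samples are seen.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total += reward
    return counts, values, total


def epsilon_greedy(epsilon):
    """Return a policy that explores with probability `epsilon`."""
    def choose(counts, values, t, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(counts))  # explore: uniform random arm
        return max(range(len(counts)), key=lambda a: values[a])  # exploit
    return choose


def ucb1(counts, values, t, rng):
    """UCB1: pull each arm once, then maximise mean plus exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )
```

Calling `run_bandit([0.1, 0.9], 2000, ucb1)` or `run_bandit([0.1, 0.9], 2000, epsilon_greedy(0.1))` returns the per-arm pull counts, the estimated mean rewards, and the total reward, which makes it easy to compare how quickly each policy concentrates on the better arm.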
philinemey / BSE-T3-RL
Coursework, Stochastic Models and Optimization, BSE, Term 3, Class of 2022
bandits bayesian-optimization dynamic-programming gaussian-processes policy-iteration reinforcement-learning
Language:Jupyter Notebook
XiaoMutt / ucbc
Stanford CS234 Course Side Project
bandits
Language:Python
