JCK-1096 / Bandit-and-Reinforcement-Learning

Python implementations of reinforcement learning algorithms: bandit algorithms, Markov decision processes (MDPs), dynamic programming (value iteration and policy iteration), and model-free control (off-policy Monte Carlo and Q-learning).
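To give a flavor of the dynamic programming part, here is a minimal sketch of value iteration on a toy two-state MDP. It is not taken from this repository: the function name `value_iteration`, the argument layout (`P[a][s][t]` for transition probabilities, `R[s][a]` for expected rewards), and the toy problem are all assumptions made for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Illustrative value iteration (hypothetical helper, not the repo's API).

    P[a][s][t] -- probability of moving from state s to state t under action a
    R[s][a]    -- expected immediate reward for taking action a in state s
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states

    def q_value(s, a, values):
        # One-step lookahead: immediate reward plus discounted next-state value.
        return R[s][a] + gamma * sum(
            P[a][s][t] * values[t] for t in range(n_states)
        )

    for _ in range(max_iter):
        # Bellman optimality backup applied to every state.
        V_new = [
            max(q_value(s, a, V) for a in range(n_actions))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [
        max(range(n_actions), key=lambda a: q_value(s, a, V))
        for s in range(n_states)
    ]
    return V, policy


# Toy MDP (assumed for illustration): action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
# Only staying in state 1 pays a reward of 1.
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]

V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy problem the greedy policy switches out of state 0 and then stays in state 1, since staying in state 1 is the only rewarded move.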
