touchtop / gt_rl_course

Resources and material for an internal course on Reinforcement Learning.

Course outline

Week -1: Course outline

Slides (NL): Getting a Python / conda virtual environment up and running

Week 0: Programming with Python

DataCamp course: Introduction to Python (including NumPy)

  • Object-oriented programming, a.k.a. classes

Follow this tutorial:

Extra info on Inheritance:

Week 1: Introduction to RL

Week 2: Multi-armed bandits

A bandit is an MDP with just one state. Example: pick an advertisement to show; the reward is a click. Example: pick a market; the reward is the number of units sold in that market.

  • Read the second chapter, "Multi-armed Bandits", of Sutton & Barto

  • Exercise: work through the OpenAI Gym tutorial

  • Exercise: Bandits_in_gym. Here we code up the simple bandit algorithm from p. 32 of Sutton & Barto, as well as the UCB variant.
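As a rough illustration of what the Bandits_in_gym exercise covers, the sketch below runs an epsilon-greedy agent with incremental sample-average updates, plus an optional UCB action rule. It is not the course notebook: the function name `run_bandit`, the Gaussian reward model with unit variance, and all parameter defaults are assumptions made for this example, and it uses only the standard library instead of Gym.

```python
import math
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, ucb_c=None, seed=0):
    """Run a k-armed bandit and return (value estimates, pull counts).

    If ucb_c is None, actions are chosen epsilon-greedily; otherwise the
    UCB rule with exploration constant ucb_c is used.
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k  # estimated value of each arm
    n = [0] * k    # number of times each arm was pulled

    for t in range(1, steps + 1):
        if ucb_c is None:
            if rng.random() < epsilon:
                a = rng.randrange(k)                   # explore
            else:
                a = max(range(k), key=lambda i: q[i])  # exploit
        else:
            untried = [i for i in range(k) if n[i] == 0]
            if untried:
                a = untried[0]                         # pull every arm once first
            else:
                a = max(
                    range(k),
                    key=lambda i: q[i] + ucb_c * math.sqrt(math.log(t) / n[i]),
                )

        # Assumed reward model: Gaussian noise around the arm's true mean.
        reward = rng.gauss(true_means[a], 1.0)
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]                 # incremental sample average

    return q, n


if __name__ == "__main__":
    estimates, counts = run_bandit([0.0, 0.5, 3.0])
    print("epsilon-greedy:", estimates, counts)
    estimates, counts = run_bandit([0.0, 0.5, 3.0], ucb_c=2.0)
    print("UCB:", estimates, counts)
```

The incremental update avoids storing past rewards: each estimate moves toward the latest reward by a step of 1/n, which equals the running mean.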

Week 3: Theory: Markov Decision Processes (MDPs)

Week 4: Dynamic Programming (DP)

Week 5: Monte Carlo (MC) control

  • Read selected paragraphs from Chapter 5

  • Exercise: Udacity notebook for solving the Blackjack environment using MC control.

Week 6: Q-learning

  • Read selected paragraphs from Chapter 6

  • Exercise: Udacity notebook on temporal-difference (TD) methods (CliffWalking environment).
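To make the Q-learning update concrete, here is a minimal tabular sketch. It does not use the CliffWalking environment from the Udacity notebook; instead it assumes a hypothetical toy corridor (states 0 to n_states - 1, reward -1 per step, episode ends at the rightmost state) so that it runs with the standard library only. The function name and all parameter defaults are illustrative choices.

```python
import random


def q_learning_corridor(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                        epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor.

    The agent starts in state 0. Action 0 moves left (bounded at state 0),
    action 1 moves right. Every step yields reward -1, and the episode ends
    when the rightmost state is reached.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy behaviour policy (ties go to action 0).
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[s][i])

            s_next = max(s - 1, 0) if a == 0 else s + 1
            reward = -1.0

            # Q-learning target: bootstrap from the best next action.
            # The goal is terminal, so its value is taken to be 0.
            best_next = 0.0 if s_next == goal else max(q[s_next])
            q[s][a] += alpha * (reward + gamma * best_next - q[s][a])
            s = s_next

    return q


if __name__ == "__main__":
    for state, values in enumerate(q_learning_corridor()):
        print(state, values)
```

Because the target uses the maximum over next actions rather than the action the agent actually takes, the method is off-policy: it learns the greedy policy's values while still exploring.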

Week 7: Economic application of Q-learning: algorithmic pricing

Week 8: Programming multi-agent RL using PettingZoo

License: MIT License


Languages

Jupyter Notebook 68.0%, Python 31.6%, Shell 0.3%