davide-belli / reinforcement-learning-labs-hws

Labs and Homeworks for Reinforcement Learning course, MSc AI @ UvA 2018/2019

License: MIT

Solutions and implementations by Davide Belli and Gabriele Cesa.
Total grade: 102.75%

Lab 1: Tabular Solution Methods (grade: 24.5/26)

Material covered in the lectures and in Sutton and Barto, chapters 2-7

Topics:

  • Policy Evaluation
  • Policy Iteration
  • Value Iteration
  • Monte Carlo Prediction
  • Monte Carlo Control
  • Temporal Difference learning
  • Q-learning
  • Sarsa

Environments:

  • GridWorld
  • Blackjack
  • WindyGridWorld
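
As a rough illustration of the tabular methods listed above, the following is a minimal sketch of value iteration on a small GridWorld. It is not taken from the lab notebooks: the 4x4 layout, the terminal corners, the reward of -1 per step, and the names `step` and `value_iteration` are all assumptions chosen for the example.

```python
# Hypothetical 4x4 grid: states 0..15, with terminal states in two corners.
N = 4
TERMINALS = {0, N * N - 1}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic transition; moves off the grid leave the state unchanged."""
    row, col = divmod(state, N)
    d_row, d_col = action
    new_row = min(max(row + d_row, 0), N - 1)
    new_col = min(max(col + d_col, 0), N - 1)
    return new_row * N + new_col, -1.0  # assumed reward of -1 on every step


def value_iteration(gamma=1.0, theta=1e-8):
    """Sweep the Bellman optimality backup in place until values stop changing."""
    values = [0.0] * (N * N)
    while True:
        delta = 0.0
        for s in range(N * N):
            if s in TERMINALS:
                continue  # terminal states keep value 0
            outcomes = (step(s, a) for a in ACTIONS)
            best = max(r + gamma * values[s2] for s2, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < theta:
            return values


if __name__ == "__main__":
    v = value_iteration()
    for row in range(N):
        print(v[row * N:(row + 1) * N])
```

Under these assumptions, each state's value is the negated number of steps to the nearest terminal corner, which makes the result easy to check by hand.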

Lab 2: Approximate Solution Methods (grade: 31/25)

Material covered in the lectures and in Sutton and Barto, chapters 9-13

Topics:

  • Deep Q-Network
  • Experience Replay
  • Semi-Gradient vs True Gradient
  • Policy Gradient
  • Policy Network
  • Monte Carlo REINFORCE
  • Actor-Critic
  • Deep Reinforcement Learning + REINFORCE
  • Deep Reinforcement Learning + Actor-Critic

Environments:

  • CartPole
  • CartPoleRaw (raw image observations)
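
To make the Experience Replay topic above more concrete, here is a minimal sketch of a fixed-capacity replay buffer of the kind commonly paired with a Deep Q-Network. It is not the lab's implementation: the class name `ReplayBuffer`, its methods, and the transition layout are assumptions for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-size buffer; the oldest transitions are dropped first."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition as a plain tuple.
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement; this breaks the temporal
        # correlation between consecutive transitions.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=3)
    for t in range(5):
        buffer.push(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
    print(len(buffer), buffer.sample(2))
```

A training loop would typically call `push` after every environment step and start calling `sample` once the buffer holds at least one batch of transitions.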

Homework 1 (grade: 19.5/20)

Topics:

  • Introduction to RL
  • Exploration
  • MDP
  • Dynamic Programming
  • Monte Carlo
  • Temporal Difference learning (theory)
  • Temporal Difference learning (application)
  • Maximization Bias
  • Model-based RL
  • Contraction Mapping
  • Banach's Fixed Point Theorem
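
For reference, the last two topics connect to dynamic programming through the standard textbook result sketched below. This is not reproduced from the homework; the notation follows Sutton and Barto, and the operator symbol $T$ is an assumption of this sketch.

```latex
% Bellman optimality operator on value functions v : S -> R
(T v)(s) = \max_{a} \sum_{s', r} p(s', r \mid s, a) \, \bigl[ r + \gamma \, v(s') \bigr]

% T is a \gamma-contraction in the sup norm for 0 \le \gamma < 1
\lVert T v - T u \rVert_{\infty} \le \gamma \, \lVert v - u \rVert_{\infty}

% Banach's fixed point theorem then gives a unique fixed point v_* = T v_*
% and geometric convergence of repeated application of T
\lVert T^{k} v - v_{*} \rVert_{\infty} \le \gamma^{k} \, \lVert v - v_{*} \rVert_{\infty}
```

The last inequality is what guarantees that value iteration converges from any initial value function when $\gamma < 1$.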

Homework 2 (grade: 18.5/20)

Topics:

  • Gradient Descent Methods
  • Basis Functions
  • Geometry of linear value-function approximation
  • Neural Networks in RL
  • REINFORCE
  • Compatible Function Approximation Theorem
  • Natural Gradient

Copyright

Copyright © 2019 Davide Belli.

This project is distributed under the MIT license. If you are a student, please follow the UvA regulations governing fraud and plagiarism.

Languages

  • Jupyter Notebook: 99.8%
  • Python: 0.2%