rl-study-group-supreme-octo-memory

Schedule

Week 1:
- Introduction
- MDP
- Dynamic Programming

Week 2:
- Monte Carlo Model-Free Prediction & Control
- Temporal Difference Model-Free Prediction & Control

Week 3:
- Function Approximation
- Deep Q Learning
- Policy Gradient Methods