This is a series of models I worked on while studying Reinforcement Learning.
Study resources:
- Sutton and Barto's Reinforcement Learning book
- David Silver lectures
- AI-Core Reinforcement Learning course
- Andrej Karpathy - Deep Reinforcement Learning
This is based on the classic DeepMind paper, which is accompanied by a very popular video of the Breakout game. This video was my biggest inspiration to study AI.
The model employs a DQN algorithm with two convolutional neural networks (an online network and a fixed Q-target network) to approximate the value function and learn to play the game. It also uses the Experience Replay technique.
The method used was TD (Temporal-Difference) Learning. A trained agent can be found in the folder agents.
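The two ingredients named above, Experience Replay and a TD target computed from a fixed Q-target network, can be sketched in plain Python. This is a minimal illustration, not the code used in this repository: the names `ReplayBuffer` and `td_target` and the default `gamma` are assumptions, and the Q-values of the target network are passed in as a plain list instead of coming from a real CNN.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step TD target: r + gamma * max_a' Q_target(s', a'), or just r if terminal.

    next_q_values: Q-values of the next state, taken from the frozen target network
    (hypothetical input; here it is simply a list of floats).
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full DQN loop, the online network would be regressed toward `td_target(...)` for each sampled transition, and the target network's weights would be copied from the online network only every so many steps.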
Implementation of a Deep Q-Network to solve the CartPole environment with a simple neural network, TD Learning, Q-targets and Experience Replay.
Q-Learning implementation to solve the CartPole environment with a simple neural network. A trained agent can be found in the folder agents.
Q-Learning implementation to solve the MountainCar environment with a simple neural network. A trained agent can be found in the folder agents.
Classical tabular Q-Learning was used to solve the Taxi-v3 environment. Because it is a simpler, discrete game, the classical method could easily solve it.
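For reference, a single tabular Q-Learning update can be sketched as follows. This is only an illustration under assumptions: the function name `q_learning_update` and the values of `alpha` and `gamma` are hypothetical, and the table is a `defaultdict` of lists rather than whatever structure the notebook actually uses.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-Learning update in place and return the TD error."""
    # Terminal states have no future value to bootstrap from.
    best_next = 0.0 if done else max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return td_error


n_actions = 6  # Taxi-v3 has 6 discrete actions
q_table = defaultdict(lambda: [0.0] * n_actions)
```

Because Taxi-v3 has only 500 discrete states, every state-action pair can be visited often enough for a table like this to converge without any function approximation.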
An interesting approach makes it possible to apply classical tabular Q-Learning to a more complex, continuous environment like MountainCar: the continuous state is discretized with the NumPy functions digitize and linspace. It also includes implementations of two TD control methods, SARSA and Q-Learning.
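The discretization idea and the difference between the two control methods can be sketched like this. The bin count `N_BINS` and all function names are assumptions made for illustration; the observation bounds are those of MountainCar-v0.

```python
import numpy as np

# MountainCar-v0 observation bounds: position in [-1.2, 0.6], velocity in [-0.07, 0.07].
N_BINS = 20  # hypothetical resolution, not taken from the original notebook

# Keep only the interior edges so np.digitize returns indices 0 .. N_BINS - 1.
position_edges = np.linspace(-1.2, 0.6, N_BINS + 1)[1:-1]
velocity_edges = np.linspace(-0.07, 0.07, N_BINS + 1)[1:-1]


def discretize(observation):
    """Map a continuous (position, velocity) pair to a pair of bin indices."""
    position, velocity = observation
    return (int(np.digitize(position, position_edges)),
            int(np.digitize(velocity, velocity_edges)))


def sarsa_target(q_table, reward, next_state, next_action, gamma=0.99):
    # On-policy: bootstrap from the action the agent will actually take next.
    return reward + gamma * q_table[next_state][next_action]


def q_learning_target(q_table, reward, next_state, gamma=0.99):
    # Off-policy: bootstrap from the greedy action in the next state.
    return reward + gamma * np.max(q_table[next_state])


# One table entry per (position bin, velocity bin, action); MountainCar-v0 has 3 actions.
q_table = np.zeros((N_BINS, N_BINS, 3))
```

The two targets differ only in which next-state value they bootstrap from, so the same discretized table can be trained with either method.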
Implementation of REINFORCE, a Monte Carlo policy-gradient algorithm, on the LunarLander-v2 environment. A trained agent can be found in the folder agents.
Implementation of REINFORCE, a Monte Carlo policy-gradient algorithm, on the Acrobot-v1 environment.
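Since REINFORCE is a Monte Carlo method, it waits until an episode ends and then weights each action's log-probability by the discounted return that followed it. A minimal sketch of that computation in plain Python, with hypothetical function names and plain floats standing in for the tensors a PyTorch implementation would use:

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every step t."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """Policy-gradient loss: minus the sum of log pi(a_t | s_t) * G_t over the episode."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

In practice the returns are often normalized (mean subtracted, divided by the standard deviation) before computing the loss to reduce variance; that step is omitted here.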
This is an implementation of the Actor-Critic algorithm, which roughly combines the strengths of Policy-Based and Value-Based approaches. It was implemented on the LunarLander-v2 environment. A trained agent can be found in the folder agents, and recordings can be found in the folder recordings.
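One common way to combine the two approaches is a one-step advantage actor-critic, where the critic's TD error serves as the advantage that scales the actor's update. The sketch below assumes that variant (the description above does not say which one the notebook uses), and `actor_critic_losses` is a hypothetical name operating on plain floats.

```python
def actor_critic_losses(log_prob, value, reward, next_value, done, gamma=0.99):
    """Return (actor_loss, critic_loss) for a single transition.

    log_prob:   log pi(a | s) of the action taken (policy-based part)
    value:      critic estimate V(s) (value-based part)
    next_value: critic estimate V(s'), ignored when the episode has ended
    """
    target = reward + (0.0 if done else gamma * next_value)
    advantage = target - value  # one-step TD error
    # The actor is pushed toward actions with positive advantage; in an autograd
    # framework the advantage would be treated as a constant here.
    actor_loss = -log_prob * advantage
    # The critic is regressed toward the bootstrapped target.
    critic_loss = advantage ** 2
    return actor_loss, critic_loss
```

Compared with REINFORCE, the critic's estimate replaces the full Monte Carlo return, so updates can happen at every step rather than only at the end of an episode.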
Playing a Gym Atari game in a Jupyter notebook
- conda 4.8.2
- python 3.7.4
- pytorch 1.4.0
- open-ai gym 0.17.1