akashe / DeepReinforcementLearning

Deep RL implementations: DQN, SAC, DDPG, TD3, PPO, and VPG implemented in PyTorch. Tested environments: LunarLander-v2 and Pendulum-v0.

Home Page: https://akashe.io/blog/2020/10/14/policy-gradient-methods/

Deep RL algorithms implemented using PyTorch

Algo list:

  1. DQN
  2. Vanilla Policy Gradient (VPG)
  3. Deep Deterministic Policy Gradient (DDPG)
  4. Twin Delayed Deep Deterministic Policy Gradient (TD3)
  5. Soft Actor Critic (SAC)
  6. Proximal Policy Optimization - CLIP (PPO)

Article: a deeper look into policy gradients
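
As an illustration of the last entry in the list, here is a minimal sketch of the clipped surrogate objective that gives PPO-CLIP its name. It is not taken from this repository: it uses plain Python instead of PyTorch so it runs without dependencies, and the function names `ppo_clip_objective` and `ppo_clip_loss`, as well as the default `clip_eps=0.2`, are assumptions made for the example.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantage, clip_eps=0.2):
    """Per-sample PPO-Clip surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration; not the repository's code.
    """
    # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    ratio = math.exp(logp_new - logp_old)
    # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum removes the incentive to push the ratio
    # outside the clip range in the direction that favors the advantage.
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_clip_loss(logps_new, logps_old, advantages, clip_eps=0.2):
    """Batch loss: negative mean of the per-sample objectives."""
    objectives = [
        ppo_clip_objective(n, o, a, clip_eps)
        for n, o, a in zip(logps_new, logps_old, advantages)
    ]
    return -sum(objectives) / len(objectives)
```

In a PyTorch version the same computation would typically operate on tensors with `torch.exp`, `torch.clamp`, and `torch.min`, so that the loss can be backpropagated through the policy network.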

Experimental Results:

Algorithm   Discrete Env: LunarLander-v2   Continuous Env: Pendulum-v0
DQN         LunarLander-DQN                -
VPG         LunarLander-VPG                -
DDPG        -                              Pendulum-DDPG
TD3         -                              Pendulum-TD3
SAC         -                              Pendulum-SAC
PPO         -                              Pendulum-PPO

Usage:

Run each algorithm's file directly. The algorithms share no common structure because I implemented each one as I learned it, and different algorithms are inspired by different sources.

Resources:

  1. RL course by David Silver
  2. Lecture slides for above course
  3. Spinning up by OpenAI
  4. More exhaustive RL guide by Denny Britz

Future projects:

  1. If time permits, I will add a simple elevator-control program using RL.
  2. Better graphs

Languages

Python 74.6%, Jupyter Notebook 25.4%