pytorch algorithm reinforcement-learning dqn ddpg actor-critic policy-gradients a2c a3c sac td3 double-dqn dueling-dqn sarsa trpo

Deep-RL-with-pytorch

Practice implementations of deep reinforcement learning algorithms, written by a beginner.
The test environments are Gym CartPole-v0 for discrete action spaces and Gym Pendulum-v0 for continuous action spaces.
Under active development.
Includes: DQN, REINFORCE, baseline-REINFORCE, Actor-Critic, Double DQN, Dueling DQN, Sarsa, DDPG, DDPG for discrete action space, A2C, A3C, TD3, SAC, TRPO

2020-9-19 implement

algorithm:

1.DQN
2.REINFORCE

components:

1.experience replay
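
As an illustration of the experience replay component, here is a minimal sketch of a uniform replay buffer. The names `Transition` and `ReplayBuffer` and the field layout are assumptions made for this example and may differ from the code in this repository.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the field names are illustrative only.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size FIFO buffer that returns uniformly sampled minibatches."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement, then regroup the batch field by field.
        batch = random.sample(self.memory, batch_size)
        return Transition(*zip(*batch))

    def __len__(self):
        return len(self.memory)
```

Sampling uniformly from past transitions breaks the correlation between consecutive steps, which is the usual reason DQN-style agents train from a buffer rather than from the latest transition.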

2020-9-20 implement

algorithm:

1.baseline-REINFORCE
2.Actor-Critic

Added CUDA support.

2021-1-15 implement

algorithm:

1.Double DQN
2.Dueling DQN
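
The two DQN variants can be sketched as follows. This is a hedged illustration, not the repository's code: `double_dqn_target`, `DuelingQNet`, the hidden size, and the default `gamma` are all assumed names and values. Double DQN lets the online network choose the next action while the target network evaluates it; Dueling DQN splits the Q-function into a state value and a mean-centred advantage.

```python
import torch
import torch.nn as nn


def double_dqn_target(online_net, target_net, reward, next_state, done, gamma=0.99):
    """Compute r + gamma * Q_target(s', argmax_a Q_online(s', a)), masked at terminals."""
    with torch.no_grad():
        # The online network selects the greedy next action ...
        next_action = online_net(next_state).argmax(dim=1, keepdim=True)
        # ... and the target network evaluates that action.
        next_q = target_net(next_state).gather(1, next_action).squeeze(1)
    return reward + gamma * (1.0 - done) * next_q


class DuelingQNet(nn.Module):
    """Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)
        self.advantage = nn.Linear(hidden, n_actions)

    def forward(self, state):
        h = self.feature(state)
        v = self.value(h)
        a = self.advantage(h)
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + a - a.mean(dim=1, keepdim=True)
```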

2021-1-19 implement

algorithm:

1.Sarsa

2021-1-23 implement

algorithm:

1.DDPG
2.DDPG for discrete action space using Gumbel-softmax
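
To show how Gumbel-softmax lets a deterministic-policy-gradient method handle discrete actions, here is a minimal sketch. The helper name `sample_discrete_action` and the temperature value are assumptions for illustration; the repository's actor may be structured differently. `torch.nn.functional.gumbel_softmax` with `hard=True` returns a one-hot sample in the forward pass while passing gradients through the soft relaxation.

```python
import torch
import torch.nn.functional as F


def sample_discrete_action(logits, tau=1.0):
    """Draw a one-hot action that stays differentiable w.r.t. the actor's logits."""
    # hard=True: one-hot forward pass, straight-through gradient in backward.
    return F.gumbel_softmax(logits, tau=tau, hard=True, dim=-1)


# Usage: the critic can consume the one-hot action and still
# backpropagate into the actor's logits.
logits = torch.tensor([[2.0, 0.5, -1.0]], requires_grad=True)
action = sample_discrete_action(logits)
fake_q = (action * torch.tensor([[1.0, 2.0, 3.0]])).sum()  # stand-in for a critic
fake_q.backward()
```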

2021-1-26 implement

algorithm:

1.A2C

2021-1-27 implement

algorithm:

1.A3C

2021-2-4 implement

algorithm:

1.TD3
2.SAC

2021-2-25 implement

algorithm:

1.TRPO (natural policy gradient).
Known issue, cause unknown: the Hessian matrix may not be positive definite at the beginning of training (training usually still converges).
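
One mitigation often used in TRPO implementations for a Hessian that is not positive definite is to add a small damping term before running conjugate gradient, so the solver works on H + damping * I instead of H. The sketch below only illustrates that idea; it is not a confirmed fix for the issue above, and `conjugate_gradient`, `hvp`, and the default `damping` are assumed names and values.

```python
import torch


def conjugate_gradient(hvp, b, iters=10, damping=0.1, tol=1e-10):
    """Approximately solve (H + damping * I) x = b given only the map v -> H v."""
    x = torch.zeros_like(b)
    r = b.clone()  # residual b - A x, with x initialised to zero
    p = r.clone()  # current search direction
    rs_old = torch.dot(r, r)
    for _ in range(iters):
        # Damped Hessian-vector product.
        Ap = hvp(p) + damping * p
        alpha = rs_old / torch.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = torch.dot(r, r)
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x
```

Damping shifts every eigenvalue of H up by the same amount, so it makes a positive semi-definite matrix positive definite; a strongly indefinite H would need a larger value, at the cost of a more conservative step.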

About

Basic reinforcement learning algorithms. Includes: DQN, Double DQN, Dueling DQN, SARSA, REINFORCE, baseline-REINFORCE, Actor-Critic, DDPG, DDPG for discrete action space, A2C, A3C, TD3, SAC, TRPO


Languages

Language:Python 100.0%