# Clean, Robust, and Unified Implementation of Classical Deep Reinforcement Learning Algorithms
![DRL](https://camo.githubusercontent.com/9ceec6c2d2cac335d15d84c09891e32e84c1eebe131c7df9c627ffeeee7f2234/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f44524c2d626c756576696f6c6574)
## Recommended Resources for DRL
- **DQN**: Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
- **Double DQN**: Van Hasselt H, Guez A, Silver D. Deep reinforcement learning with double Q-learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2016, 30(1).
- **PER**: Schaul T, Quan J, Antonoglou I, et al. Prioritized experience replay[J]. arXiv preprint arXiv:1511.05952, 2015.
- **PPO**: Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms[J]. arXiv preprint arXiv:1707.06347, 2017.
- **DDPG**: Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning[J]. arXiv preprint arXiv:1509.02971, 2015.
- **TD3**: Fujimoto S, Van Hoof H, Meger D. Addressing function approximation error in actor-critic methods[C]//International Conference on Machine Learning. PMLR, 2018: 1587-1596.
- **SAC**: Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]//International Conference on Machine Learning. PMLR, 2018: 1861-1870.
- **ASL**: Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity
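As a quick orientation to the first two entries above, here is a minimal sketch (not taken from this repository's code) of how the Double DQN target differs from vanilla DQN: the online network *selects* the next action while the target network *evaluates* it, which reduces the overestimation bias analyzed in the Van Hasselt et al. paper. Function and variable names are illustrative only.

```python
import numpy as np

def double_dqn_target(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute y = r + gamma * Q_target(s', argmax_a Q_online(s', a)) per transition.

    rewards, dones:            shape (batch,)
    q_online_next, q_target_next: shape (batch, n_actions)
    """
    a_star = np.argmax(q_online_next, axis=1)               # action selection: online net
    q_eval = q_target_next[np.arange(len(a_star)), a_star]  # action evaluation: target net
    return rewards + gamma * (1.0 - dones) * q_eval         # terminal states get no bootstrap

# Toy batch of two transitions (the second is terminal)
r = np.array([1.0, 0.0])
d = np.array([0.0, 1.0])
q_on = np.array([[0.2, 0.8], [0.5, 0.1]])
q_tg = np.array([[0.3, 0.6], [0.4, 0.2]])
y = double_dqn_target(r, d, q_on, q_tg)  # -> [1 + 0.99 * 0.6, 0.0]
```

Vanilla DQN would instead use `np.max(q_target_next, axis=1)` for both selection and evaluation, which is the source of the overestimation that Double DQN corrects.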
## Training Curves of my Code
![](https://github.com/XinJingHao/Q-learning/raw/main/result.svg?raw=true)
![](https://github.com/XinJingHao/DQN-DDQN-Pytorch/raw/main/IMGs/DQN_DDQN_result.png?raw=true)
| Pong | Enduro |
| :---: | :---: |
| ![](https://github.com/XinJingHao/DQN-DDQN-Atari-Pytorch/raw/main/IMGs/Pong.png?raw=true) | ![](https://github.com/XinJingHao/DQN-DDQN-Atari-Pytorch/raw/main/IMGs/Enduro.png?raw=true) |
| CartPole | LunarLander |
| :---: | :---: |
| ![](https://github.com/XinJingHao/Prioritized-DQN-DDQN-Pytorch/raw/main/LightPriorDQN_gym0.2x/IMGs/CPV1.svg?raw=true) | ![](https://github.com/XinJingHao/Prioritized-DQN-DDQN-Pytorch/raw/main/LightPriorDQN_gym0.2x/IMGs/LLDV2.svg?raw=true) |
![](https://github.com/XinJingHao/PPO-Discrete-Pytorch/raw/main/result.jpg?raw=true)
![](https://github.com/XinJingHao/PPO-Continuous-Pytorch/raw/main/ppo_result.jpg?raw=true)
| Pendulum | LunarLanderContinuous |
| :---: | :---: |
| ![](https://github.com/XinJingHao/DDPG-Pytorch/raw/main/IMGs/ddpg_pv0.svg?raw=true) | ![](https://github.com/XinJingHao/DDPG-Pytorch/raw/main/IMGs/ddpg_lld.svg?raw=true) |
![](https://github.com/XinJingHao/TD3-Pytorch/raw/main/images/TD3results.png?raw=true)
![](https://github.com/XinJingHao/SAC-Continuous-Pytorch/raw/main/imgs/result.jpg?raw=true)
![](https://github.com/XinJingHao/SAC-Discrete-Pytorch/raw/main/imgs/sacd_result.jpg?raw=true)