Khev / RL-practice-keras

Deep Reinforcement Learning

Here I implement various RL algorithms in Python 2.7. I use Keras for the neural nets and the OpenAI Gym to test the algorithms. The methods are listed below; they divide roughly into two categories, value based and policy based, with multi-agent and other methods listed separately.

I took or adapted code from various online sources, which I list non-exhaustively below (and in the code itself).

Value based methods

Q-learning (tabular)
Deep Q-Network (DQN)
Double DQN (DDQN)
DQN with prioritised replay
Distributional Bellman
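
As an illustration of the simplest method in this list, here is a minimal sketch of tabular Q-learning. It is not taken from the repository: the four-state chain environment, the `step` and `train` helpers, and the hyperparameters are all hypothetical, chosen so the example runs with the standard library alone (no Keras or Gym). The behaviour policy is uniformly random, which is enough here because Q-learning is off-policy.

```python
import random

N_STATES = 4          # hypothetical chain: states 0..3, state 3 is terminal
LEFT, RIGHT = 0, 1


def step(state, action):
    """Toy deterministic environment: reward 1.0 only on reaching the last state."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning_update(q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    # No bootstrapping from a terminal state.
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def train(episodes=300, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.randrange(2)  # uniformly random behaviour policy
            s_next, r, done = step(s, a)
            q_learning_update(q, s, a, r, s_next, done)
            s = s_next
            if done:
                break
    return q


q_table = train()
# Greedy action in each non-terminal state.
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
```

In this toy problem the greedy policy read off the table should move right in every non-terminal state. DQN replaces the table `q` with a neural network but keeps the same bootstrapped target.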

Policy based methods

Policy gradient: REINFORCE, with and without a baseline
Actor critic (A2C)
Deep Deterministic Policy Gradient (DDPG)
Proximal policy optimization (PPO)
Soft Actor-Critic (soft AC)
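
To make the REINFORCE entry above concrete, here is a minimal sketch of the quantity that weights the log-probability gradients: the discounted return from each time step, optionally with a baseline subtracted. The function names are hypothetical, and the mean-return baseline is an assumption made for brevity; the code in this repository may use a different baseline, such as a learned value function.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for each t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Accumulate from the end of the episode backwards.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages_with_mean_baseline(returns):
    """Subtract the mean return (an assumed, very simple baseline)."""
    baseline = sum(returns) / float(len(returns))
    return [g - baseline for g in returns]


episode_rewards = [1.0, 1.0, 1.0]
returns = discounted_returns(episode_rewards, gamma=0.5)
advantages = advantages_with_mean_baseline(returns)
```

Subtracting a baseline that does not depend on the action leaves the expected gradient unchanged while typically reducing its variance.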

Multi-agent

Multi-agent deep deterministic policy gradient (MADDPG)
Actor-Attention-Critic (AAC)
Value Decomposition Networks (VDN)
QMIX

Others

Explore-and-go
Curiosity driven learning (CDL)
Rainbow (RB)

Resources

Papers

Q-learning
DQN
Dueling DQN
DQN with prioritized replay
Distributional Bellman
PPO
Soft AC
DDPG
MADDPG
AAC
CDL
RB
EG
MARL review article
QMIX
VDN

Blogs

Arthur Juliani
yanpanlau
Gumbel softmax trick
review_blog
TRPO intro

Textbooks

Sutton

Acknowledgements

@germain-hug
@Keras-RL
@keon

Languages

Jupyter Notebook 94.8%, Python 5.2%