jcwleo / Reinforcement_Learning

Implementations of basic reinforcement learning algorithms

Reinforcement Learning

Reinforcement learning examples applied to a variety of environments (currently being ported to PyTorch)

[Image: Breakout, using DQN (Nature 2015)]

1. Q-Learning / SARSA

2. Q-Network (Action-Value Function Approximation)

3. DQN

DQN (NIPS 2013) uses Experience Replay Memory and a CNN.

DQN (Nature 2015) uses Experience Replay Memory, a Target Network, and a CNN.
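
As a rough illustration of the two non-CNN pieces named above, here is a minimal, standard-library-only sketch of an experience replay memory and a hard target-network update. It is not taken from this repository's code: `Transition`, `ReplayMemory`, `sync_target`, and `td_target` are hypothetical names, and plain dicts of lists stand in for real network parameters.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the field names are illustrative only.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayMemory:
    """Fixed-size buffer that stores transitions and samples them uniformly."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement decorrelates consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def sync_target(online_params, target_params):
    """Hard update: copy the online parameters into the target parameters."""
    target_params.clear()
    target_params.update(
        {name: list(values) for name, values in online_params.items()}
    )


def td_target(reward, next_q_values, done, gamma=0.99):
    """Return r for terminal steps, else r + gamma * max over next Q-values."""
    return reward if done else reward + gamma * max(next_q_values)
```

In the Nature 2015 setup, `next_q_values` would come from the target network, which is refreshed with something like `sync_target` only every fixed number of steps. The NIPS 2013 variant has no separate target network, so the same values would come from the online network.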

5. Vanilla Policy Gradient(REINFORCE)

6. Advantage Actor Critic

7. Deep Deterministic Policy Gradient

8. Parallel Advantage Actor Critic (called 'A2C' by OpenAI)

9. C51(Distributional RL)

10. PPO(Proximal Policy Optimization)

Languages

Python 94.0%, Jupyter Notebook 6.0%