deep-reinforcement-learning pytorch ddqn deep-q-learning

Double Deep Q Learning (DDQN) In PyTorch

DDQN implementation on the PLE FlappyBird environment in PyTorch.

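DDQN was proposed to address the overestimation issue of Deep Q-Learning (DQN). It decouples action selection from value evaluation: the online (policy) network selects the greedy next action and a separate target network evaluates it, which reduces the correlation between action selection and value evaluation.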
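
The difference between the two targets can be shown with a small, framework-free sketch. It assumes the usual Double DQN formulation, y = r + gamma * Q_target(s', argmax_a Q_online(s', a)), and uses made-up Q-values for two hypothetical actions; the function names are illustrative and are not taken from this repository.

```python
from typing import Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def dqn_target(reward: float, done: bool,
               q_target_next: Sequence[float], gamma: float = 0.99) -> float:
    """Vanilla DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def ddqn_target(reward: float, done: bool,
                q_online_next: Sequence[float],
                q_target_next: Sequence[float], gamma: float = 0.99) -> float:
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = argmax(q_online_next)
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state Q-values for two actions.
q_online_next = [1.0, 2.0]  # online network prefers action 1
q_target_next = [3.0, 0.5]  # target network happens to overrate action 0

y_dqn = dqn_target(1.0, False, q_target_next, gamma=0.9)
y_ddqn = ddqn_target(1.0, False, q_online_next, q_target_next, gamma=0.9)
print(y_dqn, y_ddqn)
```

With these numbers the vanilla target takes the overrated value 3.0, while the Double DQN target uses the target network's estimate for the action chosen by the online network, so the inflated estimate is not propagated.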

Requirements

Python 3.6
PyTorch
Visdom
PLE (PyGame-Learning-Environment)
MoviePy

Algorithm

In this implementation, I update the policy network once per episode e, not once per step t.
Input images are simplified for faster convergence.
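
The per-episode update schedule can be sketched as follows. This is a minimal illustration, not the code in main.py: ToyEnv, train, and update_fn are hypothetical names, the environment is a stand-in that ends every episode after a fixed number of steps, and actions are chosen at random instead of by a network.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical stand-in for the game: an episode ends after a fixed number of steps."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the "state" is just the step counter

    def step(self, action):
        self.t += 1
        reward = 1.0
        done = self.t >= self.episode_length
        return self.t, reward, done


def train(env, num_episodes, update_fn, buffer_size=1000, seed=0):
    """Collect transitions every step, but call update_fn only once per episode."""
    rng = random.Random(seed)
    replay = deque(maxlen=buffer_size)
    updates = 0
    for _ in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            action = rng.randrange(2)  # random policy, for illustration only
            next_state, reward, done = env.step(action)
            replay.append((state, action, reward, next_state, done))
            state = next_state
            # A per-step variant would call update_fn(replay) here instead.
        update_fn(replay)  # per-episode update
        updates += 1
    return updates, len(replay)


calls = []
updates, stored = train(ToyEnv(episode_length=5), num_episodes=3,
                        update_fn=lambda buf: calls.append(len(buf)))
print(updates, stored, calls)
```

Here update_fn only records the buffer size at each call; in a real agent it would sample a batch from the buffer and take a gradient step.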

Usage

Hyperparameters are defined in config.py.
Train

python main.py --train=True --video_path=./video --logs_path=./logs

Restore Pretrained Model

python main.py --restore=./pretrain/model-98500.pth
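
For readers adapting the commands above, here is one way flags of this shape could be parsed with argparse. It is a sketch under assumptions: the flag names are copied from the commands, but the defaults, the str2bool helper, and build_parser are hypothetical and may differ from what main.py actually does. The helper exists because argparse's type=bool would treat the string "False" as True.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse 'True'/'False'-style strings into a real boolean."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Defaults below are assumptions chosen for illustration.
    parser = argparse.ArgumentParser(description="DDQN on FlappyBird (hypothetical CLI sketch)")
    parser.add_argument("--train", type=str2bool, default=False)
    parser.add_argument("--video_path", type=str, default="./video")
    parser.add_argument("--logs_path", type=str, default="./logs")
    parser.add_argument("--restore", type=str, default=None)
    return parser


train_args = build_parser().parse_args(
    ["--train=True", "--video_path=./video", "--logs_path=./logs"])
restore_args = build_parser().parse_args(["--restore=./pretrain/model-98500.pth"])
print(train_args.train, restore_args.restore)
```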

Visualize loss and reward curves

python -m visdom.server
python visualize.py --logs_path=./logs

Result

Full Video (with 60 FPS)
Reward

Reference
