AlessandroRestagno / d4pg

Asynchronous distributional DDPG

d4pg

This is a simple implementation of D4PG.

It combines a distributional value function with asynchronous training.

I use QR-DQN's value distribution with an MSE loss in place of C51's distribution. More information about QR-DQN is in my other repository, c51-qr-dqn.
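
As a rough illustration of the idea above, here is a minimal sketch of a quantile-based critic target with an MSE loss. It is not taken from this repository: the function names, the element-wise pairing of predicted and target quantiles, and the default `gamma` are all assumptions made for the example. (QR-DQN itself uses a quantile Huber loss; the sketch only shows the MSE variant described here.)

```python
def quantile_td_target(reward, next_quantiles, gamma=0.9, done=False):
    """Bellman backup applied to every quantile of the next-state value.

    Hypothetical helper: `next_quantiles` stands for the target critic's
    quantile estimates at (next_state, target_actor(next_state)).
    """
    if done:
        # No bootstrapping on terminal transitions.
        return [float(reward) for _ in next_quantiles]
    return [reward + gamma * q for q in next_quantiles]


def quantile_mse_loss(pred_quantiles, target_quantiles):
    """Mean squared error between matching quantiles (assumed pairing)."""
    if len(pred_quantiles) != len(target_quantiles):
        raise ValueError("prediction and target need the same number of quantiles")
    n = len(pred_quantiles)
    return sum((p - t) ** 2 for p, t in zip(pred_quantiles, target_quantiles)) / n


def expected_q(quantiles):
    """Scalar Q-value: the mean of the quantile estimates."""
    return sum(quantiles) / len(quantiles)


target = quantile_td_target(1.0, [0.0, 1.0, 2.0], gamma=0.5)
loss = quantile_mse_loss([1.0, 1.5, 3.0], target)
```

In a DDPG-style setup, the actor would typically be updated to maximize something like `expected_q` of the critic's output, while the critic minimizes the loss above.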

The original code is from morvan.

To do

Prioritized experience replay and n-step return. After several experiments on Pendulum, I think multi-step return may not be better than one-step return.
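
For reference, the n-step return mentioned above can be sketched as follows. This is a generic illustration, not code from this repository; the function name, `bootstrap_value`, and the default `gamma` are assumptions.

```python
def n_step_return(rewards, bootstrap_value, gamma=0.9):
    """Discounted sum of n rewards plus a discounted bootstrap value.

    rewards:          [r_t, r_{t+1}, ..., r_{t+n-1}]
    bootstrap_value:  critic estimate at the state reached after n steps
    """
    g = bootstrap_value
    # Fold from the last reward backwards: g = r + gamma * g.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


one_step = n_step_return([1.0], 10.0, gamma=0.5)
three_step = n_step_return([1.0, 1.0, 1.0], 10.0, gamma=0.5)
```

With a single reward the function reduces to the usual one-step target, so the two variants can be compared by changing only the length of `rewards`.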

paper

Distributed Distributional Deterministic Policy Gradients


Languages

Language:Python 100.0%