PPO Continuous Action Space
raunakdoesdev opened this issue
What changes would be required to use your PPO implementation in a continuous action space such as Pendulum-v0?
This is a late reply, but I wrote a continuous-action PPO version in a similar code style and sent a pull request. It doesn't perform well yet, but check it out.
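For anyone reading along, here is a rough sketch of the kind of change the question is asking about. It is not the code from the pull request, just a minimal illustration that assumes a Gaussian policy: the network would output a mean `mu` and a standard deviation `std` instead of a softmax over discrete actions, and the PPO ratio would be computed from Gaussian log-probabilities. All function names, the `eps_clip` default, and the clamping helper are hypothetical; the only environment fact relied on is that Pendulum-v0 accepts a torque in [-2, 2].

```python
import math
import random


def gaussian_log_prob(action, mu, std):
    """Log density of `action` under a 1-D Gaussian N(mu, std^2)."""
    var = std * std
    return (-((action - mu) ** 2) / (2.0 * var)
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))


def sample_action(mu, std, rng=random):
    """Sample a continuous action and return it with its log-probability.

    This replaces sampling from a categorical distribution in the
    discrete-action version.
    """
    action = rng.gauss(mu, std)
    return action, gaussian_log_prob(action, mu, std)


def clipped_surrogate(logp_new, logp_old, advantage, eps_clip=0.1):
    """PPO clipped objective for one sample (a quantity to maximize).

    The ratio pi_new(a|s) / pi_old(a|s) is formed from log-probabilities,
    which is the usual way to do it for continuous densities.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped = max(1.0 - eps_clip, min(1.0 + eps_clip, ratio))
    return min(ratio * advantage, clipped * advantage)


def to_env_action(action, max_torque=2.0):
    """Clamp the sampled action to the torque range of Pendulum-v0."""
    return max(-max_torque, min(max_torque, action))
```

The rest of the algorithm (advantage estimation, value loss, update loop) would stay the same as in the discrete case under these assumptions. In practice the same functions would be written with tensors so gradients can flow through `mu` and `std`.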
Added! Thanks :)