JacobHA / Actor-Critic-PyTorch

Policy Gradient Actor-Critic PyTorch | Lunar Lander v2

Actor-Critic

Solution for the Lunar Lander v2 environment of OpenAI Gym. The algorithm used is actor-critic (vanilla policy gradient with a baseline).

More info: http://rail.eecs.berkeley.edu/deeprlcourse-fa17/f17docs/lecture_5_actor_critic_pdf.pdf
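The idea behind "policy gradient with a baseline" can be sketched in plain Python for a single episode. This is a minimal illustration and not the code in train.py: the function names, the default `gamma=0.99`, and the equal weighting of the two loss terms are all assumptions made for the example. In a PyTorch version the same quantities would be tensors, and the advantage would normally be detached before it multiplies the log-probabilities.

```python
def discounted_returns(rewards, gamma=0.99):
    """Reward-to-go G_t = r_t + gamma * G_{t+1}, computed backwards over one episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage A_t = G_t - V(s_t); the critic's value estimate is the baseline."""
    return [g - v for g, v in zip(returns, values)]


def actor_critic_loss(log_probs, returns, values):
    """Policy-gradient term plus a squared-error value term for one episode.

    log_probs: log pi(a_t | s_t) for the actions actually taken (hypothetical inputs)
    values:    critic estimates V(s_t) for the visited states
    """
    adv = advantages(returns, values)
    # Actor: raise the log-probability of actions with positive advantage.
    policy_loss = -sum(lp * a for lp, a in zip(log_probs, adv))
    # Critic: regress V(s_t) towards the observed reward-to-go.
    value_loss = sum((g - v) ** 2 for g, v in zip(returns, values))
    return policy_loss + value_loss
```

Subtracting the critic's estimate leaves the expected gradient unchanged but typically lowers its variance, which is the reason for using a baseline at all.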

-> Dependencies:

    OpenAI gym

    PyTorch 0.4.1

    PIL

-> Hyperparameters can be changed by editing them in their respective files

-> To train : run train.py

-> Converges within 1500 episodes

-> To test a pretrained model : run test.py

License: BSD 3-Clause "New" or "Revised" License


Languages

Language:Python 100.0%