Adriel-M / OpenAI-Gym-Solutions

Solutions to OpenAI-Gym environments.

OpenAI-Gym-Solutions

Solutions to OpenAI-Gym environments using various machine learning algorithms.

Strategies

Evolutionary Learning Strategy: Start with some initial weights and generate a set of weights for each member of the population by adding random noise to the current weights. Track each member's performance, then update the current weights with a weighted sum over the members, weighting each member by its performance. Learn more about Evolutionary Learning Strategies here.
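
The update described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the repository's code: the function names, the hyperparameter values, the reward standardization step, and the toy fitness function (which stands in for an episode return from a Gym environment) are all hypothetical.

```python
import numpy as np


def evolution_step(weights, fitness_fn, rng, population_size=50,
                   sigma=0.1, learning_rate=0.01):
    """One update of a simple evolution strategy (illustrative sketch)."""
    # One noise vector per population member.
    noise = rng.standard_normal((population_size, weights.size))
    # Score each member: current weights plus scaled noise.
    rewards = np.array([fitness_fn(weights + sigma * n) for n in noise])
    spread = rewards.std()
    if spread < 1e-12:
        # All members performed the same, so there is no signal to follow.
        return weights.copy()
    # Assumption: standardize rewards so the step size does not depend
    # on the reward scale of the environment.
    advantages = (rewards - rewards.mean()) / spread
    # Weighted sum of the noise vectors, weighted by performance.
    update = noise.T @ advantages / (population_size * sigma)
    return weights + learning_rate * update


def toy_fitness(w):
    """Hypothetical stand-in for an episode return: peaks at `target`."""
    target = np.array([0.5, -0.3, 0.8])
    return -np.sum((w - target) ** 2)


rng = np.random.default_rng(0)
weights = np.zeros(3)
initial_fitness = toy_fitness(weights)
for _ in range(300):
    weights = evolution_step(weights, toy_fitness, rng)
final_fitness = toy_fitness(weights)
```

To apply the same sketch to an environment, `fitness_fn` would run one or more episodes with the candidate weights and return the total reward.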

REINFORCE: Monte Carlo Policy Gradient: After every episode, perform gradient ascent on a weighted sum over the time steps t: the log-probability of the action taken at t multiplied by the total discounted reward observed from t onward (a Monte Carlo estimate of the expected return under the current policy). The reward is discounted to encourage finishing early. Learn more about REINFORCE: Monte Carlo Policy Gradient by reading Reinforcement Learning: An Introduction (chapter 13.3) by Sutton & Barto here. (The book is unfinished and the chapter order may change.)
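
As a rough illustration of the per-episode update, the sketch below assumes a linear softmax policy over a feature vector. The policy form, the function names, and the hyperparameters are assumptions made for the example and are not taken from the repository. The extra `gamma ** t` factor follows the episodic pseudocode in Sutton & Barto; many practical implementations leave it out.

```python
import numpy as np


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def softmax(logits):
    shifted = logits - logits.max()  # shift for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def reinforce_update(theta, states, actions, rewards,
                     gamma=0.99, learning_rate=0.01):
    """One gradient-ascent step from a single finished episode.

    theta has shape (n_features, n_actions); the policy is
    softmax(state @ theta), which is an assumption of this sketch.
    """
    returns = discounted_returns(rewards, gamma)
    grad = np.zeros_like(theta)
    for t, (state, action) in enumerate(zip(states, actions)):
        probs = softmax(state @ theta)
        # Gradient of log pi(action | state) with respect to the logits
        # is one_hot(action) - probs.
        dlog = -probs
        dlog[action] += 1.0
        grad += (gamma ** t) * returns[t] * np.outer(state, dlog)
    return theta + learning_rate * grad


# Tiny hand-made episode: two steps, two features, two actions.
theta = np.zeros((2, 2))
states = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
actions = [0, 1]
rewards = [1.0, 1.0]
new_theta = reinforce_update(theta, states, actions, rewards,
                             gamma=0.5, learning_rate=0.1)
```

Because both rewards in this episode are positive, the update raises the probability of each action that was taken in its state.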

Evaluations

LunarLander-v2

ELS 1 2 3

CartPole-v0

ELS 1

REINFORCE-MCMC 1 2

CartPole-v1

ELS 1

REINFORCE-MCMC 1 2