solazu / Hindsight-Experience-Replay---Bit-Flipping

Simple bit flipping with sparse rewards using HER, similarly to the original paper

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Hindsight-Experience-Replay---Bit-Flipping

Simple bit flipping with sparse rewards using HER, similarly to the original paper.

This project contains a simple implementation of the bit flipping experiment from the paper "Hindsight Experience Replay" [1] (HER) by OpenAI researchers. In this experiment an agent is given an initial binary state vector and a binary goal vector, and must get from the initial state to the goal state by flipping a bit at each step. The agent is given -1 reward for every step that the goal is not reached and 0 when it reaches it, making it a sparse reward problem. using HER, the agent can gradualy increase its reachable set and eventually arrive at the intented goals.

For the 15 bit experiment, after 5000 epsidoes the agent can get to the goal ~90% of the time: Alt text

During training, the minimum distance to the goal during each episode can be seen to gradually decrease: Alt text

I have written a Medium post explaining the intuition behind Hindsight Experience Replay, feel free to check it out.

  1. Andrychowicz, Marcin, et al. "Hindsight experience replay." Advances in Neural Information Processing Systems. 2017.

About

Simple bit flipping with sparse rewards using HER, similarly to the original paper


Languages

Language:Python 100.0%