JoshVarty / BananaCollector_DoubleQLearning

An implementation of DoubleQLearning to solve Unity's Banana Environment

Banana Collector

In Unity's Banana Collector Environment, an agent must collect as many yellow bananas (+1 each) as possible while avoiding blue bananas (-1 each).

The agent interacts with the environment via the following:

  • It observes the current state as a vector of 37 elements
  • It can take one of 4 actions: Left, Forward, Right or Back
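
To make the interface above concrete, here is a minimal sketch of one episode of interaction. `StubBananaEnv` is a hypothetical stand-in written for this example, not the Unity environment or its API; it only mimics the shapes described above (a 37-element state, 4 discrete actions, rewards of +1 or -1). The order of the action indices and the random rewards are assumptions.

```python
import random

STATE_SIZE = 37   # length of the observation vector, per the description above
ACTION_SIZE = 4   # Left, Forward, Right, Back (index order is an assumption)


class StubBananaEnv:
    """Hypothetical stand-in for the Unity environment, for illustration only."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def _observe(self):
        # Random numbers stand in for the real 37-element observation.
        return [self.rng.random() for _ in range(STATE_SIZE)]

    def reset(self):
        self.steps = 0
        return self._observe()

    def step(self, action):
        if not 0 <= action < ACTION_SIZE:
            raise ValueError("action must be in range(4)")
        self.steps += 1
        # +1 for a yellow banana, -1 for a blue banana, 0 otherwise.
        reward = self.rng.choice([1, -1, 0, 0])
        done = self.steps >= self.episode_length
        return self._observe(), reward, done


def run_random_episode(env, seed=0):
    """Play one episode with a uniformly random policy and return the score."""
    rng = random.Random(seed)
    state = env.reset()
    score, done = 0, False
    while not done:
        action = rng.randrange(ACTION_SIZE)
        state, reward, done = env.step(action)
        score += reward
    return score
```

A trained agent would replace the random choice with the action its network rates highest for the current `state`.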

Banana Collector Environment

Sample image taken from: https://github.com/udacity/deep-reinforcement-learning/tree/master/p1_navigation

This repository trains an agent to attain an average score (over 100 episodes) of at least 13. It trains the agent using the Double DQN Reinforcement Learning algorithm.
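
As a rough illustration of what distinguishes Double DQN from plain DQN, the sketch below computes the learning target for a single transition: the online network picks the next action, and the target network supplies that action's value. This is a simplified, framework-free version and not the code in DQN.ipynb; the function name and the discount factor `gamma=0.99` are assumptions (see Report.md for the settings actually used).

```python
def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Return the Double DQN target for one transition.

    next_q_online and next_q_target hold the action values of the next
    state as estimated by the online and target networks respectively.
    """
    if done:
        # No bootstrapping from a terminal state.
        return float(reward)
    # Select the greedy action with the online network...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...but evaluate that action with the target network.
    return reward + gamma * next_q_target[best_action]


# Four values, one per action. The online network prefers action 1.
online = [0.2, 1.5, 0.3, 0.1]
target = [0.9, 0.4, 2.0, 0.0]
y = double_dqn_target(1.0, False, online, target)  # 1.0 + 0.99 * 0.4
```

Plain DQN would instead bootstrap from `max(target)`, which is 2.0 here. Decoupling selection from evaluation is what reduces the overestimation of action values.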

Prerequisites

  • Anaconda

  • Python 3.6

  • A conda environment created as follows

    • Linux or Mac:
    conda create --name drlnd python=3.6
    source activate drlnd 
    
    • Windows:
    conda create --name drlnd python=3.6
    activate drlnd
    
  • Required dependencies

git clone https://github.com/udacity/deep-reinforcement-learning.git
cd deep-reinforcement-learning/python
pip install .
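
After installing, a quick check like the one below can confirm that the environment is usable. It is only a sketch: the module names `unityagents`, `torch` and `numpy` are my assumption about what the Udacity package installs, and `check_setup` is a hypothetical helper, not part of this repository.

```python
import importlib.util
import sys


def check_setup(modules=("unityagents", "torch", "numpy")):
    """Report whether Python is 3.6 and whether each module can be found."""
    report = {"python_3_6": sys.version_info[:2] == (3, 6)}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_setup().items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```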

Getting Started

  1. git clone https://github.com/JoshVarty/BananaCollector_DoubleQLearning.git

  2. cd BananaCollector_DoubleQLearning

  3. Download the Unity Banana Collector Environment:

  4. Unzip it into the repository directory

  5. jupyter notebook

  6. You can train your own agent via DQN.ipynb or watch a single episode of the pre-trained network via Visualization.ipynb

Results

In my experience, the agent reaches an average score of 13 after roughly 400 episodes of training.
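
The "average score over 100 episodes" criterion can be checked with a rolling window. The following is a minimal sketch, assuming per-episode scores are collected in a list; the helper name `first_solved_episode` is hypothetical, and the notebooks may track this differently.

```python
from collections import deque


def first_solved_episode(scores, target=13.0, window=100):
    """Return the 1-based episode at which the mean of the last `window`
    scores first reaches `target`, or None if that never happens."""
    recent = deque(maxlen=window)  # drops the oldest score automatically
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None
```

Note that the window must be full before the check can pass, so the earliest possible result is episode 100.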

A sample run generated from Visualization.ipynb

Notes

  • Only tested on Ubuntu 18.04
  • Details of the learning algorithm and chosen architecture may be found in Report.md

License: MIT License