benhoff / ftw

Implementation of the For The Win (FTW) agent as introduced by DeepMind.

Overview | Installation | Documentation | Examples

This repository aims to implement the neural network modules and the population-based training framework introduced in the 2019 paper "Human-level performance in 3D multiplayer games with population-based reinforcement learning" by Jaderberg et al. (available at https://science.sciencemag.org/content/364/6443/859).

The implementation is based on TensorFlow 2 and makes use of the dm-sonnet, dm-acme and dm-reverb libraries offered by DeepMind.

Overview

This repository offers the following:

  • The following neural network modules based on the FTW paper:

    • Visual embedding (Convolutional neural network)
    • DNC memory (taken from dnc and modified to be compatible with TensorFlow 2)
    • Variational Unit
    • Recurrent processing with temporal hierarchy
    • Auxiliary task modules: Pixel control & Reward prediction
  • FtwNetwork, RNNPixelControlNetwork & RewardPredictionNetwork classes that can be used to combine the above modules into an end-to-end network suitable for the FTW agent (a simplified sketch follows this list)

  • Replay Buffers for Pixel control & Reward prediction auxiliary tasks (adders and datasets for use with dm-acme and dm-reverb)

  • Agent, Actor & Learner classes for the FTW agent

  • Support for multi-agent environments (under certain constraints, see Documentation)

  • Hyperparameter and Internal Rewards classes for population-based training.

  • Arena & Chief classes for population-based training in multi-agent environments (see FTW paper for more details)

  • An FTWJobPool class that can be used to spawn multiple threads of FTW learners and Arena instances.
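To give a feel for how these pieces fit together, the snippet below sketches a minimal visual-embedding plus recurrent torso in the spirit of the FTW architecture, built only from Sonnet 2 primitives. It is illustrative, not the repository's actual FtwNetwork: the real network additionally includes DNC memory, variational units and a fast/slow temporal hierarchy, and its constructor signature may differ.

    # Illustrative sketch only: not the repository's FtwNetwork.
    import sonnet as snt
    import tensorflow as tf

    class TinyFtwTorso(snt.Module):
        """Visual embedding (CNN) feeding a single recurrent core."""

        def __init__(self, hidden_size: int = 256):
            super().__init__(name="tiny_ftw_torso")
            # Visual embedding: a small convolutional network.
            self._embed = snt.Sequential([
                snt.Conv2D(16, 8, stride=4), tf.nn.relu,
                snt.Conv2D(32, 4, stride=2), tf.nn.relu,
                snt.Flatten(),
                snt.Linear(hidden_size), tf.nn.relu,
            ])
            # Recurrent core (the real network uses a fast/slow LSTM hierarchy).
            self._core = snt.LSTM(hidden_size)

        def initial_state(self, batch_size: int):
            return self._core.initial_state(batch_size)

        def __call__(self, observation, state):
            return self._core(self._embed(observation), state)

    # One forward step on a dummy 84x84 RGB frame.
    net = TinyFtwTorso()
    state = net.initial_state(batch_size=1)
    output, state = net(tf.zeros([1, 84, 84, 3]), state)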

However, this repository is still a work in progress. As such, the following features are not yet supported/implemented:

  • Population-based training is not yet fully implemented: the Chief class, responsible for the evolution of the agent population, is still a work in progress. It can already be used while training a population of agents, but it does not yet execute any evolutionary operations, such as mutation of hyperparameters.
  • Currently, the policy module does not support decomposed action spaces such as the one featured in the FTW paper (see the sketch below for what such a decomposition looks like).
  • Similarly, the Pixel control module does not support decomposed action spaces at the moment.

These features will be added in the near future.
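
For context, a decomposed (factored) action space splits one large categorical action into several independent sub-actions, and the policy then outputs one distribution per component rather than a single distribution over the full product space. The sketch below contrasts the two using dm_env specs; the component names and sizes are made up for illustration and are not the paper's exact decomposition.

    # Illustrative only: flat vs. decomposed discrete action specs.
    from dm_env import specs

    # Flat: one categorical over every action combination
    # (what the policy and Pixel control modules support today).
    flat = specs.DiscreteArray(num_values=3 * 3 * 2, name="action")

    # Decomposed: one small categorical per sub-action, as in the FTW paper
    # (not yet supported by this repository).
    decomposed = {
        "turn": specs.DiscreteArray(num_values=3, name="turn"),  # left / none / right
        "move": specs.DiscreteArray(num_values=3, name="move"),  # back / none / forward
        "fire": specs.DiscreteArray(num_values=2, name="fire"),  # no-op / fire
    }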

Installation

Currently, only Linux-based OSes are supported (due to dm-reverb).

  1. Clone/download the repository.

  2. Go to the repository folder

     cd ftw/
    
  3. It is highly recommended to use a Python virtual environment to manage your dependencies in order to avoid version conflicts:

     python3 -m venv ftw
     source ftw/bin/activate
     pip install --upgrade pip setuptools
    
  4. Install with (mind the dot at the end!)

     pip install -e .
    
  5. If you want to use the multi-agent examples, you'll also need to install ma-gym. Warning: do not install ma-gym from GitHub, since its version requirements will pull in older library versions that are incompatible with this repository's requirements. Instead, if you have followed the steps so far, install ma-gym from the third_party module in this repository and the examples will work just fine. From the ftw root directory, type

     cd ftw/third_party/ma_gym
     pip install -e .
    

Again, mind the dot at the end!
That's it. You're all set!
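
As a quick sanity check (assuming the package is exposed under the module name ftw), you can try importing it:

    python -c "import ftw"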

Documentation

For more specific information, please visit the Documentation.

Examples

Further examples can be found here.
