Scudstorm is an agent parameterized by a deep neural network, trained via a genetic algorithm in TensorFlow eager mode, to play Entelect Software's 2018 Tower Defense game.
As seen at the Deep Learning Indaba 2018 poster session. A PDF version of the poster is available for reference.
The aims of this project were:
- To see if I could make a genetic algorithm train large agents with autoregressive policies.
- To see how efficient I could make an environment training pipeline between a Java game and Python.
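As a rough illustration of the first aim, the sketch below shows a mutation-only genetic algorithm with truncation selection and a single elite, in the style popularized by deep neuroevolution work. It is not Scudstorm's training loop: the flat parameter list, the toy `toy_fitness` function, and every name and hyperparameter here (`evolve`, `pop_size`, `n_parents`, `sigma`) are assumptions made for the example. In the real project the fitness signal would presumably come from game results rather than a closed-form function.

```python
import random

def evolve(fitness, n_params, pop_size=20, n_parents=5, sigma=0.1,
           generations=30, seed=0):
    """Mutation-only GA with truncation selection and one elite."""
    rng = random.Random(seed)
    # Random initial population of flat parameter vectors.
    population = [[rng.gauss(0.0, 1.0) for _ in range(n_params)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank individuals from best to worst fitness.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:n_parents]
        # Elitism: the best individual survives unchanged.
        next_population = [ranked[0]]
        while len(next_population) < pop_size:
            parent = rng.choice(parents)
            # Offspring = parent plus Gaussian noise on every parameter.
            child = [w + sigma * rng.gauss(0.0, 1.0) for w in parent]
            next_population.append(child)
        population = next_population
    return max(population, key=fitness)

def toy_fitness(params):
    # Hypothetical stand-in for "play games and count wins":
    # higher is better, with the optimum at all parameters equal to 1.
    return -sum((w - 1.0) ** 2 for w in params)

best = evolve(toy_fitness, n_params=3)
```

Because the elite is copied unchanged into each new generation, the best fitness in the population can never decrease from one generation to the next.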
- /logs/ contains the TensorBoard log files for each run.
- /runs/ contains the directory where the Python environment manager runs the games in parallel. Do not put things in this folder; it is automatically populated and managed by the Python env scripts.
- /refbot/ contains some initialization for the reference bot which plays against each agent.
- /saves/ is where the keras Model saves are saved to.
- /common/ contains various utilities, initialization and other scripts.
- /deploy/ contains code for running on Entelect's server, optimized for speed of inference.
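The role of /runs/ described above (a scratch area where games run in parallel and which the scripts manage themselves) can be pictured with the sketch below. It is only an illustration under assumptions: `play_one_game` is a hypothetical stub that fakes a result where the real pipeline would launch the Java game engine, and the per-game subdirectory naming and the thread pool are choices made for this example, not the layout used by the env scripts.

```python
import random
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def play_one_game(run_dir, seed):
    """Hypothetical stub: fake one game inside its own run directory."""
    run_dir.mkdir(parents=True, exist_ok=True)
    # A real worker would start the Java engine here and wait for it.
    winner = random.Random(seed).choice(["agent", "refbot"])
    (run_dir / "result.txt").write_text(winner)
    return winner

def run_games_in_parallel(runs_root, n_games, max_workers=4):
    """Run n_games concurrently, one isolated subdirectory per game."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(play_one_game, Path(runs_root) / f"game_{i}", i)
            for i in range(n_games)
        ]
        # Results come back in submission order, not completion order.
        return [f.result() for f in futures]

# A temporary directory stands in for /runs/ and is removed afterwards.
with tempfile.TemporaryDirectory() as runs_root:
    results = run_games_in_parallel(runs_root, n_games=8)
    written = sorted(p.parent.name for p in Path(runs_root).glob("*/result.txt"))
```

Giving each game its own directory keeps concurrently running games from overwriting each other's state files, which is one reason a managed folder like this should not hold anything else.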
To run Scudstorm, run the following command with one of the indicated modes:
python manager.py --mode [train, resume, test, rank]
- train starts the training process from scratch, randomly initializing all the agents.
- resume resumes training from the agent saves in the /saves/checkpoints/ folder.
- test runs various tests to check that your environment works correctly and is ready for training/playing. This is essentially "test your local setup".
- rank plays several games in a round-robin style to rank all the agent saves in the /saves/ folder. Once it is done, it outputs each agent's win percentage and Elo score.
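To make the output of rank mode concrete, here is a minimal sketch of a round-robin tournament that reports win percentage and Elo rating. It uses the standard Elo update (expected score 1 / (1 + 10^((R_b - R_a) / 400)), then R_a += K * (score - expected)). The K-factor of 32, the starting rating of 1000, the `play` callback, the agent names, and the function names are assumptions for illustration and may differ from what manager.py does.

```python
import itertools

def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def rank_round_robin(agents, play, k=32.0, start=1000.0):
    """Play every pair once; return {agent: (win_pct, elo)}."""
    elo = {a: start for a in agents}
    wins = {a: 0 for a in agents}
    games = {a: 0 for a in agents}
    for a, b in itertools.combinations(agents, 2):
        winner = play(a, b)  # must return either a or b
        score_a = 1.0 if winner == a else 0.0
        exp_a = expected_score(elo[a], elo[b])
        # Zero-sum update: what A gains, B loses.
        elo[a] += k * (score_a - exp_a)
        elo[b] += k * ((1.0 - score_a) - (1.0 - exp_a))
        wins[winner] += 1
        games[a] += 1
        games[b] += 1
    return {a: (100.0 * wins[a] / games[a], elo[a]) for a in agents}

# Hypothetical stand-in for a real game: the higher "strength" always wins.
strength = {"gen_10": 1, "gen_50": 2, "gen_90": 3}
table = rank_round_robin(list(strength),
                         lambda a, b: max(a, b, key=strength.get))
```

Win percentage alone treats every opponent as equal, whereas Elo weights a win by how strong the beaten opponent was rated at the time, which is why reporting both is useful.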
The keras Model does not like it if you try to do anything non-standard within the model (e.g. a one-hot encoding, sampling, or even just a matmul tf op) between keras Layers. To get around this, one must create keras Layers which wrap base tf/keras ops and then use them as you would any other layer. That is why we have custom_layers.py, which contains wrappers for a one-hot encoder and for categorical sampling from logits:
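import tensorflow as tf

class SampleCategoricalLayer(tf.keras.layers.Layer):
    def call(self, x):
        # Draw one class index per row of logits (tf.distributions is TF 1.x API).
        dist = tf.distributions.Categorical(logits=x)
        return dist.sample()

class OneHotLayer(tf.keras.layers.Layer):
    def __init__(self, num_classes, **kwargs):
        # Constructor assumed here so that self.num_classes is defined.
        super().__init__(**kwargs)
        self.num_classes = num_classes

    def call(self, x):
        return tf.keras.backend.one_hot(x, num_classes=self.num_classes)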
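For readers without TensorFlow at hand, the sketch below mirrors in plain Python what the two wrapped ops compute and how they can chain into an autoregressive policy: the first action is sampled from logits, one-hot encoded, and fed back in before the second action is sampled. The two-step action layout (a building type, then a column) and the toy `column_logits` function are hypothetical; they are not taken from Scudstorm's network.

```python
import math
import random

def sample_categorical(logits, rng):
    """Sample an index with probability softmax(logits)[index]."""
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def one_hot(index, num_classes):
    """Return a list of zeros with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def column_logits(building_one_hot):
    # Hypothetical second head: conditions on the first action by
    # favouring the column whose number matches the building index.
    return [5.0 * v for v in building_one_hot] + [0.0]

rng = random.Random(0)
# First action: sampled directly from its logits.
building = sample_categorical([0.1, 2.0, -1.0], rng)
# Second action: its logits depend on the one-hot of the first action.
column = sample_categorical(column_logits(one_hot(building, 3)), rng)
```

Sampling and one-hot encoding sit in the middle of the forward pass here, which is exactly the situation where wrapping them as layers becomes necessary in a keras Model.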
The code should work on Windows 10 and on newer versions of Linux. Please install the packages listed in requirements.txt (e.g. with pip install -r requirements.txt) to get all necessary dependencies. Any recent version of Java is also required to simulate the Entelect Tower Defense game.
- Uber AI's paper on deep neuroevolution was an enormous inspiration and the core genetic algorithm I use is very similar to the one they use in their paper.
- OpenAI baselines served as quite a large inspiration for parts of the environment pipeline, namely the subproc_env_manager.py code.
- Parts of Inception-v4-resnet's network architecture were used for parts of Scudstorm's network.
- Entelect Software's 2018 Tower Defense game is used. The .jar in this folder is their game/code. StarterBotPrime.py is also a modified version of their Python starter bot.
Feel free to use and modify this code however you see fit, but please acknowledge me as the original source. I am Matthew Baas, a student at Stellenbosch University.