nucl.ai 2016 Hands-On Deep Q-Learning in Doom Workshop
Prerequisites
- Python 2.7
- NumPy
- Skimage (scikit-image)
- Theano
- Lasagne (the bleeding-edge version)
- ViZDoom
Task 1: Learn the capabilities of the environment
- Analyze and execute examples/python/basic.py.
- Play with the rendering options (hud, crosshair, resolution, etc.)
- Learn how to use the modes: `SPECTATOR`, `PLAYER`, `ASYNC_SPECTATOR`, `ASYNC_PLAYER`
- Play with the `tics` parameter of `doom.make_action()`
- See the sample scenarios in `ASYNC_SPECTATOR` mode and observe the rewards on the console (examples/python/scenarios.py)
- Speed up the time in `ASYNC_SPECTATOR` mode (examples/python/ticrate.py)
- Play with the other capabilities of ViZDoom:
  * Benchmark your ViZDoom offscreen (examples/python/fps.py)
  * See the depth buffer feature (examples/python/format.py)
  * Execute multiple instances of Doom at once (examples/python/multiple_instances.py)
  * Record the game in low-resolution settings and replay it in high resolution (examples/python/record_episodes.py)
- Play examples/python/cig_bots.py to get a feel for the CIG 2016 ViZDoom Competition
Task 2: Teach your agent to play a basic scenario
- Clone this repo
- Copy (or make symlinks), so that the system sees `vizdoom.so` and the `vizdoom` executable, e.g.:

      ln -s ../ViZDoom/bin/vizdoom
      ln -s ../ViZDoom/bin/python/vizdoom.so
- Execute `simple_dqn_task.py`, which runs on the `simpler_basic` scenario by default.
  * Note: see run_cpu.sh
- Analyze the code. Most of the logic is already there. The only thing missing is the body of the `perform_learning_step()` function. It is your job to fill it in. Follow the instructions in the comments.
- When your code produces competent agents (mean score of 70-80 points), you might want to see your agent in action in better resolution and color, for which you have two options:
  * Increase Doom's resolution and set the screen format to `RGB24` in `simpler_basic.cfg`. (You also need to uncomment the `rgb2gray()` call in `preprocess()`.) Note that, with higher resolution and color space conversion, the training will take 2-4 times longer to complete.
  * Record your agent's actions and replay them with rendering settings of your choice (consult the record_episodes.py example). This option does not require retraining the agent.
- You might benchmark yourself on this scenario. Can you beat your bot?
- Next, you can try your code on the `rocket_basic` scenario, which is slightly harder.
  * Hint: let the agent learn for 40-50 epochs.
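For reference while filling in `perform_learning_step()`: the heart of such a step is building the Q-learning targets for a minibatch sampled from replay memory. The sketch below is framework-agnostic NumPy, not the workshop's Theano/Lasagne code; the function name and signature are illustrative:

```python
import numpy as np

def q_targets(q_s1, q_s2, actions, rewards, terminal, gamma=0.99):
    """Build Q-learning targets for a sampled minibatch.

    q_s1     : (batch, n_actions) Q-values of the starting states
    q_s2     : (batch, n_actions) Q-values of the successor states
    actions  : (batch,) indices of the actions actually taken
    rewards  : (batch,) immediate rewards
    terminal : (batch,) 1.0 where the episode ended there, else 0.0
    """
    targets = q_s1.copy()
    # r + gamma * max_a' Q(s2, a'), with no bootstrap on terminal states
    bootstrap = (1.0 - terminal) * gamma * q_s2.max(axis=1)
    targets[np.arange(len(actions)), actions] = rewards + bootstrap
    return targets
```

The network is then trained to regress its Q-value predictions toward these targets; only the entries for the actions actually taken are changed, so the loss gradient is zero for the other actions.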
Notes
- The scenarios and the code for learning have been prepared so that training your agent should take approximately 1-10 minutes on a CPU, depending on the scenario and rendering settings.
- You might want to play with the `learning_rate` and `frames_repeat` parameters.
- The solution to Task 2 is provided in `simple_dqn_solution.py`. Just play with it if you are not in the mood for coding.
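Besides `learning_rate` and `frames_repeat`, the exploration schedule is another knob worth tuning. A typical choice is an epsilon for epsilon-greedy action selection that decays linearly over training; the function below is an illustrative sketch with made-up default values, not the workshop's exact schedule:

```python
def exploration_rate(epoch, epochs=20, start_eps=1.0, end_eps=0.1):
    """Epsilon for epsilon-greedy action selection, decayed linearly
    from start_eps down to end_eps over the course of training."""
    if epoch >= epochs:
        return end_eps
    return start_eps + (end_eps - start_eps) * epoch / float(epochs)
```

Early epochs then act almost entirely at random (gathering diverse experience), while later epochs mostly exploit the learned Q-function.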
Documentation
Currently, ViZDoom documentation consists of a tutorial and examples.
Authors
- Wojciech Jaśkowski
- Michał Kempka