This repository contains the code for the InstaDeep paper "Fast Population-Based Reinforcement Learning on a Single Machine" (Flajolet et al., 2022) 💻⚡.
This code requires Docker to run. To install Docker, follow the official online installation instructions. To run the code on a GPU, also install nvidia-docker and the latest NVIDIA driver available for your GPU.
Once Docker and nvidia-docker are installed, build the Docker image with the following command:
make build
and, once the image is built, start the container with:
make dev_container
Inside the container, you can run the nvidia-smi command to verify that your GPU is detected.
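If you want a check that can be scripted, the following is a minimal sketch rather than something shipped with the repository. The helper name check_gpu is hypothetical; it only wraps the nvidia-smi query options and assumes that a missing nvidia-smi binary means the container was started without GPU access.

```shell
# Hypothetical helper (not part of this repository): report the GPUs
# visible from the current environment.
check_gpu() {
  # Assumption: nvidia-smi is only on PATH when the NVIDIA runtime is set up.
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: no GPU access from this environment" >&2
    return 1
  fi
  # Print one CSV line per GPU; exits non-zero if the driver is unreachable.
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}
```

Calling check_gpu inside the container prints one line per detected GPU and returns a non-zero status otherwise, so it can gate a longer script.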
We provide scripts and commands to replicate the experiments discussed in the paper. All these commands are defined in the Makefile at the root of the repository.
To replicate the experiments corresponding to Figure 2 (where we measure the runtime of a population-wide update step with various implementations), run:
make run_timing_sactd3
make run_timing_dqn
To replicate the experiments discussed in Section 5 (which correspond to full training runs), run the following:
make run_td3_cemrl
make run_td3_dvd
make run_td3_pbt
make run_sac_pbt
Note that DvD training runs are unstable and sometimes crash early on due to NaNs.
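One way to cope with these early crashes is to relaunch the run automatically. The sketch below is an illustration, not part of the repository's Makefile: the retry function is a hypothetical helper, and it assumes that a crashed run exits with a non-zero status and that restarting from scratch is acceptable.

```shell
# Hypothetical helper (not part of this repository).
# Usage: retry MAX_ATTEMPTS COMMAND [ARGS...]
retry() {
  max_attempts="$1"
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Stop at the first attempt that exits with status 0.
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt of $max_attempts failed: $*" >&2
    attempt=$((attempt + 1))
  done
  # Every attempt failed.
  return 1
}
```

For example, `retry 3 make run_td3_dvd` would launch the run up to three times and stop as soon as one attempt finishes successfully.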
We use TensorBoard to log metrics during training runs. The tensorboard command to run to visualize them is printed when the experiment starts.
Run the following command to start a short test that validates that the training scripts work as expected:
make test_training_scripts
If you use the code or data in this package, please cite:
@inproceedings{flajolet2022fast,
title={Fast Population-Based Reinforcement Learning on a Single Machine},
author={Flajolet, Arthur and Monroc, Claire Bizon and Beguir, Karim and Pierrot, Thomas},
booktitle={International Conference on Machine Learning},
pages={6533--6547},
year={2022},
organization={PMLR}
}