Fast Efficient Hyperparameter Tuning for Policy Gradients (https://arxiv.org/abs/1902.06583)
Implementation of HOOF for A2C and TNPG. The code is based on the OpenAI Baselines implementation: https://github.com/openai/baselines
To run the code:
- Add your MuJoCo license key to the folder
- Build the Docker image with build.sh, then run it with run.sh
- The parameters for each environment are in the corresponding YAML file
- Run the code with run_A2C.sh or run_TRPO_TNPG.sh, passing the YAML filename as the argument
- For other environments, create a new YAML file and add it to the folder

And that's it!
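
The steps above can be sketched as a small helper script. This is only an illustration: the script names (build.sh, run.sh, run_A2C.sh, run_TRPO_TNPG.sh) come from the list above, while `my_env.yaml`, the `run` and `hoof_workflow` helpers, and the `DRY_RUN` switch are hypothetical names introduced here. It also assumes that all scripts are invoked with bash from the repository folder, which may not match your setup (for example, the training script may need to be started inside the container). By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: script names are taken from the steps above; the rest is assumed.

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# $1: YAML config file name (my_env.yaml is a hypothetical placeholder).
# $2: launcher script, run_A2C.sh (default) or run_TRPO_TNPG.sh.
hoof_workflow() {
  local config="${1:-my_env.yaml}"
  local launcher="${2:-run_A2C.sh}"
  run bash build.sh &&
  run bash run.sh &&
  run bash "$launcher" "$config"
}

# Dry run: prints the three commands in order without executing them.
hoof_workflow my_env.yaml run_A2C.sh
```

To actually execute the steps, set `DRY_RUN=0` and replace `my_env.yaml` with one of the YAML files in the folder.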