
ICLR 2023 - Neuroevolution is a competitive alternative to reinforcement learning for skill discovery

PyPI · Python version · JAX 0.3.10 · Code style: black · pre-commit


This repository contains the code for the paper "Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery" (Chalumeau, Boige et al., 2023) 💻 ⚡.

First-time setup

Install Docker

This code requires Docker to run. To install Docker, please follow the official installation instructions on the Docker website. To enable the code to run on GPU, please install NVIDIA Docker (as well as the latest NVIDIA driver available for your GPU).
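Before building the image, you can check that Docker can access your GPU by running nvidia-smi inside a throwaway CUDA container (the CUDA image tag below is only illustrative; any CUDA base image available on Docker Hub will do):

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi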

Build and run a docker image

Once Docker and NVIDIA Docker are installed, you can build the Docker image with the following command:

make build

and, once the image is built, start the container with:

make dev_container

Inside the container, you can run the nvidia-smi command to verify that your GPU is found.

Run experiments from the paper

Training

Syntax

To train an algorithm on an environment, use the following command:

make train script_name=[SCRIPT_NAME] env_name=[ENV_NAME]

This will load the default config in qdbenchmark/training/config and launch the training. The available scripts and configs can be found in qdbenchmark/training/.
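For example, to see which training scripts and default configs ship with the repository (paths follow the layout mentioned above):

ls qdbenchmark/training/
ls qdbenchmark/training/config/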

Example

For instance, to train MAP-Elites on the Halfcheetah Uni environment, run:

make train script_name=train_map_elites.py env_name=halfcheetah_uni

Adaptation

Syntax

To perform adaptation tasks, the user must provide the path to the policy/repertoire produced by training an agent, as well as the path to the config used to train that agent. Three commands are available, one for each of the three adaptation-task families (Gravity multiplier, Actuator update, or Default position change).

For the QD algorithms:

make adaptation_gravity_qd repertoire_path=[PATH_TO_REPERTOIRE] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
make adaptation_actuator_qd repertoire_path=[PATH_TO_REPERTOIRE] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
make adaptation_position_qd repertoire_path=[PATH_TO_REPERTOIRE] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]

and for the Skill Discovery algorithms:

make adaptation_gravity_sd policy_path=[PATH_TO_POLICY] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
make adaptation_actuator_sd policy_path=[PATH_TO_POLICY] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]
make adaptation_position_sd policy_path=[PATH_TO_POLICY] run_config_path=[PATH_TO_CONFIG] env_name=[ENV_NAME] algorithm_name=[ALGORITHM_NAME]

Example

Sample configs and checkpoints are provided as examples. For instance, the user can run the following command:

make adaptation_gravity_sd policy_path=sample/sample_policy/dads-reward-ant-uni-policy-0.npy run_config_path=sample/sample_config/dads_reward_ant_uni.yaml env_name=ant_uni algorithm_name=DADS+REWARD
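Before launching the run, you can quickly check that the sample policy and config referenced in the command above are present (paths taken verbatim from the example):

ls -lh sample/sample_policy/dads-reward-ant-uni-policy-0.npy sample/sample_config/dads_reward_ant_uni.yaml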

Hierarchical

Syntax

To perform the Halfcheetah-Hurdle hierarchical task, the user should also provide a policy/repertoire path. The syntax is the following:

For QD algorithms:

make hierarchical_qd repertoire_path=[PATH_TO_REPERTOIRE] algorithm_name=[ALGORITHM_NAME]

For Skill Discovery algorithms:

make hierarchical_sd policy_path=[PATH_TO_POLICY] algorithm_name=[ALGORITHM_NAME]

Example

Sample checkpoints are provided as examples. For instance, the user can run the following command:

make hierarchical_qd repertoire_path=sample/sample_repertoire/map_elites_halfcheetah_uni/ algorithm_name=MAP-ELITES

Contributors

Citing this work

If you use the code or data in this package, please cite:

@inproceedings{chalumeau2023neuroevolution,
  title={Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery},
  author={Felix Chalumeau and Raphael Boige and Bryan Lim and Valentin Mac{\'e} and Maxime Allard and Arthur Flajolet and Antoine Cully and Thomas PIERROT},
  booktitle={International Conference on Learning Representations},
  year={2023},
  url={https://openreview.net/forum?id=6BHlZgyPOZY}
}
