ffelten / mo-gym

Multi-objective gym environments for reinforcement learning.

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

MO-Gym: Multi-Objective Reinforcement Learning Environments

Gym environments for multi-objective reinforcement learning (MORL). The environments follow the standard Gym API, but return vector-valued rewards as numpy arrays, with one entry per objective.

For details on multi-objective MDPs (MOMDPs) and other MORL definitions, see "A practical guide to multi-objective reinforcement learning and planning".
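To make the vector-reward convention concrete, here is a minimal sketch of an episode loop that accumulates one return per objective. `ToyTwoObjectiveEnv` is a hypothetical stand-in written for this illustration and is not part of MO-Gym; it uses plain Python lists instead of numpy arrays so that it runs without any dependencies.

```python
class ToyTwoObjectiveEnv:
    """Hypothetical stand-in environment (NOT part of MO-Gym).

    It mimics the shape of the step/reset API: step() returns a reward
    with one entry per objective instead of a single scalar.
    """

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}  # observation, info

    def step(self, action):
        self.t += 1
        # Objective 0: progress (earned only by action 1).
        # Objective 1: time penalty (paid on every step).
        reward = [1.0 if action == 1 else 0.0, -1.0]
        terminated = self.t >= self.horizon
        return self.t, reward, terminated, False, {}


def rollout(env, policy, gamma=1.0):
    """Run one episode and return the discounted return of each objective."""
    obs, _ = env.reset()
    vector_return = None
    discount = 1.0
    done = False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        if vector_return is None:
            vector_return = [0.0] * len(reward)
        # Accumulate each objective separately; no scalarization happens here.
        vector_return = [g + discount * r for g, r in zip(vector_return, reward)]
        discount *= gamma
        done = terminated or truncated
    return vector_return


print(rollout(ToyTwoObjectiveEnv(horizon=5), lambda obs: 1))  # [5.0, -5.0]
```

The same loop structure should carry over to a real MO-Gym environment, where the per-objective sums would be numpy array additions.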

Install

Via pip:

pip install mo-gym

Alternatively, you can install the newest unreleased version:

git clone https://github.com/LucasAlegre/mo-gym.git
cd mo-gym
pip install -e .

Usage

import gym
import numpy as np
import mo_gym

env = mo_gym.make('minecart-v0') # It follows the original gym's API ...

obs, info = env.reset()  # assumes gym >= 0.26, where reset() returns (obs, info)
next_obs, vector_reward, terminated, truncated, info = env.step(your_agent.act(obs))  # but vector_reward is a numpy array!

# Optionally, you can scalarize the reward function with the LinearReward wrapper
env = mo_gym.LinearReward(env, weight=np.array([0.8, 0.2, 0.2]))

You can also find more examples in this Colab notebook!
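As a rough illustration of what linear scalarization does, the sketch below collapses a vector reward into a scalar by taking a weighted sum. It assumes that this is the computation behind the LinearReward wrapper; `linear_scalarize` is a hypothetical helper written for this example, not a function from MO-Gym, and the reward values are made up.

```python
def linear_scalarize(vector_reward, weight):
    """Return the weighted sum of the objectives (a dot product)."""
    if len(vector_reward) != len(weight):
        raise ValueError("vector_reward and weight must have the same length")
    return sum(w * r for w, r in zip(weight, vector_reward))


# Same weights as in the usage snippet above.
weight = [0.8, 0.2, 0.2]
# Made-up reward for the three objectives [ore1, ore2, fuel].
vector_reward = [1.0, 0.0, -0.5]

scalar_reward = linear_scalarize(vector_reward, weight)
print(scalar_reward)
```

Changing the weights changes which trade-off between objectives the agent is rewarded for, so a single weight vector only recovers one point of the trade-off surface.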

Environments

| Env | Obs/Action spaces | Objectives | Description |
|---|---|---|---|
| deep-sea-treasure-v0 | Discrete / Discrete | [treasure, time_penalty] | Agent is a submarine that must collect a treasure while taking into account a time penalty. Treasure values taken from Yang et al. 2019. |
| resource-gathering-v0 | Discrete / Discrete | [enemy, gold, gem] | Agent must collect gold or gem. Enemies have a 10% chance of killing the agent. From Barrett & Narayanan 2008. |
| fishwood-v0 | Discrete / Discrete | [fish_amount, wood_amount] | ESR environment, the agent must collect fish and wood to light a fire and eat. From Roijers et al. 2018. |
| fruit-tree-v0 | Discrete / Discrete | [nutri1, ..., nutri6] | Full binary tree of depth d=5, 6, or 7. Every leaf contains a fruit with a value for the nutrients Protein, Carbs, Fats, Vitamins, Minerals and Water. From Yang et al. 2019. |
| breakable-bottles-v0 | Discrete (Dictionary) / Discrete | [time_penalty, bottles_delivered, potential] | Gridworld with 5 cells. The agent must collect bottles from the source location and deliver them to the destination. From Vamplew et al. 2021. |
| four-room-v0 | Discrete / Discrete | [item1, item2, item3] | Agent must collect three different types of items in the map and reach the goal. From Alegre et al. 2022. |
| water-reservoir-v0 | Continuous / Continuous | [cost_flooding, deficit_water] | A water reservoir environment. The agent executes a continuous action, corresponding to the amount of water released by the dam. From Pianosi et al. 2013. |
| mo-mountaincar-v0 | Continuous / Discrete | [time_penalty, reverse_penalty, forward_penalty] | Classic Mountain Car env, but with extra penalties for the forward and reverse actions. From Vamplew et al. 2011. |
| mo-reacher-v0 | Continuous / Discrete | [target_1, target_2, target_3, target_4] | Reacher robot from PyBullet, but there are 4 different target positions. From Alegre et al. 2022. |
| minecart-v0 | Continuous or Image / Discrete | [ore1, ore2, fuel] | Agent must collect two types of ores and minimize fuel consumption. From Abels et al. 2019. |
| mo-highway-v0 and mo-highway-fast-v0 | Continuous / Discrete | [speed, right_lane, collision] | The agent's objective is to reach a high speed while avoiding collisions with neighbouring vehicles and staying in the rightmost lane. From highway-env. |
| mo-supermario-v0 | Image / Discrete | [x_pos, time, death, coin, enemy] | Multi-objective version of SuperMarioBrosEnv. Objectives are defined similarly to those in Yang et al. 2019. |
| mo-halfcheetah-v4 | Continuous / Continuous | [velocity, energy] | Multi-objective version of HalfCheetah-v4 env. Similar to Xu et al. 2020. |
| mo-hopper-v4 | Continuous / Continuous | [velocity, height, energy] | Multi-objective version of Hopper-v4 env. |

Citing

If you use this repository in your work, please cite:

@inproceedings{Alegre+2022bnaic,
  author = {Lucas N. Alegre and Florian Felten and El-Ghazali Talbi and Gr{\'e}goire Danoy and Ann Now{\'e} and Ana L. C. Bazzan and Bruno C. da Silva},
  title = {{MO-Gym}: A Library of Multi-Objective Reinforcement Learning Environments},
  booktitle = {Proceedings of the 34th Benelux Conference on Artificial Intelligence BNAIC/Benelearn 2022},
  year = {2022}
}

Acknowledgments

License

MIT License