Bi-level Actor-Critic for Multi-agent Coordination

This repository contains the implementation of the MADDPG-based algorithms (Bi-AC, MADDPG) presented in the paper Bi-level Actor-Critic for Multi-agent Coordination.

It is based on the multi-agent reinforcement learning framework malib.

It is configured to run with a slightly modified version of the highway-env environment.

Installation

To install, follow the installation procedure for malib.
Main dependencies: Python (3.6), OpenAI gym (0.14.0), tensorflow (2.0.0), numpy (1.17.0), matplotlib, pickle.
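
Because the versions above are pinned, it can help to confirm what is actually installed before training. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_environment` are hypothetical, and it assumes the packages were installed under their usual pip names (gym, tensorflow, numpy).

```python
import sys

# Versions listed in this README; treated here as the tested configuration.
EXPECTED = {"gym": "0.14.0", "tensorflow": "2.0.0", "numpy": "1.17.0"}


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        # Available in the standard library from Python 3.8 onwards.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for Python 3.6/3.7; pkg_resources ships with setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED, lookup=installed_version):
    """Return {package: (expected, found)} for each missing or mismatched package."""
    problems = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            problems[package] = (wanted, found)
    return problems


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for package, (wanted, found) in check_environment().items():
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
```

An empty result from `check_environment()` means the three pinned packages match the versions listed above. A mismatch does not necessarily mean the code will fail, only that the setup differs from the one stated here.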

Quick Start

bilevel_pg

This is the directory for the matrix game setting shown in the paper. To run the experiment in this directory, run:

cd bilevel_pg/experiments
python run_trainer.py

bilevel_pg_highway_1x1

This is the directory for the highway-env setting shown in the paper. To run the experiment in this directory, run:

cd bilevel_pg_highway_1x1/bilevel_pg

This directory contains all the training code, and you may run any of the algorithms provided there. For example, to run Bi-AC:

python run_trainer_highway.py

bully_q

This is the directory where we test the Bi-Q method without a neural network. To run the Bi-Q experiment, run:

cd bully_q
python bilevelq_vs_table_q.py
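
This README does not describe how Bi-Q chooses actions. As a rough illustration only, the sketch below assumes the leader-follower (Stackelberg) selection described in the paper: the follower best-responds to each leader action, and the leader picks the action that is best for it given that response. The function name `stackelberg_actions` and the payoff tables are hypothetical and are not taken from `bilevelq_vs_table_q.py`.

```python
def stackelberg_actions(q_leader, q_follower):
    """Pick (leader_action, follower_action) from two tabular Q-tables.

    q_leader[a][b] and q_follower[a][b] are the values each agent assigns
    to the joint action where the leader plays a and the follower plays b.
    Ties are broken in favour of the lowest action index.
    """
    best = None
    for a, follower_row in enumerate(q_follower):
        # Follower's best response to leader action a.
        b = max(range(len(follower_row)), key=follower_row.__getitem__)
        # Leader's value for a, given that the follower responds with b.
        value = q_leader[a][b]
        if best is None or value > best[0]:
            best = (value, a, b)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical 2x2 coordination game with two equilibria, (0, 0) and (1, 1).
    q_leader = [[10, 0], [0, 5]]
    q_follower = [[5, 0], [0, 10]]
    print(stackelberg_actions(q_leader, q_follower))
```

In this hypothetical game the leader's preferred equilibrium, joint action (0, 0), is selected even though the follower would prefer (1, 1). This shows how an ordering between the agents can resolve a coordination problem.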

Paper citation

If you used this code for your experiments or found it helpful, consider citing the following paper:

@article{zhang2019bi,
  title={Bi-level Actor-Critic for Multi-agent Coordination},
  author={Zhang, Haifeng and Chen, Weizhe and Huang, Zeren and Li, Minne and Yang, Yaodong and Zhang, Weinan and Wang, Jun},
  journal={arXiv preprint arXiv:1909.03510},
  year={2019}
}
