philip-jordan / iProx-CMPG

Simulations code for AISTATS'24 paper on "Independent Learning in Constrained Markov Potential Games"

Independent Learning in Constrained Markov Potential Games

This repository provides the code for the simulations of our independent learning algorithm, iProxCMPG, presented in the paper:

Philip Jordan, Anas Barakat and Niao He. "Independent Learning in Constrained Markov Potential Games." International Conference on Artificial Intelligence and Statistics. PMLR, 2024.

To cite our work, please use the following BibTeX entry:

@InProceedings{jordan2024independent,
  title={Independent Learning in Constrained Markov Potential Games},
  author={Jordan, Philip and Barakat, Anas and He, Niao},
  booktitle={Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  publisher={PMLR},
  year={2024}
}

In the simulations section of our paper, we consider two constrained multi-player environments, both of which are inspired by unconstrained variants presented in Narasimha et al. (2022):

  • a demand-response marketplace for energy grids, and
  • a pollution tax model.
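In both environments, each agent maximizes its own reward while the joint policy must additionally keep an expected cumulative constraint cost within a budget. The following is a purely illustrative Python sketch of what such a constrained multi-player environment interface can look like; the class, its placeholder dynamics, and all names below are hypothetical and do not mirror the actual environment code in this repository:

import numpy as np

# Hypothetical interface sketch: NOT this repo's actual environment code.
# A constrained Markov game emits per-agent rewards plus a constraint cost
# at every step; feasible policies keep the expected cumulative cost below
# a given budget.
class ConstrainedMarkovGame:
    def __init__(self, n_agents, n_states, n_actions, seed=0):
        self.n_agents = n_agents
        self.n_states = n_states
        self.n_actions = n_actions
        self.rng = np.random.default_rng(seed)
        self.state = 0

    def step(self, actions):
        # actions: one action index per agent
        next_state = int(self.rng.integers(self.n_states))  # placeholder dynamics
        rewards = self.rng.random(self.n_agents)            # placeholder per-agent rewards
        cost = float(self.rng.random())                     # shared constraint cost
        self.state = next_state
        return next_state, rewards, cost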

Parts of our implementation, e.g., the estimation of value function gradients and the projection onto the policy space, are based on the code provided by Leonardos et al. (2022) for their paper "Global convergence of multi-agent policy gradient in Markov potential games". The respective sections are marked within our code.
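For reference, with a tabular direct policy parameterization, projecting onto the policy space reduces to a Euclidean projection of each state's action distribution onto the probability simplex. Below is a NumPy sketch of the standard sort-based routine for that subproblem; it illustrates the operation, not necessarily the exact implementation borrowed from Leonardos et al. (2022):

import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto the probability simplex
    # {x : x >= 0, sum(x) = 1}, via the standard sort-based algorithm.
    # e.g. project_simplex(np.array([2.0, 0.0])) -> array([1., 0.])
    n = v.shape[0]
    u = np.sort(v)[::-1]                                      # sort descending
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, n + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1.0)                    # shift that renormalizes
    return np.maximum(v + theta, 0.0)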

Instructions

The code was tested with Python 3.9.2. To run the simulations, first install the required packages:

pip install -r requirements.txt

Then, execute the run script, which starts 10 independent runs, each with its own seed, for each of the presented experiments:

./run_simulations.sh

Results are stored in the experiments directory. Since some runs may take up to a few hours on consumer-grade CPUs, we recommend executing multiple runs in parallel. In our experiments, we used a cluster of 15 four-core CPUs. A script for submitting the corresponding jobs to a cluster via the Slurm scheduler is provided in run_simulations_slurm.sh.
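If Slurm is not available, the runs can also be parallelized on a single multi-core machine. The driver below is a hypothetical Python equivalent of launching the seeded runs concurrently; the script name main.py, the --env/--seed flags, and the experiment names are illustrative assumptions, not this repo's actual interface (run_simulations.sh remains the authoritative entry point):

import subprocess
from concurrent.futures import ProcessPoolExecutor

EXPERIMENTS = ["energy_marketplace", "pollution_tax"]  # illustrative names
N_SEEDS = 10  # matches the 10 independent runs per experiment

def launch(job):
    exp, seed = job
    # Hypothetical CLI: adjust to the repo's actual entry point and flags.
    subprocess.run(
        ["python3", "main.py", "--env", exp, "--seed", str(seed)],
        check=True,
    )

if __name__ == "__main__":
    jobs = [(exp, seed) for exp in EXPERIMENTS for seed in range(N_SEEDS)]
    # Cap workers so concurrent multi-core runs don't oversubscribe the CPU.
    with ProcessPoolExecutor(max_workers=4) as pool:
        list(pool.map(launch, jobs))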

To reproduce the plots shown in the paper (Figures 1 and 2), run the following once the simulations have terminated and the results in experiments are complete:

python3 plot.py

If LaTeX is not available on your system, run python3 plot.py --no_latex instead. The plots are saved as PDF files in the plots directory.
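Such a flag typically just toggles matplotlib's LaTeX text rendering, which shells out to a system LaTeX installation when enabled. A minimal sketch of that common pattern (plot.py's actual argument handling may differ):

import argparse
import matplotlib

parser = argparse.ArgumentParser()
parser.add_argument("--no_latex", action="store_true",
                    help="disable LaTeX text rendering in matplotlib")
args = parser.parse_args()

# With usetex enabled, matplotlib requires a working system LaTeX install,
# so it must be turned off on machines without one.
matplotlib.rcParams["text.usetex"] = not args.no_latex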

References

  • [Narasimha et al. (2022)] Narasimha, D., Lee, K., Kalathil, D., and Shakkottai, S. (2022). Multi-agent learning via Markov potential games in marketplaces for distributed energy resources. In 2022 IEEE 61st Conference on Decision and Control (CDC), pages 6350–6357. IEEE.
  • [Leonardos et al. (2022)] Leonardos, S., Overman, W., Panageas, I., and Piliouras, G. (2022). Global convergence of multi-agent policy gradient in Markov potential games. In International Conference on Learning Representations.
