THU-BPM/LLMArena

This is the initial version of this work. The author plans to refactor the code and release the final version of the code before July 15.

Installation

LLMArena is developed based on the OpenRL framework. Therefore, you need to use LLMArena with the following command:

pip install -e .

Usage

First, for closed-source models, such as ChatGPT, you need to use the methods called by the API to experiment. For closed source models, you need to manually encapsulate the model into Openai API form

export OPENAI_API_KEY=<Your API key here>

Then,

cd examples\selfplay\opponent_templates

and create a folder named LLM to be evaluated that contains opportunity.py and info.json under each environment to be evaluated.

Finally,

cd \examples\arena

and replace lines 123 and 174 of run_arena.py with the model and environment to be evaluated, and then

python run_arena.py

About

Code for paper "LLMARENA: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments" accepted by ACL 2024

Languages

Language:Python 99.8%Language:Shell 0.1%Language:Makefile 0.1%Language:Dockerfile 0.0%