This repo provides the source code of our paper: MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents. [PDF][Twitter][Demo]

If you discuss or use MLR-Copilot in your research, please cite us!
    @misc{li2024mlrcopilotautonomousmachinelearning,
      title={MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents},
      author={Ruochen Li and Teerth Patel and Qingyun Wang and Xinya Du},
      year={2024},
      eprint={2408.14033},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2408.14033},
    }
MLR-Copilot is a framework where LLMs mimic researchers’ thought processes, designed to enhance the productivity of machine learning research by automating the generation and implementation of research ideas.
Starting from a research paper, it autonomously generates and validates research ideas, incorporating human feedback along the way to reach executable research outcomes.
Demo (Link): `demo_rec_compress.mov`
MLR-Copilot operates in three integrated phases:
- Research Idea Generation: LLM-powered agents generate research hypotheses and experimental plans based on existing research papers.
- Experiment Implementation: Translates experimental plans into executable experiments using retrieved prototype code and models.
- Implementation Execution: Runs the experiments with mechanisms for human feedback and iterative debugging.

Figure 1: The autonomous machine learning research task. We take the research paper as input and output the research idea (i.e., research hypothesis and experiment plan) with execution results.

Figure 2: Our MLR-Copilot framework. The LLM IdeaAgent (leftmost grey component) performs research idea generation, including hypothesis and experimental design (Stage 1). The ExperimentAgent implements and executes the experiments.
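The three phases above can be viewed as a simple loop. The following is a minimal sketch only; every name in it is an illustrative placeholder, not the repository's actual API:

```python
# Illustrative sketch of MLR-Copilot's three phases. All names below
# (generate_idea, implement_experiment, execute_with_feedback) are
# hypothetical placeholders, not the repository's actual API.

def generate_idea(paper_text: str) -> dict:
    # Stage 1: an LLM IdeaAgent reads the paper and proposes a research idea
    # (a hypothesis plus an experiment plan).
    return {"hypothesis": f"extend: {paper_text[:40]}", "plan": ["baseline", "variant"]}

def implement_experiment(idea: dict) -> str:
    # Stage 2: an ExperimentAgent turns the plan into something executable.
    return "run " + " then ".join(idea["plan"])

def execute_with_feedback(command: str, runner, max_retries: int = 3) -> str:
    # Stage 3: execute; on failure, revise the command and retry.
    for _ in range(max_retries):
        ok, output = runner(command)
        if ok:
            return output
        command += " --debugged"  # placeholder for an LLM/human debugging step
    raise RuntimeError("experiment failed after retries")
```

Here `runner` stands in for actual experiment execution (which the framework performs inside a Docker container), and the real system interleaves human feedback at each stage rather than a fixed retry rule.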
Begin by cloning this repository.
- Place the following keys in a `.env` file at the root of this project: `CLAUDE_API_KEY`, `OPENAI_API_KEY`
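A `.env` file is plain `KEY=value` lines. As a minimal sketch of how such a file can be read (the project may well use a library such as `python-dotenv` instead; this loader is purely illustrative):

```python
def load_dotenv_file(path: str) -> dict:
    """Parse simple KEY=value lines from a .env file (illustrative loader)."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

# Example: expose the parsed keys to the process environment.
# import os
# for k, v in load_dotenv_file(".env").items():
#     os.environ.setdefault(k, v)
```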
- Configure your Hugging Face token as needed so that `huggingface_hub.login()` works if you intend to use Llama.
- Install requirements: `pip install -r requirements.txt`
- Obtain the Docker image `tortcode/nlp-coresearcher`:
  - Build it: `docker build . -t 'tortcode/nlp-coresearcher'`
  - Or pull it from Docker Hub: `docker pull 'tortcode/nlp-coresearcher'`
- Run `bash container.sh` to start the container.
- Place the research idea in the file `problems/<task_name>`.
- Run any preparation scripts as needed.
- Place all starter code in the directory `workspaces/<task_name>`.
- To run the agent with a specific task and LLM (Claude, GPT-4, or Llama), execute `bash run_demo.sh <task_name> <llm_name>`.
  - You must have access to the Meta Llama 3.1 models on Hugging Face to run Llama.
- To suppress error logging, redirect stderr to `/dev/null`: `bash run_demo.sh <task_name> <llm_name> 2>/dev/null`
- Full logs are under `logs/<task_name>/<start_timestamp>/agent_log/full_log.jsonl`.
- Other logs are under `logs/<task_name>/<start_timestamp>/env_log/`.
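Since `full_log.jsonl` is in JSON Lines format (one JSON object per line), it can be inspected with a few lines of Python. The field names inside each record depend on the agent run and are not assumed here:

```python
import json

def read_jsonl(path: str) -> list:
    """Load a .jsonl file: one JSON object per non-empty line."""
    records = []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records

# Example: count the log records and peek at the first one.
# logs = read_jsonl("logs/<task_name>/<start_timestamp>/agent_log/full_log.jsonl")
# print(len(logs), logs[0] if logs else None)
```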
MLR-Copilot is adapted from MLAgentBench, under the MIT License.
Some components are adapted from Prompt2Model, under the Apache License 2.0. Files utilizing API calls have been modified.