QUEST-AI: A System for Question Generation, Verification, and Refinement using AI for USMLE-Style Exams

Overview

QUEST-AI is a system that generates, verifies, and refines USMLE-style exam questions using Large Language Models (LLMs), specifically GPT-4. It aims to streamline the development of medical exam content, offering a cost-effective and efficient way to create study materials and practice questions for the United States Medical Licensing Examination (USMLE).

Features

Question Generation: Utilizes GPT-4 to generate USMLE-style questions.
Verification: An ensemble of LLMs identifies and flags potentially incorrect questions.
Refinement: GPT-4 refines flagged questions to improve accuracy and validity.
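
The verification step can be pictured with a minimal sketch. This is an illustration only, not the code in src/: it assumes each ensemble member is a callable that returns an answer letter, and that a question is flagged when fewer than half of the members agree with the answer key. The names flag_question, Verifier, and min_agreement are hypothetical, and the stand-in verifiers below replace real LLM calls.

```python
from typing import Callable, Sequence

# Hypothetical type: a verifier reads a question and returns an option letter.
Verifier = Callable[[str], str]


def flag_question(
    question: str,
    answer_key: str,
    verifiers: Sequence[Verifier],
    min_agreement: float = 0.5,  # assumed threshold, not taken from QUEST-AI
) -> bool:
    """Return True when too few verifiers agree with the answer key."""
    if not verifiers:
        raise ValueError("at least one verifier is required")
    votes = [verify(question) for verify in verifiers]
    agreement = sum(vote == answer_key for vote in votes) / len(votes)
    return agreement < min_agreement


# Stand-in verifiers that ignore the question; real ones would query an LLM.
def always_a(question: str) -> str:
    return "A"


def always_b(question: str) -> str:
    return "B"


ensemble = [always_a, always_a, always_b]

# Two of three verifiers pick "A", so a question keyed "A" is not flagged.
print(flag_question("Sample question stem", "A", ensemble))
# Only one of three picks "B", so a question keyed "B" is flagged.
print(flag_question("Sample question stem", "B", ensemble))
```

Flagged questions would then be passed to the refinement step. The agreement threshold is a design choice: a higher value flags more questions for refinement at the cost of more GPT-4 calls.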

Project Structure

data/: Contains datasets, including both AI-generated and human-generated questions.
manuscript/: Drafts and related documents for the research paper.
notebooks/: Jupyter notebooks for data analysis and evaluation.
src/: Source code for the QUEST-AI system.

Getting Started

Installation

Clone the repository:

git clone https://github.com/som-shahlab/gpt4usmle.git

Usage

To perform inference with the ensemble of LLMs, run:

bash inference_single_model.sh

To fix or refine incorrect questions generated by GPT-4, run:

python fix_incorrect_questions.py

To categorize questions based on the USMLE content categories, run:

python classify_questions.py

About

Tools for automated evaluation and generation of USMLE-style questions

MIT License