dtruhn / PathStruct

Code and Data for "Extracting Structured Information from Unstructured Histopathology Reports with GPT-4"

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Extracting Structured Information from Unstructured Histopathology Reports with GPT-4


Install

Depending on your setup you have to install the following:
sudo apt-get install tesseract-ocr

Create virtual environment and install packages:
python -m venv venv
source venv/bin/activate
pip install -e .


Get Started

1 ChatGPT API Key

  • create a file secret_api_key.txt in the root folder (next to this file) and copy your secret API key inside

2 Prepare data

  • place all your pdf-files under data/reports

3 Extract text from data

4 Create structured report from text

About

Code and Data for "Extracting Structured Information from Unstructured Histopathology Reports with GPT-4"


Languages

Language:Python 100.0%