abdel-habib/mmg-clip

A vision-language implementation for automated mammography reporting using the CLIP (Contrastive Language-Image Pre-Training) neural network.


MMG-CLIP: Automated Mammography Reporting Through Image-to-Text Translation

Medical image-text datasets have recently seen growing use in the development of deep learning applications, including automated radiology report generation models. In this work, we tackle automated mammography report generation following the Breast Imaging Reporting & Data System (BI-RADS) guidelines. We combine an image-label dataset and exam reports with text prompting techniques to generate well-structured text reports that serve as training supervision.
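As a rough illustration of the prompting idea (a hypothetical sketch, not this repository's actual templates or label schema), structured BI-RADS-style labels can be slotted into sentence templates to produce report-style training text:

# Hypothetical sketch: turn structured exam labels into report-style text.
# Field names and template wording are assumptions, not the repo's prompts.
LABELS = {
    "breast_density": "heterogeneously dense",
    "finding": "mass",
    "laterality": "left",
    "birads": "4",
}

TEMPLATE = (
    "The breasts are {breast_density}. "
    "A {finding} is seen in the {laterality} breast. "
    "BI-RADS category {birads}."
)

def labels_to_report(labels: dict) -> str:
    """Fill a fixed sentence template with per-exam label values."""
    return TEMPLATE.format(**labels)

print(labels_to_report(LABELS))
# -> The breasts are heterogeneously dense. A mass is seen in the left breast. BI-RADS category 4.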

Approach

Usage

To encode images and extract features:

python encode_images.py --config-name {config_file_name}
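The --config-name flag and the timestamped experiment folders used below are consistent with a Hydra-style CLI, though that is an inference from the commands rather than something stated here. A concrete call might look as follows, with a hypothetical config name:

# Hypothetical invocation; "encode_clip" stands in for a real config file name
python encode_images.py --config-name encode_clip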

To create and run a classification experiment:

python train.py --config-name {config_file_name}
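Likewise for training, again with a hypothetical config name:

# Hypothetical invocation; "train_clip" stands in for a real config file name
python train.py --config-name train_clip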

Example command to run inference on a trained model:

python evaluate_clip.py --experiment_path "2024-03-18/11-02-49" --run_name "inference"

Example command to evaluate the CNN image encoder only:

python evaluate_cnn.py 

For report generation, --experiment_path is the experiment folder that contains the model checkpoint:

# To generate a report at exam level
python generate_report.py --experiment_path "2024-04-29/10-17-33" --exam_id "0200011002"

# To generate a report at image level
python generate_report.py --experiment_path "2024-04-29/10-17-33" --image_id "p0200011002cl"

To run tensorboard:

tensorboard --logdir=runs

Useful scripts:

python mmgclip/utils/count_report_len.py --file_path "outputs/2024-04-10/15-14-53/image_description.txt"

Generated Report Examples
