Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation

The reference code of Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation.

Implemented Models

CNN-RNN-RNN (Liu et al., 2019)
Knowing When to Look (Lu et al., 2017)
Meshed-Memory Transformer (Cornia et al., 2020)
Show, Attend and Tell (Xu et al., 2015)
TieNet (Wang et al., 2018)

Supported Radiology Report Datasets

MIMIC-CXR-JPG (Johnson et al., 2019)
Open-i (Demner-Fushman et al., 2012)

Radiology NLI Dataset

NOTE : We are working to make the radiology NLI dataset publicly available.

Prerequisites

A Linux OS (tested on Ubuntu 16.04)
Memory over 24GB
A gpu with memory over 12GB (tested on NVIDIA Titan X and NVIDIA Titan XP)

Preprocesses

Python Setup

Create a conda environment

$ conda env create -f environment.yml

NOTE : environment.yml is set up for CUDA 10.1 and cuDNN 7.6.3. This may need to be changed depending on a runtime environment.

Resize MIMIC-CXR-JPG

Download MIMIC-CXR-JPG
Make a resized copy of MIMIC-CXR-JPG using resize_mimic-cxr-jpg.py (MIMIC_CXR_ROOT is a dataset directory containing mimic-cxr)
- $ python resize_mimic-cxr-jpg.py MIMIC_CXR_ROOT
Create the sections file of MIMIC-CXR (mimic_cxr_sectioned.csv.gz) with create_sections_file.py
Move mimic_cxr_sectioned.csv.gz to MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/

Compute Document Frequencies

Pre-calculate document frequencies that will be used in CIDEr by:

$ python cider-df.py MIMIC_CXR_ROOT mimic-cxr_train-df.bin.gz

Recognize Named Entities

Pre-recognize named entities in MIMIC-CXR by:

$ python ner_reports.py --stanza-download MIMIC_CXR_ROOT mimic-cxr_ner.txt.gz

Pre-train CheXpert Image Weights

Download CheXpert Dataset v1.0
Train a CheXpert classification model by:

$ python train_image.py --cuda --epochs 12 --batch-size 16 --eval-interval 65000 --cache-data cache CheXpert-v1.0-small densenet chexpert_densenet

Download Pre-trained Weights

Download pre-trained radiology NLI weights and GloVe embeddings

$ cd resources
$ ./download.sh

Training a Report Generation Model

First, train the Meshed-Memory Transformer model with an NLL loss.

# NLL
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --cider-df mimic-cxr_train-df.bin.gz --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained chexpert_densenet/model_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --corpus mimic-cxr --lr-scheduler trans MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll

Second, further train the model a joint loss using the self-critical RL to achieve a better performance.

# RL with NLL + BERTScore + EntityMatchExact
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchExact --rl-weights 0.01,0.495,0.495 --entity-match resources/mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained chexpert_densenet/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emexact

# RL with NLL + BERTScore + EntityMatchNLI
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchNLI --rl-weights 0.01,0.495,0.495 --entity-match resources/mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained chexpert_densenet/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emnli

Checking Result with TensorBoard

A training result can be checked with TensorBoard.

$ tensorboard --logdir out_m2trans_nll-bs-emnli/log
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.0.0 at http://localhost:6006/ (Press CTRL+C to quit)

Modification from the original code

In this project we mainly modify the code as two reasons. First we would like to make the model support customized mimic-cxr-jpg dataset since we have no access to the official dataset yet. Second is that we would like to fit the model with IU XRAY dataset. The modified py files are showed as below:

train.py 
train_images.py
clinicgen/eval.py 
clinicgen/utils.py
clinicgen/data/mimiccxr.py
clinicgen/data/iuxray.py

Followed the original IFCC author, we hard code the number of used images in clinicgen/data/mimiccxr.py and clinicgen/data/iuxray.py at the beginning of these files. If you would like to use different dataset, you may modify the numbers

The prject should be able to fit the datasets from below:

https://github.com/cuhksz-nlp/R2Gen

Licence

See LICENSE and clinicgen/external/LICENSE_bleu-cider-rouge-spice for details.

TCBpenta8 / ifcc-1