This repository contains the code necessary to run the VRCG model. In the accompanying paper, we propose the Visual Recalibration and Context Gating-aware model (VRCG), which alleviates visual and textual data bias to enhance report generation. A medical visual recalibration module strengthens the extraction of key lesion features, and a context gating-aware module combines lesion location with report context information to address long-distance dependencies in diagnostic reports.
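To make the gating idea concrete, below is a minimal, illustrative sketch of a sigmoid-gated fusion of visual (lesion) context and textual (report) context. The class name, layer sizes, and fusion rule are assumptions for illustration; the actual module in this repository may differ.

```python
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    """Sketch of context gating: a learned sigmoid gate decides, per
    feature dimension, how much visual vs. textual context to keep.
    Illustrative only; not the exact VRCG implementation."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, visual_ctx, text_ctx):
        # g in (0, 1): per-dimension mixing weights
        g = torch.sigmoid(self.gate(torch.cat([visual_ctx, text_ctx], dim=-1)))
        return g * visual_ctx + (1 - g) * text_ctx

# Example: fuse a batch of 2 visual and textual context vectors of size 512
fused = ContextGate(512)(torch.randn(2, 512), torch.randn(2, 512))
```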
python==3.8
torch==1.11.0+cu111
torchvision==0.8.2
opencv-python==4.4.0.42
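A possible environment setup using the versions listed above (the environment name is arbitrary, and the torch/torchvision pairing is taken directly from this README; the +cu111 build must match your local CUDA install):

```shell
# Hypothetical setup; adjust the CUDA build to your machine.
conda create -n vrcg python=3.8
conda activate vrcg
pip install torch==1.11.0+cu111 torchvision==0.8.2 opencv-python==4.4.0.42
```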
We use the public IU X-Ray dataset in our paper.
For IU X-Ray, you can download the dataset from here and then put the files in data/iu_xray.
| Dataset | TRAIN | VAL | TEST |
|---|---|---|---|
| IMAGE# | 5,226 | 748 | 1,496 |
| REPORT# | 2,770 | 395 | 790 |
| PATIENT# | 2,770 | 395 | 790 |
| AVG. LEN | 37.56 | 36.78 | 33.62 |
models.py: This file contains the overall network architecture of VRCG.
utils: This file contains helper functions used throughout the code.
main_train.py: This file trains the VRCG model.
main_test.py: This file tests the VRCG model.
mvr.py: This file implements the medical visual recalibration module.
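The "visual recalibration" idea can be sketched as channel-wise feature re-weighting in the spirit of squeeze-and-excitation. Everything below (class name, reduction ratio, pooling choice) is an assumption for illustration; the actual mvr.py may be implemented differently.

```python
import torch
import torch.nn as nn

class VisualRecalibration(nn.Module):
    """Sketch of channel recalibration: global-pool each channel,
    pass through a small bottleneck MLP, and re-weight channels so
    lesion-relevant features can be emphasized. Illustrative only."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: one value per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                     # per-channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                          # excite: rescale feature maps

# Example: recalibrate a batch of 2 feature maps with 64 channels
out = VisualRecalibration(64)(torch.randn(2, 64, 7, 7))
```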
Run bash train_iu_xray.sh to train a model on the IU X-Ray data.
Run bash test_iu_xray.sh to test a model on the IU X-Ray data.
This work is supported by a grant from the Natural Science Foundation of China (No. 62072070).