wangleihitcs/ThoraxDiseaseClassification

Intro

A multi-label-classification model for chest diseases.

These are all common toolkits, so I don't give their links.

NIH Chest X-ray Dataset (Kaggle download link)
- copy 'Data_Entry_2017.csv' to 'data/'
- unzip 'images_001.zip' through 'images_012.zip' into 'data/images'
- copy 'train_val_list.txt' and 'test_list.txt' to 'data/'
Pretrained VGG19 model
- download vgg_19_2016_08_28.tar.gz
- extract it and copy 'vgg_19.ckpt' to 'data/pretrain_vgg/'
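
Before preprocessing, it can help to confirm that everything listed above is where the scripts expect it. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and it assumes the paths are relative to the repository root exactly as written in the list.

```python
from pathlib import Path

# Paths named in the setup list above, relative to the repo root.
EXPECTED_PATHS = [
    "data/Data_Entry_2017.csv",
    "data/images",
    "data/train_val_list.txt",
    "data/test_list.txt",
    "data/pretrain_vgg/vgg_19.ckpt",
]


def find_missing(root="."):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected files are in place.")
```

Running it from the repository root prints any path that still needs to be copied or extracted.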

Generate 'data_entry.json' and 'data_label.json':

$ cd preprocess
$ python get_data_entry.py
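
For orientation, here is a rough sketch of the kind of transformation this step involves: turning the `Finding Labels` column of 'Data_Entry_2017.csv' (pipe-separated disease names, or `No Finding`) into multi-hot label vectors. This is an illustration under assumptions, not the code in get_data_entry.py: the function names, the label order, and the returned dictionary layout are all hypothetical, and the actual JSON format written by the script may differ.

```python
import csv
import io

# The 14 disease labels of the NIH dataset; this ordering is an assumption.
LABELS = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration", "Mass",
    "Nodule", "Pneumonia", "Pneumothorax", "Consolidation", "Edema",
    "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]


def encode_findings(findings):
    """Turn e.g. 'Cardiomegaly|Effusion' into a 14-element 0/1 list."""
    present = set(findings.split("|"))
    # 'No Finding' matches no disease name, so it yields an all-zero vector.
    return [1 if name in present else 0 for name in LABELS]


def read_entries(csv_file):
    """Map each image file name to its multi-hot label vector."""
    reader = csv.DictReader(csv_file)
    return {row["Image Index"]: encode_findings(row["Finding Labels"])
            for row in reader}


# Tiny in-memory stand-in for Data_Entry_2017.csv (only the two columns used).
sample = io.StringIO(
    "Image Index,Finding Labels\n"
    "00000001_000.png,Cardiomegaly\n"
    "00000001_001.png,Cardiomegaly|Emphysema\n"
    "00000002_000.png,No Finding\n"
)
entries = read_entries(sample)
print(entries["00000001_001.png"])
```

An image can carry several labels at once, which is why the task is multi-label rather than multi-class.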

Generate 'data/tfrecord/train-xx.tfrecord', 'data/tfrecord/test-xx.tfrecord', 'train_tfrecord_name.txt' and 'test_tfrecord_name.txt':
```
$ python datasets.py    
```

I will release a demo.py that you can use for testing.

You can provide a chest X-ray image to test:

$ python demo.py --img='data/examples/CXR3_IM-1384-1001.png'

Finally, I trained for 100 epochs and the training mlc_loss_weighted decreased to 0.0455; this took 36 hours. You can see details in 'data/log.txt'.
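
The name mlc_loss_weighted suggests a class-balanced multi-label loss. As a point of reference, the ChestX-ray8 paper listed in the references uses a weighted cross-entropy in which positive and negative terms are scaled by β_P = (|P|+|N|)/|P| and β_N = (|P|+|N|)/|N|, with |P| and |N| the number of 1s and 0s in the batch labels. The sketch below implements that formula in plain Python; whether this repository computes its loss exactly this way is an assumption, and the function name and the averaging over batch elements are choices made for illustration.

```python
import math


def weighted_cross_entropy(probs, labels, eps=1e-7):
    """Weighted multi-label cross-entropy over a batch (illustrative sketch).

    probs, labels: lists of rows (batch x classes); probs are sigmoid outputs,
    labels are 0/1.
    """
    flat_p = [p for row in probs for p in row]
    flat_y = [y for row in labels for y in row]
    total = len(flat_y)
    if total == 0:
        return 0.0
    n_pos = sum(flat_y)
    n_neg = total - n_pos
    # Rare positives get a large weight, frequent negatives a small one.
    beta_p = total / n_pos if n_pos else 0.0
    beta_n = total / n_neg if n_neg else 0.0
    loss = 0.0
    for p, y in zip(flat_p, flat_y):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        if y == 1:
            loss += -beta_p * math.log(p)
        else:
            loss += -beta_n * math.log(1.0 - p)
    return loss / total  # averaging over elements is an illustrative choice


print(weighted_cross_entropy([[0.9, 0.2, 0.1]], [[1, 0, 0]]))
```

Because most entries in a multi-hot label matrix are 0, the weighting keeps the few positive labels from being drowned out by the negatives.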

At epoch = 20, iter = 28000, I evaluated the AUC. In fact, the model overfits once epoch > 15, so you don't need to train for many epochs.
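
For a multi-label model, AUC is usually reported per disease. The sketch below shows one way to compute it in plain Python, using the definition of AUC as the probability that a randomly chosen positive image scores higher than a randomly chosen negative one (ties count as one half). The function names are hypothetical and this is not the evaluation code of this repository.

```python
def roc_auc(scores, labels):
    """AUC for one class via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # undefined when a class has only one kind of label
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_class_auc(score_rows, label_rows):
    """Compute one AUC per column of a (samples x classes) score matrix."""
    n_classes = len(label_rows[0])
    return [
        roc_auc([row[c] for row in score_rows], [row[c] for row in label_rows])
        for c in range(n_classes)
    ]


print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The pairwise loop is quadratic in the number of samples, so for a full test set a library routine such as scikit-learn's `roc_auc_score` is the more practical choice.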

Wang, Xiaosong, et al. "ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases." Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on. IEEE, 2017.
Wang, Xiaosong, et al. "TieNet: Text-image embedding network for common thorax disease classification and reporting in chest X-rays." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
