PabloMessina / ihealth_ImageMedicineAI


Medical Vision and NLP at Millennium iHEALTH & CENIA

Denis Parra, CS Professor, PUC Chile

Cecilia Besa, Med Professor, PUC Chile
Jocelyn Dunstan, CS Professor, PUC Chile
Mircea Petrache, Math Professor, PUC Chile
Pablo Messina, PhD(c)
Ivania Donoso, PhD
Gregory Schuit, MSc Student, PUC Chile
Jorge Pérez Facuse, MSc Student, PUC Chile
Rafael Elberg, MSc Student, PUC Chile
Cauã Paz, Undergrad Student, PUC Chile
Itan Felzsentein, Undergrad Student, PUC Chile
Valeria Salas, Undergrad Student, PUC Chile

Former members

José Cañete, Engineer, CENIA

Note: Keep it sorted by year

Papers

Report Generation

  • (method) Tanida et al. (2023). Interactive and Explainable Region-guided Radiology Report Generation. CVPR 2023. [code and links](https://github.com/ttanida/rgrg)

  • (method BioViL-T) Shruthi Bannur et al. (2023). Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing. CVPR 2023. [project page]

  • (method cvt2distilgpt2) Aaron Nicolson, Jason Dowling, and Bevan Koopman (2022). Improving Chest X-Ray Report Generation by Leveraging Warm-Starting. Under review. [code and links]

  • (method) Liu, C. F., Zhao, Y., Miller, M. I., Hillis, A. E., & Faria, A. (2022). Automatic comprehensive radiological reports for clinical acute stroke MRIs. Available at SSRN 4123512. [pdf] [exe]

  • (medical paper) Brady, A. (2022). Language and Radiological Reporting. In Structured Reporting in Radiology (pp. 1-19). Springer, Cham.

  • (method RATCHET) Hou, B., Kaissis, G., Summers, R. M., & Kainz, B. (2021, September). RATCHET: Medical Transformer for Chest X-ray Diagnosis and Reporting. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 293-303). Springer, Cham. [pdf](https://arxiv.org/abs/2107.02104) [code]

  • (method MedViLL) Moon, J. H., Lee, H., Shin, W., & Choi, E. (2021). Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training. arXiv preprint arXiv:2105.11333. [pdf] [code & data]

  • (survey) Messina, P., Pino, P., Parra, D., Soto, A., Besa, C., Uribe, S., ... & Capurro, D. (2022). A survey on deep learning and explainability for automatic report generation from medical images. ACM Computing Surveys (CSUR).

  • (method) Yuan, J., Liao, H., Luo, R., & Luo, J. (2019, October). Automatic radiology report generation based on multi-view image fusion and medical concept enrichment. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 721-729). Springer, Cham. [pdf]

  • (method) Xue, Y., Xu, T., Rodney Long, L., Xue, Z., Antani, S., Thoma, G. R., & Huang, X. (2018, September). Multimodal recurrent model with attention for automated radiology report generation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 457-466). Springer, Cham. [pdf]

  • (method) Miura, Y., Zhang, Y., Tsai, E. B., Langlotz, C. P., & Jurafsky, D. (2020). Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation. arXiv preprint arXiv:2010.10042. [pdf] [code]

Medical VQA

  • (method MedViLL) Moon, J. H., Lee, H., Shin, W., & Choi, E. (2021). Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training. arXiv preprint arXiv:2105.11333. [pdf] [code & data]

  • (method CPRD) Liu, B., Zhan, L. M., & Wu, X. M. (2021, September). Contrastive Pre-training and Representation Distillation for Medical Visual Question Answering Based on Radiology Images. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 210-220). Springer, Cham. [pdf]

  • (method MMQ-VQA) Do, T., Nguyen, B. X., Tjiputra, E., Tran, M., Tran, Q. D., & Nguyen, A. (2021, September). Multiple meta-model quantifying for medical visual question answering. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 64-74). Springer, Cham. [code]

  • Eslami, S., de Melo, G., & Meinel, C. (2021). Does CLIP Benefit Visual Question Answering in the Medical Domain as Much as it Does in the General Domain? arXiv preprint arXiv:2112.13906. [pdf]

  • Vu, M. H., Löfstedt, T., Nyholm, T., & Sznitman, R. (2020). A question-centric model for visual question answering in medical imaging. IEEE Transactions on Medical Imaging, 39(9), 2856-2868. [code] -> relies heavily on MUTAN for its visual-text fusion scheme [code]

  • (method QCR) Zhan, L. M., Liu, B., Fan, L., Chen, J., & Wu, X. M. (2020, October). Medical visual question answering via conditional reasoning. In Proceedings of the 28th ACM International Conference on Multimedia (pp. 2345-2354). [pdf] [code]

  • (method CGMVQA) Ren, F., & Zhou, Y. (2020). CGMVQA: A new classification and generative model for medical visual question answering. IEEE Access, 8, 50626-50636. [paper] [code]

  • (method MEVF) Nguyen, B. D., Do, T. T., Nguyen, B. X., Do, T., Tjiputra, E., & Tran, Q. D. (2019, October). Overcoming data limitation in medical visual question answering. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 522-530). Springer, Cham. [pdf] [code] [data]

  • Many medical VQA works build on [Bilinear Attention Networks], for which a PyTorch implementation is available [code] (see the fusion sketch below).
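
Below is a minimal, self-contained sketch of the low-rank bilinear fusion idea behind BAN/MUTAN-style VQA models, written in PyTorch. It is an illustration only: the class name, feature dimensions, and projection sizes are assumptions, not code from any repository cited above.

```python
import torch
import torch.nn as nn

class LowRankBilinearFusion(nn.Module):
    """Illustrative low-rank bilinear fusion of an image feature and a
    question feature (BAN/MUTAN-style); all dimensions are placeholders."""

    def __init__(self, img_dim=2048, ques_dim=1024, rank=512, out_dim=1024):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, rank)    # project image feature into the joint rank space
        self.ques_proj = nn.Linear(ques_dim, rank)  # project question feature into the joint rank space
        self.out_proj = nn.Linear(rank, out_dim)    # map the fused vector to the output space

    def forward(self, img_feat, ques_feat):
        # The element-wise product of the two projections approximates a full
        # bilinear interaction via a low-rank factorization.
        fused = torch.tanh(self.img_proj(img_feat)) * torch.tanh(self.ques_proj(ques_feat))
        return self.out_proj(fused)

# Toy usage with random features for a batch of 4 image-question pairs.
fusion = LowRankBilinearFusion()
logits = fusion(torch.randn(4, 2048), torch.randn(4, 1024))
print(logits.shape)  # torch.Size([4, 1024])
```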

XAI

  • Lipton, Z. C. (2018). The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3), 31-57. [pdf]

  • Tonekaboni, S., Joshi, S., McCradden, M. D., & Goldenberg, A. (2019, October). What clinicians want: contextualizing explainable machine learning for clinical end use. In Machine learning for healthcare conference (pp. 359-380). PMLR. [pdf]

  • Weber, L., Lapuschkin, S., Binder, A., & Samek, W. (2022). Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement. arXiv preprint arXiv:2203.08008. [pdf]

  • (NLP) Jain, S., & Wallace, B. C. (2019, June). Attention is not Explanation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 3543-3556). [pdf]

  • (NLP) Wiegreffe, S., & Pinter, Y. (2019). Attention is not not explanation. arXiv preprint arXiv:1908.04626. [pdf]

  • Lee, K. H., Park, C., Oh, J., & Kwak, N. (2021). LFI-CAM: Learning Feature Importance for Better Visual Explanation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 1355-1363). [code]

  • Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision (pp. 618-626). [pdf] (a minimal sketch of the Grad-CAM idea follows this list)

  • Counterfactual Generation:

    • Verma, S., Dickerson, J., & Hines, K. (2020). Counterfactual explanations for machine learning: A review. arXiv preprint arXiv:2010.10596. [pdf]

    • Spinks, G., & Moens, M. F. (2019). Justifying diagnosis decisions by deep neural networks. Journal of Biomedical Informatics, 96, 103248. [pdf]

    • Sanchez, P., & Tsaftaris, S. A. (2022). Diffusion Causal Models for Counterfactual Estimation. arXiv preprint arXiv:2202.10166. [pdf]

    • Van Looveren, A., Klaise, J., Vacanti, G., & Cobb, O. (2021). Conditional generative models for counterfactual explanations. arXiv preprint arXiv:2101.10123. [pdf]

    • Thiagarajan, J., Narayanaswamy, V. S., Rajan, D., Liang, J., Chaudhari, A., & Spanias, A. (2021). Designing counterfactual generators using deep model inversion. Advances in Neural Information Processing Systems, 34, 16873-16884. [pdf]

    • Nemirovsky, D., Thiebaut, N., Xu, Y., & Gupta, A. (2020). Countergan: generating realistic counterfactuals with residual generative adversarial nets. arXiv preprint arXiv:2009.05199. [pdf]

    • Chang, C. H., Adam, G. A., & Goldenberg, A. (2021). Towards robust classification model by counterfactual and invariant data generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 15212-15221). [pdf]
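
As a companion to the Grad-CAM reference above, here is a minimal PyTorch sketch of the idea: weight the feature maps of a late convolutional layer by their spatially averaged gradients with respect to a class score, then combine, ReLU, and upsample. The backbone, target layer, and random input are placeholders (and the `weights=None` argument assumes a recent torchvision), not code from the cited paper.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Minimal Grad-CAM sketch with an arbitrary backbone (illustrative only).
model = models.resnet18(weights=None).eval()
target_layer = model.layer4  # last convolutional stage of the backbone

activations, gradients = {}, {}
target_layer.register_forward_hook(lambda m, inp, out: activations.update(value=out))
target_layer.register_full_backward_hook(lambda m, gin, gout: gradients.update(value=gout[0]))

x = torch.randn(1, 3, 224, 224)          # placeholder for a preprocessed chest X-ray
logits = model(x)
logits[0, logits.argmax()].backward()    # gradient of the top-scoring class

# Weight each feature map by its spatially averaged gradient, combine, and ReLU.
weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # normalize heatmap to [0, 1]
print(cam.shape)  # torch.Size([1, 1, 224, 224])
```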

Text Grounding for Visual Tasks

  • Petryk, S., Dunlap, L., Nasseri, K., Gonzalez, J., Darrell, T., & Rohrbach, A. (2022). On Guiding Visual Attention with Language Specification. arXiv preprint arXiv:2202.08926.

  • Ross, A. S., Hughes, M. C., & Doshi-Velez, F. (2017, January). Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations. In IJCAI.

Multimodal & Contrastive Learning

  • Taleb, A., Kirchler, M., Monti, R., & Lippert, C. (2021). ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics. arXiv preprint arXiv:2111.13424.

  • (method) Eslami, S., de Melo, G., & Meinel, C. (2021). Does CLIP Benefit Visual Question Answering in the Medical Domain as Much as it Does in the General Domain? arXiv preprint arXiv:2112.13906. [pdf] [code]

  • (method CPRD) Liu, B., Zhan, L. M., & Wu, X. M. (2021, September). Contrastive Pre-training and Representation Distillation for Medical Visual Question Answering Based on Radiology Images. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 210-220). Springer, Cham. [pdf]

  • (method CMSA) Gong, H., Chen, G., Liu, S., Yu, Y., & Li, G. (2021, August). Cross-Modal Self-Attention with Multi-Task Pre-Training for Medical Visual Question Answering. In Proceedings of the 2021 International Conference on Multimedia Retrieval (pp. 456-460). [pdf] [code]

  • (method PCRL) Zhou, H. Y., Lu, C., Yang, S., Han, X., & Yu, Y. (2021). Preservational Learning Improves Self-supervised Medical Image Models by Reconstructing Diverse Contexts. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3499-3509). [pdf] [code]

  • (comparison of Dual Encoders and Joint Encoders) Hendricks, L. A., Mellor, J., Schneider, R., Alayrac, J. B., & Nematzadeh, A. (2021). Decoupling the role of data, attention, and losses in multimodal transformers. Transactions of the Association for Computational Linguistics, 9, 570-585. [pdf]
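
Several of the contrastive works above share the same core objective: pull matched image-text pairs together and push mismatched pairs apart. The sketch below shows a generic symmetric InfoNCE (CLIP-style) loss in PyTorch; it illustrates the general recipe rather than the loss of any specific paper listed here, and the embedding size and temperature are arbitrary placeholders.

```python
import torch
import torch.nn.functional as F

def clip_style_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings
    (generic illustration of CLIP-style contrastive pre-training)."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature      # pairwise cosine similarities
    targets = torch.arange(len(img_emb))               # matching pairs lie on the diagonal
    loss_i2t = F.cross_entropy(logits, targets)        # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)    # text -> image direction
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random 512-d embeddings for a batch of 8 pairs.
print(clip_style_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512)))
```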

Graph Neural Networks & Knowledge-Aware Methods

  • Agu, N. N., Wu, J. T., Chao, H., Lourentzou, I., Sharma, A., Moradi, M., ... & Hendler, J. (2021, September). Anaxnet: Anatomy aware multi-label finding classification in chest x-ray. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 804-813). Springer, Cham.

Spanish NLP

  • Cotik, V., Stricker, V., Vivaldi, J., & Rodríguez Hontoria, H. (2016). Syntactic methods for negation detection in radiology reports in Spanish. In Proceedings of the 15th Workshop on Biomedical Natural Language Processing, BioNLP 2016: Berlin, Germany, August 12, 2016 (pp. 156-165). Association for Computational Linguistics.

  • Stricker, V., Iacobacci, I., & Cotik, V. (2015). Negated findings detection in radiology reports in Spanish: an adaptation of NegEx to Spanish. In IJCAI Workshop on Replicability and Reproducibility in Natural Language Processing: Adaptive Methods, Resources and Software, Buenos Aires, Argentina.

Labeler Evaluation

  • Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-Ilcus, Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn Ball, Katie Shpanskaya, Jayne Seekins, David A. Mong, Safwan S. Halabi, Jesse K. Sandberg, Ricky Jones, David B. Larson, Curtis P. Langlotz, Bhavik N. Patel, Matthew P. Lungren, and Andrew Y. Ng. (2019). CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison. [pdf] [code]

  • Guanxiong Liu, Tzu-Ming Harry Hsu, Matthew McDermott, Willie Boag, Wei-Hung Weng, Peter Szolovits, and Marzyeh Ghassemi. (2019). Clinically Accurate Chest X-Ray Report Generation. [pdf]

  • Matthew B. A. McDermott, Tzu Ming Harry Hsu, Wei-Hung Weng, Marzyeh Ghassemi, and Peter Szolovits. (2020). CheXpert++: Approximating the CheXpert labeler for speed, differentiability, and probabilistic output. [pdf] [code]

  • Akshay Smit, Saahil Jain, Pranav Rajpurkar, Anuj Pareek, Andrew Y Ng, and Matthew P Lungren. (2020). CheXbert: Combining automatic labelers and expert annotations for accurate radiology report labeling using BERT. [pdf] [code]

  • Saahil Jain, Akshay Smit, Steven QH Truong, Chanh DT Nguyen, Minh-Thanh Huynh, Mudit Jain, Victoria A. Young, Andrew Y. Ng, Matthew P. Lungren, and Pranav Rajpurkar. (2021). VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels. [pdf] [code]

  • Saahil Jain, Akshay Smit, Andrew Y. Ng, and Pranav Rajpurkar. (2021). Effect of Radiology Report Labeler Quality on Deep Learning Models for Chest X-Ray Interpretation. [pdf]

  • Yifan Peng, Xiaosong Wang, Le Lu, Mohammadhadi Bagheri, Ronald Summers, and Zhiyong Lu. (2017). NegBio: a high-performance tool for negation and uncertainty detection in radiology reports. [pdf] [code]

  • Joy T. Wu, Ali Syed, Hassan Ahmad, Anup Pillai, Yaniv Gur, Ashutosh Jadhav, Daniel Gruhl, Linda Kato, Mehdi Moradi, & Tanveer Syeda-Mahmood. (2021). AI Accelerated Human-in-the-loop Structuring of Radiology Reports. [pdf]

Datasets

  • VQA-RAD dataset: A manually constructed VQA dataset in radiology. 315 images and 3515 visual questions. 104 head axial single-slice CTs or MRIs, 107 chest x-rays, and 104 abdominal axial CTs. The VQA-RAD test set contains 151 matched pairs of free-form and paraphrased questions. Download link

  • SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering.

  • MIMIC-CXR

  • IU X-Ray (Open-i)

  • ROCO (Radiology Objects in COntext): a multimodal image dataset providing over 80K samples of ultrasound, X-ray, fluoroscopy, PET, mammography, MRI, and angiography images from various human body regions, e.g., head, neck, jaw and teeth, spine, chest, abdomen, hand, foot, knee, and pelvis. The image–text pairs are captured from PubMed articles; the texts are the relatively short captions (average length of 20 words) associated with the images, which provide rich explanatory information about their content.

  • Chest ImaGenome: A joint rule-based natural language processing (NLP) and CXR atlas-based bounding-box detection pipeline is used to automatically label 242,072 frontal MIMIC CXRs locally. Contributions: 1) 1,256 combinations of relation annotations between 29 CXR anatomical locations (objects with bounding box coordinates) and their attributes, structured as a scene graph per image; 2) over 670,000 localized comparison relations (improved, worsened, or no change) between the anatomical locations across sequential exams; and 3) a manually annotated gold-standard scene graph dataset from 500 unique patients. [data] [pdf] (a hypothetical scene-graph traversal sketch follows this list)

  • VinDr-CXR: An open dataset of chest X-rays with radiologists' annotations, comprising more than 100,000 chest X-ray scans retrospectively collected from two major hospitals in Vietnam. Out of this raw data, 18,000 images were manually annotated by a total of 17 experienced radiologists with 22 local labels (rectangles surrounding abnormalities) and 6 global labels (suspected diseases). The released dataset is divided into a training set of 15,000 and a test set of 3,000. Each scan in the training set was independently labeled by 3 radiologists, while each scan in the test set was labeled by the consensus of 5 radiologists. A dedicated labeling platform for DICOM images was built to facilitate these annotation procedures. All images are made publicly available in DICOM format along with the labels of both the training set and the test set. [data (kaggle)] [pdf]

  • CXR Eye Gaze: Creation and Validation of a Chest X-Ray Dataset with Eye-tracking and Report Dictation for AI Tool Development. [code](https://github.com/cxr-eye-gaze/eye-gaze-dataset)

  • RadReport Template Library https://radreport.org/home/232/2012-07-12%2000:00:00

  • PI-CAI Dataset: https://pi-cai.grand-challenge.org/DATA/
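
Referring back to the Chest ImaGenome entry above: each image comes with a scene graph that links anatomical locations (bounding boxes) to their attributes. The sketch below shows how such a per-image JSON graph might be traversed; the file path and all field names are hypothetical placeholders, so consult the dataset documentation on PhysioNet for the actual schema.

```python
import json

# Hypothetical traversal of one Chest ImaGenome-style scene graph. The path
# and field names ("objects", "attributes", "bbox_name", "coords") are
# illustrative placeholders, not the dataset's verified schema.
with open("scene_graph_example.json") as f:   # placeholder path
    graph = json.load(f)

# Anatomical locations annotated as objects with bounding-box coordinates.
for obj in graph.get("objects", []):
    print(obj.get("bbox_name", "unknown region"), "bbox:", obj.get("coords"))

# Attribute annotations linking each anatomical location to its described findings.
for attr in graph.get("attributes", []):
    print(attr.get("bbox_name"), "->", attr.get("attributes"))
```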

Code

MEVF (2019) code
