
BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks


BiomedGPT

BiomedGPT is built on OFA, but is pre-trained and fine-tuned with multi-modal, multi-task biomedical datasets. Details are given in datasets.md. Feel free to contact us or open an issue.

Please note that this repository is still a work in progress. I am currently occupied with several other commitments 🥵 💻, but I will do my best to complete the main body by June 14. Thank you for your understanding and patience. The plan, based on my current schedule, is:

  • June 9: release pre-trained checkpoints; release the data preprocessing and fine-tuning code for VQA and captioning.
  • June 12: release the data preprocessing and fine-tuning code for NLI and text summarization.
  • June 13: release the data preprocessing and fine-tuning code for image classification; release the data preprocessing scripts for pretraining.
  • June 14: release the pretraining code.

Checkpoints

We provide pretrained checkpoints of BiomedGPT (Dropbox), which can be placed in the scripts/ folder for further development. For fine-tuned checkpoints, please refer to checkpoints.md (coming soon).

Installation

git clone https://github.com/taokz/BiomedGPT
conda env create -f biomedgpt.yml
python -m pip install pip==21.2.4
pip install fairseq
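
After installation, a quick sanity check can confirm that the key dependencies resolve. The helper below is a minimal sketch (not part of the repository); it only assumes the packages installed by the commands above:

```python
import importlib.util

def check_packages(names):
    """Return {package: bool} indicating whether each top-level package is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# The environment created above should provide at least these:
for name, found in check_packages(["torch", "fairseq"]).items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```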



Implementation

We provide the preprocessing, pretraining, fine-tuning, and inference scripts in the scripts/ folder. You can follow the directory layout below:

BiomedGPT/
β”œβ”€β”€ checkpoints/
β”œβ”€β”€ datasets/
β”‚   β”œβ”€β”€ pretraining/
β”‚   β”œβ”€β”€ finetuning/
β”‚   └── ...
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ preprocess/
β”‚   β”‚   β”œβ”€β”€ pretraining/
β”‚   β”‚   └── finetuning/
β”‚   β”œβ”€β”€ pretrain/
β”‚   β”œβ”€β”€ vqa/
β”‚   └── ...
└── ...

Pretraining

Please follow datasets.md to prepare the pretraining datasets, which include four TSV files (vision_language.tsv, text.tsv, image.tsv, and detection.tsv) placed in the directory ./datasets/pretraining/.
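
Each TSV can be sanity-checked with a few lines of Python before launching pretraining. The sketch below assumes only that rows are tab-separated; the actual column schema for each file is specified in datasets.md:

```python
import csv

def peek_tsv(path, n=3):
    """Return the first n rows of a TSV file as lists of column values."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        return [row for _, row in zip(range(n), reader)]

# e.g. peek_tsv("./datasets/pretraining/text.tsv") to verify the file parses
```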

cd scripts/pretraining
bash pretrain_base.sh

Feel free to modify the hyperparameters in the bash script to suit your requirements or ablation studies.

Downstreams

We provide run scripts for fine-tuning and inference; log files are written during execution. Before fine-tuning or inference, please prepare the corresponding datasets and checkpoints as described above.

Visual Question Answering

cd scripts/vqa
# for fine-tuning
bash train_vqa_rad_beam.sh
# for inference
bash evaluate_vqa_rad_beam.sh

Image Captioning

cd scripts/caption
# for fine-tuning
bash train_peir_gross.sh
# for inference
bash evaluate_peir_gross.sh

Text Summarization

cd scripts/text_sum
# for fine-tuning
bash train_meqsum.sh
# for inference
bash evaluate_meqsum.sh

Natural Language Inference

cd scripts/mednli
# for fine-tuning
bash train_mednli.sh
# for inference
bash evaluate_mednli.sh
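
The pattern above (fine-tune, then run inference, capturing logs) can be wrapped in a small helper. The run_task function and the log file names below are hypothetical conveniences, not part of the repository:

```python
import pathlib
import subprocess

def run_task(task_dir, train_script, eval_script):
    """Run a fine-tuning script, then its matching inference script,
    writing each script's combined stdout/stderr to a log file in task_dir."""
    task_dir = pathlib.Path(task_dir)
    for script, log_name in ((train_script, "train.log"), (eval_script, "eval.log")):
        with open(task_dir / log_name, "wb") as log:
            subprocess.run(["bash", script], cwd=task_dir,
                           stdout=log, stderr=subprocess.STDOUT, check=True)

# Example: run_task("scripts/vqa", "train_vqa_rad_beam.sh", "evaluate_vqa_rad_beam.sh")
```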



Related Codebase

Citation

If you use the BiomedGPT model or our code in your publications, please cite 🤗:

@misc{zhang2023biomedgpt,
      title={BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks}, 
      author={Kai Zhang and Jun Yu and Zhiling Yan and Yixin Liu and Eashan Adhikarla and Sunyang Fu and Xun Chen and Chen Chen and Yuyin Zhou and Xiang Li and Lifang He and Brian D. Davison and Quanzheng Li and Yong Chen and Hongfang Liu and Lichao Sun},
      year={2023},
      eprint={2305.17100},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}




License: Apache License 2.0

