kelenlv / SSL-FL

Self-supervised federated learning for medical imaging

Self-supervised Federated Learning (SSL-FL)

Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging


TL;DR: A PyTorch implementation of the self-supervised federated learning framework proposed in our paper, which simulates self-supervised classification on multi-institutional medical imaging data using federated learning.

  • Our framework employs masked image encoding as the self-supervised task to learn efficient representations from images.
  • Extensive experiments are performed on diverse medical datasets, including retinal images, dermatology images, and chest X-rays.
  • In particular, we implement BEiT and MAE as the self-supervised learning modules.
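As an illustration of the masked-image-encoding idea (MAE, for example, masks a large random fraction of image patches and trains the model to reconstruct them), here is a minimal sketch of the random patch-masking step. This is illustrative only, not code from this repository:

```python
import random

def random_mask(num_patches, mask_ratio, seed=None):
    """Choose which patch indices to mask, MAE-style.

    For a 224x224 image split into 16x16 patches,
    num_patches = (224 // 16) ** 2 = 196.
    Returns (masked_indices, visible_indices).
    """
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    return sorted(indices[:num_masked]), sorted(indices[num_masked:])

# MAE typically masks ~75% of patches; the encoder sees only the visible ones.
masked, visible = random_mask(num_patches=196, mask_ratio=0.75, seed=0)
print(len(masked), len(visible))  # 147 49
```

The encoder processes only the visible patches, and a lightweight decoder reconstructs the masked ones, which is what makes this pre-training task label-free.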

Prerequisites

Set Up Environment

  • conda env create -f environment.yml
  • NVIDIA GPU (Tested on Nvidia Tesla V100 32G x 4, and Nvidia GeForce RTX 2080 Ti x 8) on local workstations
  • Python (3.8.12), torch (1.7.1), numpy (1.21.2), pandas (1.4.2), scikit-learn (1.0.2), scipy (1.7.1), seaborn (0.11.2)

Data Preparation

We will release the data preparation instructions and the data soon.

| Dataset | Retina | Derm | COVID-FL | Skin-FL |
|---------|--------|------|----------|---------|
| Link    | link   | TODO | TODO     | TODO    |

Self-supervised Federated Learning for Medical Image Classification

In this paper, we choose ViT-B/16 as the backbone for all the methods:

BEiT-B: #layer=12; hidden=768; FFN factor=4x; #head=12; patch=16x16 (#parameters: 86M)

The models were pretrained with 224x224 resolution. The following tables provide the pre-trained checkpoints used in the paper.
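The configuration quoted above can be sanity-checked with quick arithmetic: at 224x224 resolution with 16x16 patches, each image yields 196 tokens, and the transformer blocks alone account for roughly 85M of the quoted 86M parameters (patch/position embeddings, biases, and layer norms make up the rest). A sketch of that back-of-the-envelope check:

```python
# ViT-B/16 configuration as quoted above.
layers, hidden, ffn_factor, patch, image = 12, 768, 4, 16, 224

num_patches = (image // patch) ** 2              # 14 x 14 = 196 tokens per image
attn_params = 4 * hidden * hidden                # Q, K, V, and output projections
ffn_params = 2 * hidden * (ffn_factor * hidden)  # the two linear layers of the MLP
per_layer = attn_params + ffn_params
block_total = layers * per_layer                 # transformer blocks only (no embeddings/biases)

print(num_patches)                # 196
print(round(block_total / 1e6))   # 85
```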

Self-supervised Federated Pre-training

(i.e., pre-training directly on decentralized target task data)

You can run self-supervised federated pre-training on your own datasets with the following Python files:

  • Fed-BEiT: beit/run_beit_pretrain_FedAvg.py
  • Fed-MAE: mae/run_mae_pretrain_FedAvg.py
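Both scripts aggregate client model updates with FedAvg: after each round of local training, the server averages client parameters weighted by local dataset size. The aggregation step can be sketched in plain Python (the repository's implementation operates on PyTorch state dicts; this is an illustrative sketch, not the repo's code):

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: data-size-weighted average of client parameters.

    client_weights: list of dicts mapping parameter name -> list of floats.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    averaged = {}
    for name in client_weights[0]:
        averaged[name] = [
            sum(w[name][i] * n / total for w, n in zip(client_weights, client_sizes))
            for i in range(len(client_weights[0][name]))
        ]
    return averaged

# Two toy clients; the client with 3x the data pulls the average toward itself.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
print(fedavg(clients, [1, 3]))  # {'w': [2.5, 5.0]}
```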

If you want to test on new datasets, please modify datasets.py and FedAvg_utils/data_utils.py accordingly.
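To illustrate the kind of hook such a modification might add, here is a sketch of registering a dataset by name with one per-client CSV of (path, label) rows. All names here are hypothetical for illustration; they are not actual APIs from datasets.py or FedAvg_utils/data_utils.py:

```python
import csv

# Hypothetical registry mapping a dataset name to per-client CSV files.
DATASET_REGISTRY = {}

def register_dataset(name, client_csvs):
    """Register a new dataset; client_csvs[i] is client i's CSV of (path,label) rows."""
    DATASET_REGISTRY[name] = client_csvs

def load_client_split(name, client_id):
    """Read one client's (image_path, label) pairs from its CSV."""
    with open(DATASET_REGISTRY[name][client_id], newline="") as f:
        return [(row["path"], int(row["label"])) for row in csv.DictReader(f)]
```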

Federated pre-training with Retina

| Method   | Pre-training Data | Central  | Split-1  | Split-2  | Split-3  |
|----------|-------------------|----------|----------|----------|----------|
| Fed-BEiT | Retina            | download | download | download | download |
| Fed-MAE  | Retina            | download | download | download | download |
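The Split-1/2/3 checkpoints correspond to simulated non-IID client partitions of the centralized data. A common way to simulate such label heterogeneity (not necessarily how this paper's splits were generated) is to allocate each class across clients with Dirichlet-distributed proportions; a plain-Python sketch:

```python
import random
from collections import defaultdict

def dirichlet_split(labels, num_clients, alpha, seed=0):
    """Partition sample indices across clients with Dirichlet(alpha) label skew.

    Smaller alpha -> more heterogeneous (non-IID) clients;
    larger alpha -> near-uniform class proportions per client.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    clients = [[] for _ in range(num_clients)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Dirichlet proportions via normalized Gamma draws.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws)
        cuts, acc = [], 0.0
        for d in draws[:-1]:
            acc += d / total
            cuts.append(int(acc * len(idxs)))
        start = 0
        for client, end in zip(clients, cuts + [len(idxs)]):
            client.extend(idxs[start:end])
            start = end
    return clients

labels = [i % 3 for i in range(300)]  # 3 balanced classes
parts = dirichlet_split(labels, num_clients=3, alpha=0.5)
print(sum(len(p) for p in parts))  # 300
```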

Federated pre-training with COVID-FL

| Method   | Pre-training Data | Central  | Real-world Split |
|----------|-------------------|----------|------------------|
| Fed-BEiT | COVID-FL          | download | download         |
| Fed-MAE  | COVID-FL          | download | download         |

Supervised Pre-training with ImageNet-22k

Download the ViT-B/16 weights pre-trained on ImageNet-22k:

  • wget https://storage.googleapis.com/vit_models/imagenet21k/ViT-B_16.npz

See more details in https://github.com/google-research/vision_transformer.

Self-supervised pre-training with ImageNet-22k

BEiT ImageNet: Download BEiT weights pre-trained on ImageNet-22k:

  • wget https://unilm.blob.core.windows.net/beit/beit_base_patch16_224_pt22k.pth

Download the DALL-E tokenizers:

  • wget https://cdn.openai.com/dall-e/encoder.pkl
  • wget https://cdn.openai.com/dall-e/decoder.pkl

MAE ImageNet: Download MAE weights pretrained on ImageNet-22k:

  • wget https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth

Self-supervised Federated Fine-Tuning

You can also run self-supervised federated fine-tuning on your own datasets with the following Python files:

  • Fed-BEiT: beit/run_class_finetune_FedAvg.py
  • Fed-MAE: mae/run_class_finetune_FedAvg.py

Scripts are in beit/script and mae/script. More details about model training will be added.

Funding

This work was funded by the National Institutes of Health (NIH) under grants R01CA256890, R01CA227713, and U01CA242879.

Reference

The current work is on arXiv and under review. If you find our work helpful in your research, or if you use the code or datasets, please consider citing our paper:

Yan, R., Qu, L., Wei, Q., Huang, S.C., Shen, L., Rubin, D., Xing, L. and Zhou, Y., 2022. Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging. arXiv preprint arXiv:2205.08576.

@article{yan2022label,
  title={Label-Efficient Self-Supervised Federated Learning for Tackling Data Heterogeneity in Medical Imaging},
  author={Yan, Rui and Qu, Liangqiong and Wei, Qingyue and Huang, Shih-Cheng and Shen, Liyue and Rubin, Daniel and Xing, Lei and Zhou, Yuyin},
  journal={arXiv preprint arXiv:2205.08576},
  year={2022}
}
