SIFLoc

SIFLoc is a two-stage model consisting of a self-supervised pre-training stage and a supervised learning stage. It aims to improve the recognition of protein subcellular localization in immunofluorescence microscopic images.

Paper: SIFLoc: A self-supervised pre-training method for enhancing the recognition of protein subcellular localization in immunofluorescence microscopic images

The overall network architecture of SIFLoc is shown in the original paper.

The original dataset is from the Human Protein Atlas (www.proteinatlas.org). After post-processing, we obtain a custom dataset split into 4 parts (link1, link2, link3, link4), about 6.5 GB in total.

  • Dataset size: 173,594 color images (512×512), 13,261 bags in 27 classes
  • Data format: RGB images.
    • Note: Data will be processed in src/datasets.py
  • Directory structure of the dataset (a loading sketch follows the tree):
  .hpa
  ├── ENSG00000001167
  │   ├──686_A3_2_blue_red_green.jpg_1.jpg
  │   ├──686_A3_2_blue_red_green.jpg_2.jpg
  │   ├──686_A3_2_blue_red_green.jpg_3.jpg
  │   ├──......
  ├── ENSG00000001630
  ├── ENSG00000002330
  ├── ENSG00000003756
  ├── ......
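
To make the bag structure concrete, here is a minimal loading sketch that treats each gene-ID folder (e.g. ENSG00000001167) as one bag of images. The root path and the helper name are hypothetical; the repository's actual loading logic lives in src/datasets.py.

    from pathlib import Path

    def list_bags(root="hpa"):
        """Group images by gene-ID folder: one folder corresponds to one bag.

        Hypothetical helper for illustration only; the real dataset code
        is in src/datasets.py.
        """
        bags = {}
        for gene_dir in sorted(Path(root).iterdir()):
            if gene_dir.is_dir():
                bags[gene_dir.name] = sorted(str(p) for p in gene_dir.glob("*.jpg"))
        return bags

    if __name__ == "__main__":
        bags = list_bags("hpa")
        print(f"{len(bags)} bags, {sum(len(v) for v in bags.values())} images")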

After installing MindSpore via the official website, you can start training and evaluation as follows:

  • run on Ascend
    # standalone pre-training
    bash scripts/run_pretrain.sh
    # standalone training
    bash scripts/run_train.sh
    # standalone evaluation
    bash scripts/run_eval.sh
  • run on GPU
    # standalone pre-training
    bash scripts/run_pretrain_gpu.sh
    # standalone training
    bash scripts/run_train_gpu.sh
    # standalone evaluation
    bash scripts/run_eval_gpu.sh

Inside the scripts there are parameter settings that can be adjusted for pre-training, training, and evaluation.
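
The stages are intended to run in order: self-supervised pre-training first, then supervised training (which typically starts from the pre-trained checkpoint), then evaluation. A minimal sketch of that sequence in Python, assuming the Ascend scripts are launched from the repository root (substitute the *_gpu.sh variants on GPU):

    import subprocess

    # Run both training stages and evaluation in order; the parameters each
    # stage uses are set in src/config.py and inside the scripts themselves.
    for script in ("scripts/run_pretrain.sh",   # stage 1: self-supervised pre-training
                   "scripts/run_train.sh",      # stage 2: supervised training
                   "scripts/run_eval.sh"):      # evaluation
        subprocess.run(["bash", script], check=True)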

  . SIFLoc
  ├── Readme.md                      # descriptions about SIFLoc
  ├── scripts
  │   ├──run_pretrain.sh             # script to pre-train
  │   ├──run_train.sh                # script to train
  │   ├──run_eval.sh                 # script to eval
  ├── src
  │   ├──RandAugment                 # data augmentation policies
  │   ├──callbacks.py                # loss callback 
  │   ├──config.py                   # parameter configuration
  │   ├──datasets.py                 # creating dataset
  │   ├──eval_metrics.py             # evaluation metrics
  │   ├──loss.py                     # contrastive loss and BCE loss
  │   ├──lr_schedule.py              # learning rate config
  │   ├──network_define_eval.py      # evaluation cell
  │   ├──network_define_pretrain.py  # pre-train cell
  │   ├──network_define_train.py     # train cell
  │   └──resnet.py                   # backbone network
  ├── enhanced.csv                   # labels of hpa dataset
  ├── eval.py                        # evaluation script
  ├── pretrain.py                    # pre-training script
  └── train.py                       # training script

Parameters for both pre-training and training can be set in src/config.py.

  • config for pre-training (a contrastive-loss sketch follows the two listings)

    # base setting
    "description": "description.",        # description for pre-training   
    "prefix": prefix,                     # prefix for pre-training
    "time_prefix": time_prefix,           # time prefix
    "network": "resnet18",                # network architecture
    "low_dims": 128,                      # the dim of last layer's feature
    "use_MLP": True,                      # whether use MLP
    # save
    "save_checkpoint": True,              # whether save ckpt
    "save_checkpoint_epochs": 1,          # save per <num> epochs
    "keep_checkpoint_max": 2,             # save at most <num> ckpt
    # dataset
    "dataset": "hpa",                     # dataset name
    "bag_size": 1,                        # bag size = 1 for pre-training
    "classes": 27,                        # class number
    "num_parallel_workers": 8,            # num_parallel_workers
    # optimizer
    "base_lr": 0.003,                     # init learning rate
    "type": "SGD",                        # optimizer type
    "momentum": 0.9,                      # momentum
    "weight_decay": 5e-4,                 # weight decay
    "loss_scale": 1,                      # loss scale
    "sigma": 0.1,                         # $\tau$
    # trainer
    "batch_size": 32,                     # batch size
    "epochs": 100,                        # epochs for pre-training
    "lr_schedule": "cosine_lr",           # learning rate schedule
    "lr_mode": "epoch",                   # "epoch" or "step"
    "warmup_epoch": 0,                    # epochs for warming up
  • config for training (an optimizer sketch follows the two listings)

    # base setting
    "description": "description.",        # description for pre-training  
    "prefix": prefix,	                    # prefix for training
    "time_prefix": time_prefix,           # time prefix
    "network": "resnet18",                # network architecture
    "low_dims": 128,                      # ignoring this for training
    "use_MLP": False,                     # whether use MLP (False)
    # save
    "save_checkpoint": True,              # whether save ckpt
    "save_checkpoint_epochs": 1,          # save per <num> epochs
    "keep_checkpoint_max": 2,             # save at most <num> ckpt
    # dataset
    "dataset": "hpa",	                    # dataset name
    "bag_size_for_train": 1,              # bag size = 1 for training 
    "bag_size_for_eval": 20,              # bag size = 20 for evaluation
    "classes": 27,                        # class number
    "num_parallel_workers": 8,            # num_parallel_workers
    # optimizer
    "base_lr": 0.0001,                    # init learning rate
    "type": "Adam",                       # optimizer type
    "beta1": 0.5,                         # beta1
    "beta2": 0.999,                       # beta2
    "weight_decay": 0,                    # weight decay
    "loss_scale": 1,                      # loss scale
    # trainer
    "batch_size_for_train": 8,            # batch size for training
    "batch_size_for_eval": 1,             # batch size for evaluation
    "epochs": 20,	                        # training epochs
    "eval_per_epoch": 1,                  # eval per <num> epochs
    "lr_schedule": "cosine_lr",           # learning rate schedule
    "lr_mode": "epoch",                   # "epoch" or "step"
    "warmup_epoch": 0,                    # epochs for warming up
