# Dippa

Master's thesis code repository (Aalto University).

A benchmarking framework for nuclei segmentation models. Encoders come from the [segmentation_models_pytorch](https://github.com/qubvel/segmentation_models.pytorch) library, and some functions and utilities are borrowed from the HoVer-Net repository [2].
## Features

- Easy model building and training, configured with a single `.yml` file
- Image patching
- Inference
- Benchmarking
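Image patching splits large tissue images into fixed-size tiles with a given stride before they are fed to the model. The sliding-window coordinates could be computed as in the sketch below; `patch_coords` is an illustrative helper, not the repository's own implementation, and it assumes the image is at least as large as one patch:

```python
def patch_coords(height, width, patch_size, stride):
    """Top-left (y, x) coordinates of sliding-window patches covering
    a height x width image. Assumes height, width >= patch_size."""
    ys = list(range(0, height - patch_size + 1, stride))
    xs = list(range(0, width - patch_size + 1, stride))
    # Shift a final window so the bottom/right image edges are always covered
    if ys[-1] != height - patch_size:
        ys.append(height - patch_size)
    if xs[-1] != width - patch_size:
        xs.append(width - patch_size)
    return [(y, x) for y in ys for x in xs]

# A 1000x1000 image with 256x256 patches and stride 80 -> 11 x 11 = 121 patches
coords = patch_coords(1000, 1000, 256, 80)
```

With a stride smaller than the patch size the tiles overlap, which is what lets inference blend predictions smoothly at patch borders.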
## Datasets

- Kumar [1]
- CoNSeP [2, 3]
- PanNuke [4, 5] (note the license)
- MoNuSAC (coming soon)
## Installation

- Clone the repository.
- `cd` into the repository:

```shell
cd <path>/Dippa/
```

- Create a virtual environment (optional but recommended):

```shell
conda create --name DippaEnv python=3.6
conda activate DippaEnv
```

or

```shell
python3 -m venv DippaEnv
source DippaEnv/bin/activate
pip install -U pip
```

- Install the dependencies:

```shell
pip install -r requirements.txt
```
## Usage

- Modify the `experiment.yml` file.
- Train the model (see the notebooks):
```python
import src.dl.lightning as lightning
from src.config import CONFIG

# Wrap the model in a pytorch-lightning module
config = CONFIG
lightning_model = lightning.SegModel.from_conf(config)

# Init the trainer and optional callbacks
extra_callbacks = []  # add lightning callbacks to this list
trainer = lightning.SegTrainer.from_conf(config, extra_callbacks=extra_callbacks)

# Use the Pannuke dataset
pannuke = PannukeDataModule(
    database_type="hdf5",
    augmentations=["hue_sat", "non_rigid", "blur"],
    normalize=False,
)

# Train
trainer.fit(model=lightning_model, datamodule=pannuke)
```
An example `experiment.yml`:

```yaml
experiment_args:
  experiment_name: my_experiment
  experiment_version: version2

dataset_args:
  n_classes: 6

model_args:
  architecture_design:
    module_args:
      activation: leaky-relu    # One of (relu, mish, swish, leaky-relu)
      normalization: bn         # One of (bn, bcn, gn, nope)
      weight_standardize: False # Weight standardization
      weight_init: he           # One of (he, TODO: eoc)
    encoder_args:
      in_channels: 3            # RGB input images
      encoder: efficientnet-b5  # https://github.com/qubvel/segmentation_models.pytorch
      pretrain: True            # Use imagenet pre-trained encoder
      depth: 5                  # Number of layers in the encoder
    decoder_args:
      n_layers: 1               # Num of multi-conv blocks in one decoder level
      n_blocks: 2               # Num of conv blocks inside a multi-conv block
      preactivate: False        # If True, BN & ReLU are applied before CONV
      short_skips: null         # One of (residual, dense, null) (decoder)
      long_skips: unet          # One of (unet, unet++, unet3+, null)
      merge_policy: concatenate # One of (summation, concatenate) (long skips)
      upsampling: fixed_unpool  # One of (fixed_unpool) TODO: interp, transconv
      decoder_channels:         # Num of out channels for the decoder layers
        - 256
        - 128
        - 64
        - 32
        - 16
  decoder_branches:
    type_branch: True
    aux_branch: hover           # One of (hover, dist, contour, null)

training_args:
  freeze_encoder: False         # Freeze the weights of the encoder
  weight_balancing: null        # TODO: One of (gradnorm, uncertainty, null)

  input_args:
    normalize_input: False      # Min-max normalize input images after augs
    rm_overlaps: False          # Remove overlapping nuclei borders from masks
    edge_weights: True          # Compute nuclei border weight maps for each input
    augmentations:
      - hue_sat
      - non_rigid
      - blur

  optimizer_args:
    optimizer: radam            # https://github.com/jettify/pytorch-optimizer
    lr: 0.0005
    encoder_lr: 0.00005
    weight_decay: 0.0003
    encoder_weight_decay: 0.00003
    lookahead: True
    bias_weight_decay: True
    scheduler_factor: 0.25
    scheduler_patience: 3

  loss_args:
    inst_branch_loss: dice_ce
    type_branch_loss: dice_ce
    aux_branch_loss: mse_ssim

runtime_args:
  resume_training: False
  num_epochs: 2
  num_gpus: 1
  batch_size: 4
  num_workers: 8                # Number of workers for the data loader
  model_input_size: 256         # Size of the model input
  db_type: hdf5                 # One of (hdf5, zarr)
  wandb: True                   # Wandb logging
  metrics_to_cpu: True          # Lowers GPU memory usage but slows down training
```
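Several fields in the config accept only a fixed set of values (the `One of (...)` comments). A small sketch of how such choices could be checked before a long training run is started; `ALLOWED` and `validate` are hypothetical helpers for illustration, not part of the repository:

```python
# Allowed values for a few of the categorical config fields
ALLOWED = {
    "activation": {"relu", "mish", "swish", "leaky-relu"},
    "normalization": {"bn", "bcn", "gn", "nope"},
    "short_skips": {"residual", "dense", None},
    "long_skips": {"unet", "unet++", "unet3+", None},
    "merge_policy": {"summation", "concatenate"},
    "aux_branch": {"hover", "dist", "contour", None},
}

def validate(conf: dict) -> list:
    """Return a list of error messages for fields with unsupported values."""
    errors = []
    for key, value in conf.items():
        if key in ALLOWED and value not in ALLOWED[key]:
            errors.append(f"{key}: {value!r} not in {sorted(map(str, ALLOWED[key]))}")
    return errors

print(validate({"activation": "leaky-relu", "long_skips": "unet"}))  # []
print(validate({"normalization": "batchnorm"}))  # one error message
```

Failing fast on a typo like `batchnorm` is much cheaper than discovering it mid-training.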
- Run the inference script (see the notebooks):
```python
from src.dl.inference.inferer import Inferer
import src.dl.lightning as lightning
from src.config import CONFIG

in_dir = "my_input_dir"          # input directory for the image files
gt_dir = "my_gt_dir"             # optional (can be None); used for benchmarking
exp_name = "my_experiment"       # name of the experiment (directory)
exp_version = "dense_skip_test"  # name of the experiment version (sub-directory inside the experiment dir)

lightning_model = lightning.SegModel.from_experiment(name=exp_name, version=exp_version)

inferer = Inferer(
    lightning_model,
    in_data_dir=in_dir,
    gt_mask_dir=gt_dir,
    patch_size=(256, 256),
    stride_size=80,
    fn_pattern="*",
    model_weights="last",
    apply_weights=True,
    post_proc_method="cellpose",
    loader_batch_size=1,
    loader_num_workers=1,
    model_batch_size=16,
)

inferer.run_inference(
    save_dir=".../geojson",
    fformat="geojson",
    offsets=True,
)
```
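With `fformat="geojson"`, each segmented nucleus is saved as a polygon geometry that generic GeoJSON tools can read. A minimal sketch of what one such record could look like; this is illustrative only, and the exact property names the repository writes may differ:

```python
import json

# A hypothetical nucleus contour as (x, y) vertices; in GeoJSON the first
# and last vertex of a polygon ring must coincide to close the ring.
contour = [(10, 10), (40, 12), (38, 45), (9, 40), (10, 10)]

feature = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[list(pt) for pt in contour]],
    },
    "properties": {"classification": "neoplastic"},  # hypothetical type label
}

# Serializes to plain JSON, so the result can be inspected with any text editor
geojson_str = json.dumps(feature)
```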
## References

- [1] N. Kumar, R. Verma, S. Sharma, S. Bhargava, A. Vahadane and A. Sethi, "A Dataset and a Technique for Generalized Nuclear Segmentation for Computational Pathology," IEEE Transactions on Medical Imaging, vol. 36, no. 7, pp. 1550-1560, July 2017.
- [2] S. Graham, Q. D. Vu, S. E. A. Raza, A. Azam, Y-W. Tsang, J. T. Kwak and N. Rajpoot, "HoVer-Net: Simultaneous Segmentation and Classification of Nuclei in Multi-Tissue Histology Images," Medical Image Analysis, Sept. 2019.
- [3] Q. D. Vu, S. Graham, T. Kurc, M. N. N. To, M. Shaban, T. Qaiser, N. A. Koohbanani, S. A. Khurram, J. Kalpathy-Cramer, T. Zhao, R. Gupta, J. T. Kwak, N. Rajpoot, J. Saltz and K. Farahani, "Methods for Segmentation and Classification of Digital Microscopy Tissue Images," Frontiers in Bioengineering and Biotechnology 7, 53 (2019).
- [4] J. Gamper, N. A. Koohbanani, S. Graham, M. Jahanifar, S. A. Khurram, A. Azam, K. Hewitt and N. Rajpoot, "PanNuke Dataset Extension, Insights and Baselines," arXiv preprint arXiv:2003.10778 (2020).
- [5] J. Gamper, N. A. Koohbanani, K. Benet, A. Khuram and N. Rajpoot, "PanNuke: An Open Pan-Cancer Histology Dataset for Nuclei Instance Segmentation and Classification," European Congress on Digital Pathology, pp. 11-19 (2019).
- [6] J. C. Caicedo, A. Goodman, K. W. Karhohs et al., "Nucleus Segmentation Across Imaging Experiments: The 2018 Data Science Bowl," Nature Methods 16, 1247-1253 (2019).