# Dippa

Master's thesis code repository (Aalto University).

A benchmarking framework for nuclei segmentation models. Encoders come from the [segmentation_models_pytorch](https://github.com/qubvel/segmentation_models.pytorch) library, and some functions and utilities are borrowed from the HoVer-Net repository [2].
## Features

- Easy model building and training, configured with a single `.yml` file
- Image patching
- Inference
- Benchmarking
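Image patching splits large tissue images into fixed-size tiles with a given stride before they are fed to the model. The sliding-window coordinates could be computed as in the sketch below; `patch_coords` is an illustrative helper, not the repository's own implementation, and it assumes the image is at least as large as one patch:

```python
def patch_coords(height, width, patch_size, stride):
    """Top-left (y, x) coordinates of sliding-window patches covering
    a height x width image. Assumes height, width >= patch_size."""
    ys = list(range(0, height - patch_size + 1, stride))
    xs = list(range(0, width - patch_size + 1, stride))
    # Shift a final window so the bottom/right image edges are always covered
    if ys[-1] != height - patch_size:
        ys.append(height - patch_size)
    if xs[-1] != width - patch_size:
        xs.append(width - patch_size)
    return [(y, x) for y in ys for x in xs]

# A 1000x1000 image with 256x256 patches and stride 80 -> 11 x 11 = 121 patches
coords = patch_coords(1000, 1000, 256, 80)
```

With a stride smaller than the patch size the tiles overlap, which is what lets inference blend predictions smoothly at patch borders.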
## Datasets

- Kumar [1]
- CoNSeP [2, 3]
- PanNuke [4, 5] (note the license)
- MoNuSAC (coming soon)
## Installation

- Clone the repository.
- `cd` into the repository:

```shell
cd <path>/Dippa/
```

- Create a virtual environment (optional but recommended):

```shell
conda create --name DippaEnv python=3.6
conda activate DippaEnv
```

or

```shell
python3 -m venv DippaEnv
source DippaEnv/bin/activate
pip install -U pip
```

- Install the dependencies:

```shell
pip install -r requirements.txt
```
## Usage

- Modify the `experiment.yml` file.
- Train the model (see the notebooks):
```python
import src.dl.lightning as lightning
from src.config import CONFIG

# Wrap the model in a pytorch-lightning module
config = CONFIG
lightning_model = lightning.SegModel.from_conf(config)

# Init the trainer and optional callbacks
extra_callbacks = []  # add lightning callbacks to this list
trainer = lightning.SegTrainer.from_conf(config, extra_callbacks=extra_callbacks)

# Use the Pannuke dataset
pannuke = PannukeDataModule(
    database_type="hdf5",
    augmentations=["hue_sat", "non_rigid", "blur"],
    normalize=False,
)

# Train
trainer.fit(model=lightning_model, datamodule=pannuke)
```
An example `experiment.yml`:

```yaml
experiment_args:
  experiment_name: my_experiment
  experiment_version: version2

dataset_args:
  n_classes: 6

model_args:
  architecture_design:
    module_args:
      activation: leaky-relu    # One of (relu, mish, swish, leaky-relu)
      normalization: bn         # One of (bn, bcn, gn, nope)
      weight_standardize: False # Weight standardization
      weight_init: he           # One of (he, TODO: eoc)
    encoder_args:
      in_channels: 3            # RGB input images
      encoder: efficientnet-b5  # https://github.com/qubvel/segmentation_models.pytorch
      pretrain: True            # Use imagenet pre-trained encoder
      depth: 5                  # Number of layers in the encoder
    decoder_args:
      n_layers: 1               # Num of multi-conv blocks in one decoder level
      n_blocks: 2               # Num of conv blocks inside a multi-conv block
      preactivate: False        # If True, BN & ReLU are applied before CONV
      short_skips: null         # One of (residual, dense, null) (decoder)
      long_skips: unet          # One of (unet, unet++, unet3+, null)
      merge_policy: concatenate # One of (summation, concatenate) (long skips)
      upsampling: fixed_unpool  # One of (fixed_unpool) TODO: interp, transconv
      decoder_channels:         # Num of out channels for the decoder layers
        - 256
        - 128
        - 64
        - 32
        - 16
  decoder_branches:
    type_branch: True
    aux_branch: hover           # One of (hover, dist, contour, null)

training_args:
  freeze_encoder: False         # Freeze the weights of the encoder
  weight_balancing: null        # TODO: One of (gradnorm, uncertainty, null)

  input_args:
    normalize_input: False      # Min-max normalize input images after augs
    rm_overlaps: False          # Remove overlapping nuclei borders from masks
    edge_weights: True          # Compute nuclei border weight maps for each input
    augmentations:
      - hue_sat
      - non_rigid
      - blur

  optimizer_args:
    optimizer: radam            # https://github.com/jettify/pytorch-optimizer
    lr: 0.0005
    encoder_lr: 0.00005
    weight_decay: 0.0003
    encoder_weight_decay: 0.00003
    lookahead: True
    bias_weight_decay: True
    scheduler_factor: 0.25
    scheduler_patience: 3

  loss_args:
    inst_branch_loss: dice_ce
    type_branch_loss: dice_ce
    aux_branch_loss: mse_ssim

runtime_args:
  resume_training: False
  num_epochs: 2
  num_gpus: 1
  batch_size: 4
  num_workers: 8                # Number of workers for the data loader
  model_input_size: 256         # Size of the model input
  db_type: hdf5                 # One of (hdf5, zarr)
  wandb: True                   # Wandb logging
  metrics_to_cpu: True          # Lowers GPU memory usage but slows down training
```
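Several fields in the config accept only a fixed set of values (the `One of (...)` comments). A small sketch of how such choices could be checked before a long training run is started; `ALLOWED` and `validate` are hypothetical helpers for illustration, not part of the repository:

```python
# Allowed values for a few of the categorical config fields
ALLOWED = {
    "activation": {"relu", "mish", "swish", "leaky-relu"},
    "normalization": {"bn", "bcn", "gn", "nope"},
    "short_skips": {"residual", "dense", None},
    "long_skips": {"unet", "unet++", "unet3+", None},
    "merge_policy": {"summation", "concatenate"},
    "aux_branch": {"hover", "dist", "contour", None},
}

def validate(conf: dict) -> list:
    """Return a list of error messages for fields with unsupported values."""
    errors = []
    for key, value in conf.items():
        if key in ALLOWED and value not in ALLOWED[key]:
            errors.append(f"{key}: {value!r} not in {sorted(map(str, ALLOWED[key]))}")
    return errors

print(validate({"activation": "leaky-relu", "long_skips": "unet"}))  # []
print(validate({"normalization": "batchnorm"}))  # one error message
```

Failing fast on a typo like `batchnorm` is much cheaper than discovering it mid-training.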
- Run the inference script (see the notebooks):
```python
from src.dl.inference.inferer import Inferer
import src.dl.lightning as lightning
from src.config import CONFIG

in_dir = "my_input_dir"          # input directory for the image files
gt_dir = "my_gt_dir"             # optional (can be None); used for benchmarking
exp_name = "my_experiment"       # name of the experiment (directory)
exp_version = "dense_skip_test"  # name of the experiment version (sub-directory inside the experiment dir)

lightning_model = lightning.SegModel.from_experiment(name=exp_name, version=exp_version)

inferer = Inferer(
    lightning_model,
    in_data_dir=in_dir,
    gt_mask_dir=gt_dir,
    patch_size=(256, 256),
    stride_size=80,
    fn_pattern="*",
    model_weights="last",
    apply_weights=True,
    post_proc_method="cellpose",
    loader_batch_size=1,
    loader_num_workers=1,
    model_batch_size=16,
)

inferer.run_inference(
    save_dir=".../geojson",
    fformat="geojson",
    offsets=True,
)
```
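With `fformat="geojson"`, each segmented nucleus is saved as a polygon geometry that generic GeoJSON tools can read. A minimal sketch of what one such record could look like; this is illustrative only, and the exact property names the repository writes may differ:

```python
import json

# A hypothetical nucleus contour as (x, y) vertices; in GeoJSON the first
# and last vertex of a polygon ring must coincide to close the ring.
contour = [(10, 10), (40, 12), (38, 45), (9, 40), (10, 10)]

feature = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[list(pt) for pt in contour]],
    },
    "properties": {"classification": "neoplastic"},  # hypothetical type label
}

# Serializes to plain JSON, so the result can be inspected with any text editor
geojson_str = json.dumps(feature)
```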
## References

- [1] N. Kumar, R. Verma, S. Sharma, S. Bhargava, A. Vahadane and A. Sethi, "A Dataset and a Technique for Generalized Nuclear Segmentation for Computational Pathology," IEEE Transactions on Medical Imaging, vol. 36, no. 7, pp. 1550-1560, July 2017.
- [2] S. Graham, Q. D. Vu, S. E. A. Raza, A. Azam, Y-W. Tsang, J. T. Kwak and N. Rajpoot, "HoVer-Net: Simultaneous Segmentation and Classification of Nuclei in Multi-Tissue Histology Images," Medical Image Analysis, Sept. 2019.
- [3] Q. D. Vu, S. Graham, T. Kurc, M. N. N. To, M. Shaban, T. Qaiser, N. A. Koohbanani, S. A. Khurram, J. Kalpathy-Cramer, T. Zhao, R. Gupta, J. T. Kwak, N. Rajpoot, J. Saltz and K. Farahani, "Methods for Segmentation and Classification of Digital Microscopy Tissue Images," Frontiers in Bioengineering and Biotechnology 7, 53 (2019).
- [4] J. Gamper, N. A. Koohbanani, S. Graham, M. Jahanifar, S. A. Khurram, A. Azam, K. Hewitt and N. Rajpoot, "PanNuke Dataset Extension, Insights and Baselines," arXiv preprint arXiv:2003.10778 (2020).
- [5] J. Gamper, N. A. Koohbanani, K. Benet, A. Khuram and N. Rajpoot, "PanNuke: An Open Pan-Cancer Histology Dataset for Nuclei Instance Segmentation and Classification," European Congress on Digital Pathology, pp. 11-19 (2019).
- [6] J. C. Caicedo, A. Goodman, K. W. Karhohs et al., "Nucleus Segmentation Across Imaging Experiments: The 2018 Data Science Bowl," Nature Methods 16, 1247-1253 (2019).