adamshephard / OMTscoring_inference

OMTscoring: A Fully Automated and Explainable Algorithm for the Prediction of Malignant Transformation in Oral Epithelial Dysplasia

This repository provides the inference code for the models used to predict slide-level malignant transformation in oral epithelial dysplasia (OED). Link to preprint here.

The first step in this pipeline is to use HoVer-Net+ (see original paper here) to segment the epithelium and nuclei. The scripts below use the TIAtoolbox (see paper here) implementation of HoVer-Net+. Next, we generate patch-level morphological and spatial features for use in the OMTscoring pipeline. Finally, we compute the OMTscore using our pre-trained MLP model.

Set Up Environment

We use Python 3.10 with the tiatoolbox package installed. By default this uses PyTorch 2.0.

conda create -n tiatoolbox python=3.10
conda activate tiatoolbox
pip install tiatoolbox
pip install h5py
pip install docopt
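
After installation, a quick check that the packages are importable can save a failed run later. The snippet below is a minimal sketch, not part of the repository; it assumes the import names match the package names installed above (tiatoolbox, h5py, docopt).

```python
import importlib.util
import sys

# Import names of the packages installed above (assumed to match the pip names).
REQUIRED_MODULES = ("tiatoolbox", "h5py", "docopt")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python", sys.version.split()[0])
missing = missing_modules()
print("Missing packages:", ", ".join(missing) if missing else "none")
```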

Repository Structure

Below are the main directories in the repository:

  • dataloader/: the data loader and augmentation pipeline
  • utils/: scripts for metrics and patch generation
  • models/: model definition

Below are the main executable scripts in the repository:

  • run_segmentation.py: HoVer-Net+ inference script
  • create_features.py: script to generate features for the final MLP model (using the output of the script above)
  • h5_bag2tiles.py: script to get features into the correct format
  • run_omt_scoring.py: main inference script for OMTscoring
  • create_heatmaps.py: script to generate heatmaps (needs tidying up)

Inference

Data Format

Input:

  • WSIs supported by OpenSlide, including svs, tif, ndpi and mrxs.
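
Before starting a long run, it can help to confirm which files in the input directory will be treated as slides. The following is a minimal sketch that filters by the extensions named above; `find_slides` is a hypothetical helper, not a function from this repository, and the scripts themselves may accept other OpenSlide formats.

```python
from pathlib import Path

# Extensions named in the list above; OpenSlide supports other formats too.
SLIDE_EXTENSIONS = {".svs", ".tif", ".ndpi", ".mrxs"}


def find_slides(input_dir):
    """Return sorted paths of files in `input_dir` with a listed slide extension."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in SLIDE_EXTENSIONS
    )
```

Calling `find_slides("/path/to/input/slides/or/images/dir/")` returns a list of `Path` objects that can be inspected before running the segmentation step.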

Model Weights

The MLP model weights, obtained by training on the Sheffield OED dataset, are available at: OED MLP checkpoint. If you use the model or checkpoint, please cite the corresponding paper.

Usage

Segmentation with HoVer-Net+

The first stage is to run HoVer-Net+ on the WSIs to generate epithelial and nuclei segmentations. This can be quite slow, as the model is run at 0.5 mpp (microns per pixel).

Usage:

  python run_segmentation.py --input_dir="/path/to/input/slides/or/images/dir/" --output_dir="/path/to/output/dir/"

Feature Generation

The second stage is to tessellate the image into smaller patches and generate the corresponding patch-level morphological and spatial features using the nuclei/layer segmentations. Note that hovernetplus_dir is the output directory from the previous step.

Usage:

  python create_features.py --input_dir="/path/to/input/slides/or/images/dir/" --hovernetplus_dir="/path/to/hovernetplus/output/" --output_dir="/path/to/output/feature/dir/"
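
To illustrate what tessellation into overlapping patches means, the sketch below enumerates patch corners on a regular grid. It is not the code used by create_features.py. The patch size of 512 and stride of 256 are assumptions read from the example feature folder name 0.5-mpp_512_256_epith-0.5, and the real script may treat image borders and epithelium filtering differently.

```python
def patch_grid(width, height, patch_size=512, stride=256):
    """Yield (x, y) top-left corners of patches that fit fully inside the image.

    patch_size and stride are in pixels; the defaults are assumed values.
    """
    for y in range(0, height - patch_size + 1, stride):
        for x in range(0, width - patch_size + 1, stride):
            yield (x, y)


# Example: a 1024 x 768 pixel region yields a 3 x 2 grid of overlapping patches.
corners = list(patch_grid(1024, 768))
print(len(corners), corners[0], corners[-1])
```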

We then need to convert the patch output into the right format (one file per tile). We can do this using the following script. Here, the input directory is the directory of bag-level nuclear features created by the previous command, e.g. features/0.5-mpp_512_256_epith-0.5/nuclear/h5_files/.

Usage:

  python h5_bag2tiles.py --input_dir="/path/to/input/bag/features/" --output_dir="/path/to/output/tile/features/"

OMTscoring Inference

The final stage is to run inference with the MLP on the tiles (and their features) generated in the previous steps. Here, input_ftrs_dir is the directory containing the features created in the previous steps, model_checkpoint is the path to the weights provided above, and input_data_file is the path to the data file describing the slides to process. An example file is provided in data_file_template.csv.

Usage:

  python run_omt_scoring.py --input_data_file="/path/to/input/data/file/" --input_ftrs_dir="/path/to/input/tile/ftrs/" --model_checkpoint="/path/to/model/checkpoint/" --output_dir="/path/to/output/dir/"
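
The four commands above can also be chained from a single driver. The following is a minimal sketch under stated assumptions: `build_commands` and `run_pipeline` are hypothetical helpers, not part of this repository, and the driver is assumed to be run from the repository root so the script names resolve. Only the flags shown in the usage lines above are used.

```python
import subprocess
import sys


def build_commands(slides_dir, seg_dir, ftrs_dir, bag_dir, tile_dir,
                   data_file, checkpoint, out_dir):
    """Return the four stage commands, in order, as argument lists.

    bag_dir is the bag-level feature folder inside ftrs_dir
    (e.g. the nuclear/h5_files/ folder mentioned above).
    """
    py = sys.executable
    return [
        [py, "run_segmentation.py",
         f"--input_dir={slides_dir}", f"--output_dir={seg_dir}"],
        [py, "create_features.py",
         f"--input_dir={slides_dir}", f"--hovernetplus_dir={seg_dir}",
         f"--output_dir={ftrs_dir}"],
        [py, "h5_bag2tiles.py",
         f"--input_dir={bag_dir}", f"--output_dir={tile_dir}"],
        [py, "run_omt_scoring.py",
         f"--input_data_file={data_file}", f"--input_ftrs_dir={tile_dir}",
         f"--model_checkpoint={checkpoint}", f"--output_dir={out_dir}"],
    ]


def run_pipeline(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Building the commands separately from running them makes it easy to print and review them first; `check=True` stops the pipeline as soon as one stage exits with an error, so later stages never run on incomplete output.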

OMTscore Heatmaps

We can also generate heatmaps for these images. To create smoother images, change the stride within the file from 128 to a smaller value. Note, however, that decreasing the stride by 2X will increase the processing time by 2X.

Usage:

  python create_heatmaps.py --input_dir="/path/to/input/slides/or/images/dir/" --hovernetplus_dir="/path/to/hovernetplus/output/" --checkpoint_path="/path/to/checkpoint/" --output_dir="/path/to/heatmap/output/dir/"
