Automated VGH Sequence Detector and Processor with included Segmentation Pipeline

This python code is a rule-based NLP interpreter for automatically inferring sequence labels from variable natural text in DICOM files exported from Vancouver General Hospital. The features of this utility are:

Automatically infer sequence types from natural text descriptions present in SeriesDescription tag in DICOM.
Resample images to particular user defined spacing and target shape of the volume.
Convert and save native DICOM files to any SimpleITK supported formats (*nii.gz, *mha, *mhd etc.)

The utility supports automatic detection of the following sequences:

T1
T2
T1CE
T2FLAIR

The assumptions and rules encoded in this utility are available in modules/helpers.py file.

Usage

Dependencies:

[Required] Install python 3 on your system.
- Please refer to the instructions relevant to your system and user privileges.
[Optional] Install virtualenvwrapper.
- Again, refer to official virtualenvwrapper instructions on their website.

Install required packages:

pip install -r requirements.txt

or if you have sudo privileges:

sudo pip install -r requirements.txt

Place your exported DICOM files from VGH into a folder. The default Phillips PACS server folder structure for exported studies is:

<your_internal_path>/IMediaExport/DICOM

Use this as your --path argument.

Invoke the utility using:

python process_vgh_data.py \ 
--path="<path_to_dicom_files>" \
--save_path="default" \
--resample=1 \ 
--target_shape=240x240x155 \
--target_spacing=1x1x1 \
--verbose=0 \
--interpolator=0 \
--out_format="nii.gz"

The command line arguments are:

--path: path to your exported DICOM files. Notice that the path points to the folder INSIDE which the patient folders (PAT_0000...) reside.
--save_path: path to save converted/resampled files.
--resample: whether or not to resample images. Supports resampling to a particular voxel spacing and shape.
--target_shape: target shape if resampling is required, specified in format (WxHxZ).
--target_spacing: target spacing if resampling is required, specified in format (SxSxS).
--verbose: change verbosity level, 0 for info and 1 for debug. Warning: changing this to 1 will output a lot of logging information.
--interpolator: which interpolator to use for resampling:
- (0): BSpline
- (1): NearestNeighbor
- (2): Linear
--out_format: Output format to save files.

Future Work

Perform registration between sequences using a single sequence as template.
Perform automatic skull stripping of sequences.
Provide manual override interface for cases where automatic inference may fail to determine sequences.
Add MM-GAN sequence synthesis feature to synthesize missing sequences.

Application: Segmentation Pipeline

This utility may be used as a preliminary step for any pipeline that requires data to be in a structured and clean format. Typical use cases will be for machine learning/deep learning workflows which requires clean data in the form of sequences of particular shape and voxel spacing. In this project, a fully-working application is included in the form of a segmentation pipeline.

The segmentation pipeline is included in apps/segmentation-pipeline/ca-cnn and uses the open source pretrained models from https://github.com/taigw/brats17 Details regarding how segmentation is performed, and how the ca-cnn code is structured can be found on the original author's repository linked above.

The pipeline can be launched by the command:

bash run_pipeline.sh

However in order to run the pipeline, please ensure:

You have a GPU enabled workstation.
CUDA and CUDNN is installed and working.
Created a new virtualenv using the requirements.txt file INSIDE ca-cnn folder.

The pipeline performs the following steps:

Convert VGH DICOM data into nii.gz data with four sequence (T1, T2, T1CE, T2FLAIR)
Copies the newly converted data into apps/segmentation-pipeline/ca-cnn/data.
Creates a new folder seg_out in apps/segmentation-pipeline/ca-cnn/data.
Prepares configuration file test_names.txt with the testing data present in data folder.
Loads pretrained BRATS2018 segmentation models from model18 into GPU memory.
Loads all data from data.
Runs segmentation algorithm on loaded data.
Write segmentation masks into seg_out folder for each patient.

trane293 / dicom-converter

Automated VGH Sequence Detector and Processor with included Segmentation Pipeline

Usage

Future Work

Application: Segmentation Pipeline

About

Languages