FourthBrainBreastCancer

This is our final project for Fourth Brain.

Cancer Map

Libraries

Stage	Libraries
Prototyping	Pandas, NumPy
WSI Tools	OpenSlide
DL	TensorFlow
API	FastAPI
Front End	Dash?

Sources

White Papers

A Comprehensive Review for Breast Histopathology Image Analysis Using Classical and Deep Neural Networks
A Fast and Refined Cancer Regions Segmentation Framework in Whole-slide Breast Pathological Images
Assessment of Breast Cancer Histology using Densely Connected Convolutional Networks
A Unified Framework for Tumor Proliferation Score Prediction in Breast Histopathology
Deep Learning for Identifying Metastatic Breast Cancer
Detecting Cancer Metastases on Gigapixel Pathology Images - Google 2017
Multi-Stage Pathological Image Classification using Semantic Segmentation

Other Works

Current development / How to use:

The scripts expect the dataset at base_directory/dataset_folder, organized as follows:

base_directory
└── dataset_folder
    ├── training
    │   ├── lesion_annotations
    │   │   └── tumor_001.xml
    │   ├── normal
    │   │   └── normal_001.tif
    │   └── tumor
    │       └── tumor_001.tif
    │
    └── testing
        ├── lesion_annotations
        │   └── test_001.xml
        └── images
            └── test_001.tif
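
As a minimal sketch of how this layout can be walked, the snippet below pairs each tumor slide with its annotation file by shared file stem (tumor_001.tif with tumor_001.xml). The function name `pair_training_slides` is hypothetical and not part of the project's scripts; the folder names are taken from the tree above.

```python
from pathlib import Path


def pair_training_slides(dataset_folder):
    """Return (tumor_pairs, normal_slides) for the training layout above.

    tumor_pairs is a list of (slide_path, annotation_path) tuples.
    Tumor slides without a matching XML annotation are skipped.
    Hypothetical helper, shown for illustration only.
    """
    training = Path(dataset_folder) / "training"
    tumor_pairs = []
    for slide in sorted((training / "tumor").glob("*.tif")):
        # The annotation shares the slide's stem: tumor_001.tif -> tumor_001.xml
        annotation = training / "lesion_annotations" / (slide.stem + ".xml")
        if annotation.exists():
            tumor_pairs.append((slide, annotation))
    normal_slides = sorted((training / "normal").glob("*.tif"))
    return tumor_pairs, normal_slides
```

Calling `pair_training_slides("base_directory/dataset_folder")` would return the tumor/annotation pairs and the list of normal slides found on disk.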

Implemented so far:

generate_tiles.py script:

Takes the normal (negative) and tumor (positive) WSIs along with the corresponding lesion annotations (XML), and stores the resulting tiles in hdfs files in a destination folder.
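
The tiling logic itself is not reproduced here. As a rough sketch of one way a slide can be cut into a regular grid, the hypothetical helper below lists the top-left corner of every full tile that fits in a slide of a given size. The name `tile_origins`, the `stride` parameter, and the non-overlapping default are assumptions for illustration, not the script's actual behavior. In practice the slide size would come from OpenSlide's `dimensions` property, and each origin could be passed to `read_region`.

```python
def tile_origins(slide_width, slide_height, tile_size=312, stride=None):
    """List (x, y) top-left corners of all full tiles that fit in the slide.

    Hypothetical helper: with stride equal to tile_size (the default),
    tiles do not overlap. Partial tiles at the right and bottom edges
    are dropped.
    """
    if stride is None:
        stride = tile_size
    origins = []
    for y in range(0, slide_height - tile_size + 1, stride):
        for x in range(0, slide_width - tile_size + 1, stride):
            origins.append((x, y))
    return origins
```

A real pipeline would also filter these origins, for example keeping only tiles that contain tissue or that fall inside an annotated lesion.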

Note: During the next stage, we will generate augmented tiles to train our model. To add randomness, the tiles generated with generate_tiles.py should be larger than the ones used in read_tiles.py. In our case, we generate tiles of 312 x 312; each tile is later randomly cropped to 256 x 256.
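
The random crop described in the note can be sketched as follows. This is a minimal NumPy illustration, not the code in read_tiles.py; the function name `random_crop` and its `rng` parameter are assumptions. With a 312 x 312 input and a 256 x 256 output there are 57 possible offsets along each axis, which is where the added randomness comes from.

```python
import numpy as np


def random_crop(tile, crop_size=256, rng=None):
    """Cut a random crop_size x crop_size window out of a larger tile.

    tile is an array of shape (height, width) or (height, width, channels).
    Hypothetical helper, shown for illustration only.
    """
    if rng is None:
        rng = np.random.default_rng()
    height, width = tile.shape[:2]
    if height < crop_size or width < crop_size:
        raise ValueError("tile is smaller than the requested crop")
    # integers() excludes the upper bound, so +1 keeps the last offset valid
    top = rng.integers(0, height - crop_size + 1)
    left = rng.integers(0, width - crop_size + 1)
    return tile[top:top + crop_size, left:left + crop_size]
```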

read_tiles.py prepares data for the training and validation process. The generator can be plugged directly into a model.fit() call. Data augmentation and color normalization are performed at that level.
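
To illustrate the general shape of such a generator, here is a minimal sketch assuming the tiles are already loaded as a NumPy array of 312 x 312 RGB images with one label per tile. The name `tile_batches` and all of its parameters are hypothetical. The horizontal flip and the division by 255 are simple placeholders for the project's own augmentation and color normalization, which are not shown here.

```python
import numpy as np


def tile_batches(tiles, labels, batch_size=32, crop_size=256, seed=None):
    """Yield (images, labels) batches forever, with a random crop and flip.

    tiles: array of shape (n, height, width, channels), values 0..255.
    labels: array-like of length n.
    Hypothetical sketch, not the generator implemented in read_tiles.py.
    """
    labels = np.asarray(labels)
    n = len(tiles)
    if n < batch_size:
        raise ValueError("need at least batch_size tiles")
    rng = np.random.default_rng(seed)
    while True:  # loop indefinitely; the caller decides steps per epoch
        order = rng.permutation(n)
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            batch = []
            for i in idx:
                tile = tiles[i]
                top = rng.integers(0, tile.shape[0] - crop_size + 1)
                left = rng.integers(0, tile.shape[1] - crop_size + 1)
                crop = tile[top:top + crop_size, left:left + crop_size]
                if rng.random() < 0.5:
                    crop = crop[:, ::-1]  # placeholder augmentation: horizontal flip
                batch.append(crop)
            # Placeholder normalization: scale pixel values to [0, 1]
            images = np.stack(batch).astype(np.float32) / 255.0
            yield images, labels[idx]
```

Because the generator never terminates, a Keras call would need an explicit step count, for example `model.fit(tile_batches(tiles, labels), steps_per_epoch=len(tiles) // 32, epochs=10)`, where `model`, the batch size, and the epoch count are placeholders.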

Next steps:

Upstream WSI cleaning in generate_tiles.py to improve the quality of the training set generated
Test dataset generation
