Neuron Image Classification

Overview

This project classifies Neuron images produced by the Psychiatry Department of the University of Oxford. The task is to learn to distinguish whether Neurons have been treated with the compound Amyloid-β. The images used are taken after staining with the Cy5 dye, a biomarker for the MAP2 gene found in the Neuronal Cytoskeleton.

Amyloid-β is thought to induce synapse loss. Verifying that this is the case and having a robust classifier will allow researchers to test how well different compounds reduce the effects of treatment with Amyloid-β.

An example of the visualisation produced by the code in this repository: here, Red indicates "Treated" and Blue indicates "Untreated". Regions are coloured Green if the model hasn't produced a confident enough prediction for either class.^[1]

Installation

To clone this repository run:

git clone https://github.com/wfbstone/Neuron-Image-Classification.git
cd Neuron-Image-Classification

To install the requirements, run:

pip install -r requirements.txt

This project was written using iPython 3 on a Kaggle kernel. The original kernels can be found here.

Any suggestions for improvement are greatly appreciated, and I encourage the use and adaptation of my code for other projects. Unfortunately, the dataset cannot be made public at this time, so running the code on the Neuron images is not currently possible.

The Model

The model used for classification is VGG19 with weights pre-trained on the ImageNet dataset, as provided by Keras, but with 3 Fully-Connected layers of sizes 2048, 2048 and 2 on top of the Convolutional layers. The model was fine-tuned to classify the Neurons on a training set of 632 images, with data augmentation including random cropping and random flipping both horizontally and vertically. After 60 epochs through the training set (15 epochs through each possibility of random flipping), the model performed extremely well on the unaugmented test set, achieving an F1 score of 0.96.
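The augmentation described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code from the original kernels: the function names, the 224-pixel crop size and the rule that assigns one flip possibility to each block of 15 epochs are all assumptions made for the example.

```python
import numpy as np

# The four flip possibilities: none, horizontal, vertical, both.
# 4 possibilities x 15 epochs each = 60 epochs in total.
FLIP_MODES = [(False, False), (True, False), (False, True), (True, True)]


def apply_flip(image, horizontal, vertical):
    """Flip an (H, W, C) image along the requested axes."""
    if horizontal:
        image = image[:, ::-1]
    if vertical:
        image = image[::-1]
    return image


def random_crop(image, crop_size, rng):
    """Cut a random crop_size x crop_size patch out of an (H, W, C) image."""
    height, width = image.shape[:2]
    top = rng.integers(0, height - crop_size + 1)
    left = rng.integers(0, width - crop_size + 1)
    return image[top:top + crop_size, left:left + crop_size]


def flip_mode_for_epoch(epoch, epochs_per_mode=15):
    """Hypothetical schedule: epochs 0-14 use mode 0, 15-29 use mode 1, etc."""
    return (epoch // epochs_per_mode) % len(FLIP_MODES)


def augment(image, flip_mode, crop_size=224, rng=None):
    """Apply one flip possibility, then take a random crop (crop size assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    horizontal, vertical = FLIP_MODES[flip_mode]
    return random_crop(apply_flip(image, horizontal, vertical), crop_size, rng)
```

The test set is left unaugmented, so a function like `augment` would only be applied to training images.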

To verify that this performance isn't due to random artifacts in the data, both Grad-CAM and a Saliency Map were implemented.

Visualisation

For visualising the effects of treatment with Amyloid-β, neither Grad-CAM nor the Saliency Map is particularly insightful on its own. Grad-CAM is great for highlighting regions of interest but tends to highlight the entire network of Neurons, while the Saliency Map is great at highlighting individual Nuclei and Neurites.

To produce an effect mapping, the Sliding Windows algorithm is implemented convolutionally, as described in the OverFeat paper. The Pooling layers within the VGG19 mean that this technique doesn't give as fine-grained an output as the full Sliding Windows algorithm, but given the fast run time it works fine. This technique is better at distinguishing which regions look more "Treated" or "Untreated", but it cannot pick out the individual Neurons.
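To see why the Pooling layers limit the resolution, it may help to work out the geometry. VGG19 has five 2x2 max-pooling layers, so adjacent cells of the convolutional output correspond to windows 32 pixels apart in the input. The sketch below assumes a 224-pixel window (the standard ImageNet input size; the size actually used in this project is not stated here), and the function names are hypothetical.

```python
def sliding_window_grid(height, width, window=224, stride=32):
    """Number of window positions (rows, cols) the convolutional pass yields.

    stride=32 follows from VGG19's five 2x2 pooling layers (2**5 = 32).
    window=224 is an assumption for illustration.
    """
    if height < window or width < window:
        raise ValueError("image must be at least as large as the window")
    rows = (height - window) // stride + 1
    cols = (width - window) // stride + 1
    return rows, cols


def window_bounds(row, col, window=224, stride=32):
    """Pixel bounds (top, left, bottom, right) of the window behind one cell."""
    top, left = row * stride, col * stride
    return top, left, top + window, left + window
```

For example, `sliding_window_grid(288, 352)` gives a 3 x 5 grid of predictions from a single forward pass, whereas a full Sliding Windows search with a stride of 1 pixel would need 65 x 129 separate evaluations. Neighbouring windows overlap heavily (224 pixels wide, 32 pixels apart), which is one reason the output map does not line up one-to-one with image regions.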

Combining this technique with the Saliency Map produces a very useful visualisation of the effect of treatment with Amyloid-β. Colouring the Saliency Map according to the class the Sliding Windows algorithm assigned to each region shows which Neurites appear to have been treated by the compound and which don't.
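One simple way to combine the two maps is sketched below, as an illustration rather than the repository's implementation. It assumes the Saliency Map is a 2D array and the Sliding Windows output has already been reduced to a small grid of class labels (including "Unsure"); the constant and function names are hypothetical, and the nearest-neighbour upsampling ignores the overlap between windows.

```python
import numpy as np

UNTREATED, TREATED, UNSURE = 0, 1, 2

# RGB colour per class label, indexed by the label value.
COLOURS = np.array([
    [0.0, 0.0, 1.0],  # Untreated -> Blue
    [1.0, 0.0, 0.0],  # Treated   -> Red
    [0.0, 1.0, 0.0],  # Unsure    -> Green
])


def colour_saliency(saliency, class_grid):
    """Tint an (H, W) saliency map using a coarse (rows, cols) grid of labels."""
    height, width = saliency.shape
    rows, cols = class_grid.shape

    # Nearest-neighbour upsampling: find the grid cell covering each pixel.
    row_idx = np.arange(height) * rows // height
    col_idx = np.arange(width) * cols // width
    class_map = class_grid[np.ix_(row_idx, col_idx)]

    # Rescale saliency to [0, 1] so it acts as per-pixel brightness.
    span = saliency.max() - saliency.min()
    if span > 0:
        intensity = (saliency - saliency.min()) / span
    else:
        intensity = np.zeros_like(saliency, dtype=float)

    # Result is an (H, W, 3) RGB image with values in [0, 1].
    return COLOURS[class_map] * intensity[..., None]
```

Because the saliency value controls brightness, only the Nuclei and Neurites picked out by the Saliency Map are visible, and the colour shows which class the surrounding region was assigned.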

This map also includes an "Unsure" option. To set the confidence threshold for it, Youden's J Statistic (sensitivity + specificity - 1) is computed at each point on the ROC Curve, and the threshold that maximises J is taken as optimal.
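As a rough sketch of the threshold selection, Youden's J at a given threshold is the true positive rate minus the false positive rate. The NumPy function below (a hypothetical name, not taken from the original kernels) evaluates J at every distinct score and returns the best threshold. It assumes binary labels with "Treated" as the positive class and that a score at or above the threshold counts as a positive prediction.

```python
import numpy as np


def youden_threshold(labels, scores):
    """Return (threshold, J) maximising Youden's J = TPR - FPR."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = labels.sum()
    n_neg = (~labels).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Each distinct score is a candidate threshold (one ROC point each).
    thresholds = np.unique(scores)
    predicted = scores[None, :] >= thresholds[:, None]  # (n_thresholds, n_samples)

    tpr = (predicted & labels).sum(axis=1) / n_pos
    fpr = (predicted & ~labels).sum(axis=1) / n_neg
    j = tpr - fpr

    # On ties, np.argmax picks the lowest threshold among the maximisers.
    best = int(np.argmax(j))
    return float(thresholds[best]), float(j[best])
```

For instance, `youden_threshold([0, 0, 1, 1], [0.1, 0.2, 0.7, 0.9])` returns `(0.7, 1.0)`, since a threshold of 0.7 separates the two classes perfectly.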

Extensions

Improve the robustness of the sliding windows classification to reduce the number of contradictory results, e.g.:
Experiment with pooling and padding the sliding windows output map: this map doesn't directly correspond to which regions should be classified as what (there is overlap etc.).
