Classifier: cells with free edges

The goal of this classifier is to identify cells with free edges in High Content image acquisitions. Figure 1 shows an example of cells (nuclei) located at the edges of the monolayer, marked in red. This folder contains the "data preparation" code (R) and the "machine learning classifier" code (MATLAB/Octave). A description of the pipeline follows.

Figure 1. An example of cells (nuclei) located at the edges of the monolayer (marked in red).

Below I show a visualization of the classified images. The color code indicates the probability that a cell is at the edge of the population. The classifier has an accuracy of 95.4%. See more images in the cross-validation section below.


Figure 2. Image acquisition used for the classifier cross validation (nuclear DNA is shown).


Figure 3. Visualization of the results from the cross validation performed directly on the image acquisition. The color code indicates the probability that a cell is at the edge of the population.

Data preparation 1

Clean and visualize features and related statistics
Transform features: substitute values and normalize absolute values
Shuffle and split the data set into training and test sets
Save training and test data (cross validation will be performed directly on images)
R code: PIPELINE/1_DATA_PREPARATION1.rmd
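
The shuffle-and-split step can be illustrated with a short sketch. The repository does this in R; the Python below is not that code. The function name `shuffle_and_split`, the 70/30 split, the seed and the toy rows are all assumptions made for illustration.

```python
import random


def shuffle_and_split(rows, train_fraction=0.7, seed=0):
    """Shuffle rows reproducibly and split them into training and test sets."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy, so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    n_train = int(round(len(shuffled) * train_fraction))
    return shuffled[:n_train], shuffled[n_train:]


# Toy rows of (feature_1, feature_2, label); the values are made up.
cells = [(float(i), float(i) * 2.0, i % 2) for i in range(10)]
train, test = shuffle_and_split(cells, train_fraction=0.7, seed=42)
```

Fixing the seed keeps the split identical between runs, which matters when the training and test files are saved and reused by later steps.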

Data preparation 2

Normalize distributions (e.g., log transformation)
Rescale features
Save: transformed data set, mean and stdev of features to be used for testing and cross validation
R code: PIPELINE/2_DATA_PREPARATION2_NORM_RESCAL.rmd
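
A minimal sketch of the normalize-then-rescale idea, in Python rather than the repository's R. It assumes that "rescaling" means subtracting the mean and dividing by the standard deviation, which is suggested by the saved mean and stdev but not stated. The use of `log1p`, the function names and the sample values are also assumptions.

```python
import math


def log_transform(column):
    """Apply log(1 + x) to compress right-skewed, non-negative values."""
    return [math.log1p(x) for x in column]


def fit_scaler(column):
    """Return the mean and population standard deviation of a column."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    return mean, math.sqrt(variance)


def rescale(column, mean, std):
    """Center and scale a column with previously computed statistics."""
    return [(x - mean) / std if std > 0 else 0.0 for x in column]


# Made-up values for one training feature.
areas = [10.0, 20.0, 40.0, 80.0, 160.0]
logged = log_transform(areas)
mean, std = fit_scaler(logged)  # statistics come from training data only
scaled_train = rescale(logged, mean, std)

# New cells (test or cross validation) reuse the saved training statistics.
scaled_new = rescale(log_transform([30.0, 60.0]), mean, std)
```

Reusing the training mean and stdev on new data, instead of recomputing them, keeps every data set on the scale the classifier was trained on.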

The sub-folders "1_DATA" to "3_DATA" contain the files with:

original data
prepared data
shuffled and split data for training and test sets
transformed data (log and rescaling)
mean and stdev of features from transformed data

CLASSIFIER: TRAINING

MATLAB/Octave file for the training is:

m3_TRAINING.m

functions for cost function, sigmoid and predictor are in:

3_LIB

input data files are in:

2_DATA and 3_DATA

output data files are in:

4_DATA
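
The cost function, sigmoid and predictor named above suggest a logistic regression classifier; that reading is an assumption, as is everything in the sketch below. It is written in Python, not taken from m3_TRAINING.m or 3_LIB, and it uses plain batch gradient descent with a made-up learning rate, iteration count and data set.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(theta, x):
    """Probability of class 1; theta[0] is the intercept."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return sigmoid(z)


def cost(theta, X, y):
    """Mean cross-entropy between labels and predicted probabilities."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for x, label in zip(X, y):
        p = min(max(predict_proba(theta, x), eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(X)


def train(X, y, learning_rate=0.5, iterations=2000):
    """Fit theta by batch gradient descent on the cross-entropy cost."""
    theta = [0.0] * (len(X[0]) + 1)
    m = len(X)
    for _ in range(iterations):
        grad = [0.0] * len(theta)
        for x, label in zip(X, y):
            err = predict_proba(theta, x) - label
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        theta = [t - learning_rate * g / m for t, g in zip(theta, grad)]
    return theta


# Made-up, already rescaled single-feature data: label 1 stands for "edge cell".
X = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y = [0, 0, 0, 1, 1, 1]
theta = train(X, y)
```

With all parameters at zero every prediction is 0.5 and the cost equals ln 2, so a falling cost during training is a quick sanity check.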

CLASSIFIER: TESTING

MATLAB/Octave file for the testing is:

m3_TESTING.m

sigmoid and predictor functions are in:

3_LIB

input data files are in:

2_DATA and 3_DATA

output data files are in:

4_DATA
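
Testing amounts to applying the predictor to held-out cells and comparing the result with the known labels. The Python sketch below is not the code in m3_TESTING.m. It assumes a logistic model with a 0.5 decision threshold, and the parameter vector `theta` and the test rows are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(theta, x, threshold=0.5):
    """Return 1 (edge cell) when the probability reaches the threshold."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return 1 if sigmoid(z) >= threshold else 0


def accuracy(theta, X, y):
    """Fraction of cells whose predicted label matches the true label."""
    correct = sum(1 for x, label in zip(X, y) if predict(theta, x) == label)
    return correct / len(X)


# Hypothetical trained parameters: intercept 0, one feature weight of 3.
theta = [0.0, 3.0]

# Made-up test rows, assumed to be rescaled with the training mean and stdev.
X_test = [[-1.5], [-0.2], [0.3], [1.2]]
y_test = [0, 1, 1, 1]
acc = accuracy(theta, X_test, y_test)  # three of the four cells are correct
```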

CLASSIFIER: CROSSVALIDATION

R code for the cross validation:

4_CROSS_VALIDATION.Rmd

input data files are in:

4_DATA

output data files are in:

5_DATA

Below I show various images and visualizations of the results from the cross validation performed directly on the image acquisition. The color code indicates the probability that a cell is at the edge of the population. The classifier has an accuracy of 95.4%.
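
To illustrate how a probability can be turned into a color code, here is a small Python sketch. The actual color map used for the figures is not stated, so the linear blue-to-red ramp and the function name `probability_to_rgb` are assumptions.

```python
def probability_to_rgb(p):
    """Map a probability in [0, 1] to an RGB tuple from blue (0) to red (1)."""
    p = min(max(p, 0.0), 1.0)  # clamp out-of-range inputs
    red = int(round(255 * p))
    blue = 255 - red
    return (red, 0, blue)


# Made-up per-cell probabilities of being at the edge of the population.
probabilities = [0.02, 0.40, 0.97]
colors = [probability_to_rgb(p) for p in probabilities]
```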

Cross validation 1 (3545 cells)

Figure 2. Image acquisition used for the classifier cross validation (nuclear DNA is shown).

Figure 3. Visualization of the results from the cross validation performed directly on the image acquisition. The color code indicates the probability that a cell is at the edge of the population.

Figure 4. A different visualization of the results from the cross validation performed directly on the image acquisition. The color code indicates the probability that a cell is at the edge of the population.

Cross validation 2 (7543 cells)

Figure 5. Image acquisition used for the classifier cross validation (nuclear DNA is shown).

Figure 6. Visualization of the results from the cross validation performed directly on the image acquisition. The color code indicates the probability that a cell is at the edge of the population.

