Jimut123 / cellseg

Code for the paper titled "Advancing instance segmentation and WBC classification in peripheral blood smear through domain adaptation: A study on PBC and the novel RV-PBS datasets", published in Elsevier's Expert Systems With Applications (ESWA) journal.

Home Page: https://www.sciencedirect.com/science/article/pii/S0957417424005268


Refactor the code, set up a proper folder structure, and add new code as progress continues

Jimut123 opened this issue

@heraldofsolace Look into this matter please...

Also need to build slides, deadline: 14th March. Need to discuss that soon...

@heraldofsolace Probably suggest a good name for this repo xD

Hey @heraldofsolace I forgot about this task... is this done?
https://colab.research.google.com/drive/16LMsoN0y__pAEVcrqzIb9bJ8y2Fogv7_?usp=sharing

Problem: Classification Model with the following specifications

Task:

  • Use the data generator from Keras for data augmentation with a batch size of 24, probably in Google Colab for now.
  • Make sure to check the data distribution first (using histograms from matplotlib); if there is any class imbalance or similar issues, we have to take the necessary actions and ask for help.
  • Split the data 80-10-10 (80% training, 10% validation and 10% test).
  • Record precision, recall, accuracy, loss and F1 score on each dataset, i.e., the training, validation and test sets. This can be done using scikit-learn.
  • Plot and display all the graphs obtained.
  • Make a confusion matrix and plot it using matplotlib.
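The split and metric-recording steps above could be sketched with scikit-learn roughly as follows; the labels here are random placeholders standing in for the real per-image classes, and the model itself is omitted:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import (precision_score, recall_score, accuracy_score,
                             f1_score, confusion_matrix)

# Placeholder labels standing in for the real per-image class labels.
rng = np.random.default_rng(0)
y = rng.integers(0, 8, size=1000)          # 8 coarse classes, hypothetical
X = np.arange(len(y)).reshape(-1, 1)       # stand-in for image indices

# 80-10-10 split: first carve off 20%, then halve it into validation/test.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=42)

# After training, record the metrics per split (dummy predictions here).
y_pred = rng.integers(0, 8, size=len(y_test))
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred, average='macro', zero_division=0))
print("recall   :", recall_score(y_test, y_pred, average='macro', zero_division=0))
print("f1       :", f1_score(y_test, y_pred, average='macro', zero_division=0))
cm = confusion_matrix(y_test, y_pred)      # plot with plt.imshow(cm) if wanted
```

Stratifying both splits keeps the class proportions roughly equal across train/validation/test, which matters once the imbalance discussed below comes into play.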

TODO

  • K-fold cross-validation
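The K-fold TODO could be sketched with scikit-learn's `StratifiedKFold`, which keeps the class proportions in every fold; the labels and the training step here are placeholders:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
y = rng.integers(0, 8, size=1000)          # hypothetical class labels
X = np.arange(len(y)).reshape(-1, 1)       # stand-in for image indices

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_sizes = []
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Here a fresh model would be built and trained on X[train_idx],
    # then evaluated on X[val_idx]; metrics averaged across folds.
    fold_sizes.append(len(val_idx))
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} val")
```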

Working on it but I feel dumb lol

Lol, Tamal Mj's laptop's CUDA just messed up for some reason, I need to fix that tomorrow... Let's see. Studying a few papers today.

In the dataset, "ig" has 4 different subtypes, "IG", "MMY", "MY" and "PMY", and Neutrophil has 3 different subtypes, "BNE", "SNE" and "Neutrophil".

I have seen that before; I thought that 8 classes would be good, but eventually we need them too. What would be better? Having a classifier to screen the first 8 classes first, and then applying different screening techniques from there? Or what do you suggest?

This is the cell glossary we will get from the slides

Blast (Bl)
Promyelocyte (PM)
Myelocyte (My)
Metamyelocyte (Me)
Band form  (Band) (Not a cell)
Neutrophil (N)
Eosinophil (Eo)
Basophil (Ba)
Lymphocyte (L)
Monocyte (Mo)
Nucleated RBC (NRBC)

I was thinking, maybe take them all as separate classes, augment each separately until it reaches around 2000 samples, and combine them into one dataset.
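A minimal sketch of that balancing idea, using only NumPy flips and rotations as stand-in augmentations on random placeholder arrays (the real pipeline would use a richer set of transforms):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_once(img, rng):
    """Apply one random, label-preserving transform (flip or 90-degree rotation)."""
    op = rng.integers(0, 3)
    if op == 0:
        return np.flip(img, axis=0)        # vertical flip
    if op == 1:
        return np.flip(img, axis=1)        # horizontal flip
    return np.rot90(img)                   # 90-degree rotation

def balance_class(images, target, rng=rng):
    """Augment one class's images until the class holds `target` samples."""
    out = list(images)
    while len(out) < target:
        base = out[rng.integers(0, len(images))]  # always pick an original
        out.append(augment_once(base, rng))
    return out

# Toy example: a minority class of 50 tiny placeholder "images",
# grown to a scaled-down target of 200 (2000 in the real dataset).
minority = [rng.random((8, 8, 3)) for _ in range(50)]
balanced = balance_class(minority, target=200)
print(len(balanced))
```

Doing this per class and then concatenating the classes gives the combined, balanced dataset.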

So let's do that then. What techniques will you use?

  • color jittering
  • flips
  • zoom
  • blurring (?)
  • maybe noise (?)
  • maybe (small) random crop of some image into a particular image
  • rotation
  • affine (?)
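Most of the techniques listed map directly onto arguments of Keras's `ImageDataGenerator`; blurring and noise would need a custom `preprocessing_function`, and every range below is a guess to be tuned:

```python
import numpy as np
import tensorflow as tf

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=30,            # rotation
    zoom_range=0.15,              # zoom
    channel_shift_range=30.0,     # a simple form of color jittering
    horizontal_flip=True,         # flips
    vertical_flip=True,
    shear_range=10.0,             # shear, one part of an affine transform
    width_shift_range=0.1,        # translation (affine)
    height_shift_range=0.1,
    fill_mode='nearest',
)

# Smoke test on a random placeholder batch of 360x360 RGB "images".
x = np.random.rand(4, 360, 360, 3).astype('float32')
batch = next(datagen.flow(x, batch_size=4, shuffle=False))
print(batch.shape)
```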

I was even thinking of doing an ensemble with polar transformation and FFT, but that would be too much for now.

Let's first write the baseline code and then we'll add more augmentation as needed.

https://colab.research.google.com/drive/16LMsoN0y__pAEVcrqzIb9bJ8y2Fogv7_?usp=sharing
A very simple skeleton is here, but it needs to be modified a lot

So, the generator doesn't actually balance the classes. Using the generator, we can't control the number of samples per class. So we have two options:

  1. Accept the imbalance and set the model's class weights accordingly, combined with augmentation.
  2. Use a dummy training run with augmentation to generate augmented images and create a larger dataset.

I think the 2nd approach would not be good, as there is a huge imbalance and some of the classes would just be full of slightly different copies of the same images.
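For option 1, scikit-learn can compute balanced class weights to pass to `model.fit(..., class_weight=...)` in Keras; the labels below are deliberately imbalanced placeholders:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Placeholder labels with a deliberate imbalance (class 0 dominates).
y_train = np.array([0] * 900 + [1] * 80 + [2] * 20)

classes = np.unique(y_train)
weights = compute_class_weight('balanced', classes=classes, y=y_train)
class_weight = dict(zip(classes.tolist(), weights.tolist()))
print(class_weight)   # rarer classes get proportionally larger weights
# Then: model.fit(..., class_weight=class_weight)
```

The 'balanced' heuristic is n_samples / (n_classes * count_per_class), so the loss contribution of each class is equalised without duplicating any images.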

Aniket, if you have some scripts to run related to the project (say, for 100 epochs), you can give them to me.

Why? LOL

import tensorflow as tf
from tensorflow.keras.regularizers import l2

N_LABELS = 11  # number of output classes; adjust to the dataset

def Model_V2_Gradcam(H, W, C):
    input_layer = tf.keras.Input(shape=(H, W, C))
    x_1 = tf.keras.layers.Conv2D(16, 3, activation='relu', strides=(1, 1), name="conv_16_1", padding='same', kernel_initializer = 'he_normal', kernel_regularizer=l2(1e-4))(input_layer)
    x_2 = tf.keras.layers.Conv2D(16, 3, activation='relu', strides=(1, 1), name="conv_16_2", padding='same', kernel_initializer = 'he_normal', kernel_regularizer=l2(1e-4))(x_1)
    # x_4 = tf.keras.layers.Conv2D(16, 3, activation='relu', strides=(1, 1), name="conv_64_21", padding='same')(add([x_3,x_1]))
    x_3 = tf.keras.layers.MaxPooling2D((2, 2), name="max_pool3")(x_2)
    x_4 = tf.keras.layers.Conv2D(32, 3, activation='relu', strides=(1, 1), name="conv_32_1", padding='same', kernel_initializer = 'he_normal', kernel_regularizer=l2(1e-4))(x_3)
    x_5 = tf.keras.layers.Conv2D(32, 3, activation='relu', strides=(1, 1), name="conv_32_2", padding='same', kernel_initializer = 'he_normal', kernel_regularizer=l2(1e-4))(x_4)

    x_6 = tf.keras.layers.MaxPooling2D((2, 2), name="max_pool4")(x_5)
    x_7 = tf.keras.layers.Conv2D(64, 3, activation='relu', strides=(1, 1), name="conv_64_1", padding='same', kernel_initializer = 'he_normal', kernel_regularizer=l2(1e-4))(x_6)
    x_8 = tf.keras.layers.Conv2D(64, 3, activation='relu', strides=(1, 1), name="conv_64_2", padding='same', kernel_initializer = 'he_normal', kernel_regularizer=l2(1e-4))(x_7)
    x = tf.keras.layers.MaxPooling2D((2, 2), name="max_pool5")(x_8)
    x = tf.keras.layers.Conv2D(64, 3, activation='relu', strides=(2, 2), name="conv_64_3", kernel_initializer = 'he_normal', kernel_regularizer=l2(1e-4))(x)
    x = tf.keras.layers.MaxPooling2D((2, 2), name="max_pool6")(x)
    x = tf.keras.layers.Flatten(name="flatten")(x)
    x = tf.keras.layers.Dropout(0.15, name="dropout_3")(x)
    x = tf.keras.layers.Dense(256, activation='relu', name="dense_256")(x)
    x = tf.keras.layers.Dense(N_LABELS, activation='softmax', name="output_layer")(x)

    model = tf.keras.models.Model(inputs=input_layer, outputs=x)
    return model

model = Model_V2_Gradcam(H=360, W=360, C=3)

model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
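Since both models carry *Gradcam* in their names, a minimal Grad-CAM sketch with `tf.GradientTape` may be useful for later; the tiny model and random input here are stand-ins, and `last_conv` is a hypothetical layer name:

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model with one named conv layer to visualise.
inp = tf.keras.Input(shape=(64, 64, 3))
h = tf.keras.layers.Conv2D(8, 3, activation='relu', name="last_conv")(inp)
h = tf.keras.layers.GlobalAveragePooling2D()(h)
out = tf.keras.layers.Dense(5, activation='softmax')(h)
model = tf.keras.Model(inp, out)

def grad_cam(model, image, conv_layer_name, class_idx=None):
    """Heatmap showing where `conv_layer_name` supports the chosen class."""
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        if class_idx is None:
            class_idx = tf.argmax(preds[0])     # predicted class by default
        score = preds[:, class_idx]
    grads = tape.gradient(score, conv_out)       # d(score)/d(conv features)
    pooled = tf.reduce_mean(grads, axis=(0, 1, 2))   # one weight per channel
    cam = tf.reduce_sum(conv_out[0] * pooled, axis=-1)
    cam = tf.nn.relu(cam)                        # keep only positive evidence
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

heatmap = grad_cam(model, np.random.rand(64, 64, 3).astype('float32'),
                   "last_conv")
print(heatmap.shape)
```

The heatmap is normalised to [0, 1] and has the conv layer's spatial size; upsampling it to the input resolution and overlaying it on the image gives the usual Grad-CAM visualisation.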

I told you it needs multiple screening models; a single model needs to be too deep to understand the minor variations between the classes, so it ends up doing nothing.

Why? LOL

Well, it's your model

Oh, so it won't obey you 🌚

Looks like the primitive one should work better. Not tested though.

import tensorflow as tf

N_LABELS = 11  # number of output classes; adjust to the dataset

def Model_V1_Gradcam(H, W, C):
    input_layer = tf.keras.Input(shape=(H, W, C))
    x = tf.keras.layers.Conv2D(32, 3, activation='relu', strides=(2, 2), name="conv_32")(input_layer)
    x = tf.keras.layers.MaxPooling2D((2, 2), name="max_pool1")(x)
    x = tf.keras.layers.Conv2D(64, 3, activation='relu', strides=(2, 2), name="conv_64")(x)
    x = tf.keras.layers.MaxPooling2D((2, 2), name="max_pool2")(x)
    x = tf.keras.layers.Conv2D(64, 3, activation='relu', strides=(2, 2), name="conv_64_2")(x)
    x = tf.keras.layers.MaxPooling2D((2, 2), name="max_pool3")(x)
    
    x = tf.keras.layers.Flatten(name="flatten")(x)
    x = tf.keras.layers.Dense(512, activation='relu', name="dense_512")(x)
    x = tf.keras.layers.Dropout(0.5, name="dropout_1")(x)
    x = tf.keras.layers.Dense(512, activation='relu', name="dense_512_2")(x)
    x = tf.keras.layers.Dropout(0.5, name="dropout_2")(x)
    x = tf.keras.layers.Dense(128, activation='relu', name="dense_128")(x)
    x = tf.keras.layers.Dropout(0.5, name="dropout_3")(x)
    
    x = tf.keras.layers.Dense(N_LABELS, activation='softmax', name="output_layer")(x)
    #x = tf.keras.layers.Reshape((1, N_LABELS))(x)
    
    model = tf.keras.models.Model(inputs=input_layer, outputs=x)
    return model

model = Model_V1_Gradcam(H=360, W=360, C=3)

model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()