Issues with loss, accuracy, and sklearn metrics for multi-label classification (eye/retinal images)
yogeshriyat opened this issue
Hello, I am having training issues with this model. I am attempting transfer learning on medical images (eyes/retinas). The training data has 2404 images and the validation data has 1031 images. I am using Google Colab Pro with GPUs. Here is my code:
# augmentations
train_datagen = ImageDataGenerator(
    rescale=1./255.,
    rotation_range=360,
    brightness_range=[0.5, 1.5],
    zoom_range=[1, 1.2],
    zca_whitening=True,
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode='constant'
)
# initializing model
input_shape = (256, 256, 3)
dropout_rate = 0.2
number_of_classes = 3
conv_base = EfficientNetB7(weights=None, include_top=False, input_shape=input_shape)
# learning rate schedule
initial_learning_rate = 2e-6
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate,
    decay_steps=100000,
    decay_rate=0.96,
    staircase=True
)
# model
en_model = models.Sequential()
en_model.add(conv_base)
en_model.add(layers.GlobalMaxPooling2D(name='gap'))
# Avoid overfitting
en_model.add(layers.Dropout(rate=dropout_rate, name='dropout_out'))
# Set number_of_classes to the number of your final predictions
en_model.add(layers.Dense(number_of_classes, activation='sigmoid', name='fc_out'))  # replaced softmax with sigmoid
conv_base.trainable = False
en_model.compile(
    loss='binary_crossentropy',
    optimizer=optimizers.Adam(learning_rate=lr_schedule),
    metrics=['accuracy']
)
# training
history = en_model.fit(
    train_generator,
    steps_per_epoch=10,
    epochs=100,
    validation_data=val_generator,
    # validation_steps=None,
    validation_freq=1,
    verbose=1,
    callbacks=[tensorboard_callbacks],
    use_multiprocessing=True,
    workers=4
)
# history.history['loss'] holds the per-epoch training loss, not a test loss
print('Average training loss: ', np.average(history.history['loss']))
Why is the validation loss/accuracy a straight line? What could be wrong?
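As a rough sanity check on the configuration above, the sketch below works out how many optimizer steps the run performs and what learning rate the schedule yields over that span. It re-implements the staircase exponential-decay formula in plain Python instead of calling Keras. The value `batch_size = 32` is an assumption (the generator setup is not shown in the post), and the helper name is hypothetical.

```python
import math

# Values copied from the post
initial_learning_rate = 2e-6
decay_steps = 100_000
decay_rate = 0.96
steps_per_epoch = 10
epochs = 100
num_train_images = 2404

# Assumption: the batch size is not shown in the post; 32 is a placeholder.
batch_size = 32


def staircase_exponential_decay(step):
    """Staircase decay: lr0 * decay_rate ** floor(step / decay_steps)."""
    return initial_learning_rate * decay_rate ** (step // decay_steps)


total_steps = steps_per_epoch * epochs
steps_for_full_pass = math.ceil(num_train_images / batch_size)
images_per_epoch = steps_per_epoch * batch_size
final_lr = staircase_exponential_decay(total_steps)

print(f"total optimizer steps: {total_steps}")
print(f"steps for one full pass (assumed batch size): {steps_for_full_pass}")
print(f"images seen per epoch (assumed batch size): {images_per_epoch}")
print(f"learning rate after the last step: {final_lr}")
```

Under these assumptions the run takes 1,000 steps, so the first decay at step 100,000 is never reached and the rate stays at 2e-6, and each epoch covers only part of the 2,404 training images. Separately, `weights=None` combined with `conv_base.trainable = False` leaves the backbone randomly initialized and frozen, so only the final Dense layer can learn; that may be worth checking as well.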
Now for the scikit-learn metrics: accuracy, F1, precision, and recall all come out as the same number, and the second column of the confusion matrix is all zeros. What is going on?
Here are the results for one of the labels.
val_pred = EfficientNet_model_10.predict(val_generator, verbose=1)
y_true = val_generator.labels[:, 0]
y_pred = val_pred[:, 0]
y_pred = [round(x) for x in y_pred]
# y_true are the labels from the validation generator; we have three labels (DR, glaucoma, other)
print(f'Accuracy = {accuracy_score(y_true, y_pred)}')
print(f"F1 = {f1_score(y_true, y_pred, average='micro')}")
print(f"Precision = {precision_score(y_true, y_pred, average='micro')}")
print(f"Recall = {recall_score(y_true, y_pred, average='micro')}")
print('Confusion matrix =')
confusion_matrix(y_true, y_pred)
Output:
Accuracy = 0.7992240543161979
F1 = 0.7992240543161979
Precision = 0.7992240543161979
Recall = 0.7992240543161979
Confusion matrix =
array([[824,   0],
       [207,   0]])
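The identical scores can be reproduced from the confusion matrix alone. The sketch below is written in plain Python, not with scikit-learn. It pools true positives, false positives, and false negatives over both classes (0 and 1), which is what micro-averaging amounts to for a single binary label, and compares the result with the scores for the positive class only. The variable names are illustrative.

```python
# Counts taken from the confusion matrix in the post
# (rows = true class, columns = predicted class).
tn, fp = 824, 0
fn, tp = 207, 0
total = tn + fp + fn + tp

accuracy = (tp + tn) / total

# Micro-averaging pools the counts of both classes:
#   class 1: TP = tp, FP = fp, FN = fn
#   class 0: TP = tn, FP = fn, FN = fp
pooled_tp = tp + tn
pooled_fp = fp + fn
pooled_fn = fn + fp

micro_precision = pooled_tp / (pooled_tp + pooled_fp)
micro_recall = pooled_tp / (pooled_tp + pooled_fn)
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

# Positive-class-only scores; treated as 0.0 here when the denominator is 0.
pos_precision = tp / (tp + fp) if (tp + fp) else 0.0
pos_recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy        = {accuracy}")
print(f"micro precision = {micro_precision}")
print(f"micro recall    = {micro_recall}")
print(f"micro F1        = {micro_f1}")
print(f"positive-class precision = {pos_precision}")
print(f"positive-class recall    = {pos_recall}")
```

For a single binary label, every wrong prediction counts once as a false positive and once as a false negative, so micro-averaged precision, recall, and F1 all collapse to the accuracy. Scoring the positive class alone (`average='binary'` in scikit-learn) would report a recall of 0 here, since no sample was ever predicted as class 1.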
One thing I noticed is that the range of the predictions is very narrow:
print(min(val_pred[:, 0:1]))
print(max(val_pred[:, 0:1]))
[0.49519995]
[0.49520218]
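This narrow range would also account for the empty second column of the confusion matrix. A minimal sketch, using only the two extreme values printed above and plain Python floats (not the model's actual output array):

```python
# Minimum and maximum predicted probabilities reported in the post.
pred_min = 0.49519995
pred_max = 0.49520218

# For values in [0, 1], round(x) effectively applies a 0.5 cut-off.
threshold = 0.5

spread = pred_max - pred_min
labels = [round(p) for p in (pred_min, pred_max)]

print(f"spread of predictions: {spread:.2e}")
print(f"rounded labels: {labels}")
print(f"any prediction above the threshold: {pred_max > threshold}")
```

Since even the largest prediction is below 0.5, every sample is assigned class 0 regardless of the input. A spread of roughly 2e-6 suggests the output barely depends on the image at all, which points to the model itself, not to the metric code.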
Any guidance would be highly appreciated!!