Knowledge Distillation for Skin Lesion Classification
The goal of knowledge distillation is to improve the performance of a weaker model, which usually has fewer parameters, by letting it learn from a more capable model. The weaker model, called the student, extracts knowledge from the stronger model, called the teacher, by matching its predicted class distribution to the teacher's. To soften these distributions, which enter the training loss, we apply a temperature T by dividing the logits by T before the softmax. This project uses EfficientNet-B0 as the teacher and SqueezeNet v1.1 as the student. Both models are trained and evaluated on the DermaMNIST dataset from MedMNIST. The Result section compares the teacher, the student without knowledge distillation, and the student with knowledge distillation.
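The temperature scaling and distribution matching described above can be sketched in a few lines of framework-free Python. This is only an illustration under stated assumptions, not the notebook's implementation: the function names, the toy logits, `temperature=4.0`, `alpha=0.5`, and the T² scaling of the soft term are all assumed here (the T² factor is a common convention, but the project may weight its loss differently).

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence.

    temperature and alpha are illustrative values, not the project's settings.
    """
    # Hard-label term: cross-entropy of the student at T = 1.
    hard = -math.log(softmax_with_temperature(student_logits)[label])
    # Soft-target term: match the student's softened distribution to the teacher's.
    soft_teacher = softmax_with_temperature(teacher_logits, temperature)
    soft_student = softmax_with_temperature(student_logits, temperature)
    # Assumed convention: scale by T^2 so the soft term's gradients stay comparable.
    soft = temperature ** 2 * kl_divergence(soft_teacher, soft_student)
    return alpha * hard + (1.0 - alpha) * soft


# Made-up logits for a 3-class toy case (hypothetical values).
teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.5, 1.5, 0.2]

sharp = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=4.0)
loss = distillation_loss(student_logits, teacher_logits, label=0)
```

Comparing `sharp` with `soft` shows the effect of the temperature: the largest probability shrinks and the smaller ones grow, so the student also sees how the teacher ranks the non-target classes.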
Experiment
To see the distillation in action, please refer to the notebook at the following link.
Result
Quantitative Result
The quantitative results are reported in the table below.
Model | Loss | Accuracy |
---|---|---|
Teacher | 2.271 | 71.99% |
Student | 2.025 | 70.77% |
Distilled | 7.409 | 71.17% |
Accuracy and Loss Curve
Teacher
The loss curve on the train set and the validation set of the teacher model.
The accuracy curve on the train set and the validation set of the teacher model.
Student
The loss curve on the train set and the validation set of the student model.
The accuracy curve on the train set and the validation set of the student model.
Distilled
The loss curve on the train set and the validation set of the distilled model.
The accuracy curve on the train set and the validation set of the distilled model.
Overall Validation Curve
Comparison of loss curves between the teacher model, the student model, and the distilled model on the validation set.
Comparison of accuracy curves between the teacher model, the student model, and the distilled model on the validation set.
Qualitative Result
The qualitative results of the models on the test set are shown below in collated form.
Teacher
The qualitative result of the teacher model.
Student
The qualitative result of the student model.
Distilled
The qualitative result of the distilled model.