nicebro123 / DHKD

Unsupervised Image Anomaly Detection with Denoised Heterogeneous Networks via Knowledge Distillation


The code will be published after the paper is accepted.

Overall Architecture

*(Figure: overall architecture of DHKD)*

Abstract

Industrial anomaly detection methods based on knowledge distillation have made significant progress. However, their architectures remain constrained by pre-trained models, and limited generalization performance persists. To address these challenges, this paper proposes a Denoising Heterogeneous Knowledge Distillation (DHKD) algorithm for anomaly detection. Specifically, the teacher network employs invertible normalizing flows for precise probability density modeling, while the student network uses a conventional feedforward neural network, which cannot fully replicate the teacher's representation of anomalous images. This design significantly enlarges the representation gap between the teacher and student networks while freeing the knowledge distillation model from heavy reliance on pre-trained models. DHKD transforms the teacher-student network from a conventional encoder-decoder architecture into a generative one. To improve the student network's reconstruction capability, Gaussian noise is added at the feature level to simulate anomalies. The student network uses a dual-domain reconstruction module to filter out anomalous information, enhancing the model's response to normal information and promoting high-quality reconstruction. Additionally, to increase the student model's sensitivity to global image context and its understanding of spatial relationships, a teacher-student representation affinity loss is employed. This not only improves the interaction between different regions of the image but also enables the model to integrate local features with global contextual information. On the MVTec dataset, our model achieves state-of-the-art performance, with an AUROC of 99.3% for detection and 98.6% for localization. Moreover, visual results on multiple real-world datasets demonstrate that the proposed model generalizes well.
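The feature-level Gaussian noise used to simulate anomalies can be sketched as follows. This is a minimal NumPy illustration of the idea, not the released implementation; the function name `add_feature_noise` and the noise scale are our own assumptions:

```python
import numpy as np

def add_feature_noise(features, sigma=0.1, rng=None):
    """Perturb a feature map with Gaussian noise to simulate an anomaly.

    features : np.ndarray of shape (C, H, W), a normal feature map
    sigma    : standard deviation of the additive Gaussian noise
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=features.shape)
    return features + noise

# Toy usage: a 4-channel 8x8 feature map.
feats = np.ones((4, 8, 8))
noisy = add_feature_noise(feats, sigma=0.1, rng=np.random.default_rng(0))
print(noisy.shape)  # (4, 8, 8)
```

The denoising student would then be trained to reconstruct `feats` from `noisy`, so that at test time genuinely anomalous features are also mapped back toward a normal reconstruction.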

Problem statement

The goal of industrial image anomaly detection is to distinguish normal from anomalous samples and to localize the anomalies within the anomalous samples. Traditional knowledge distillation methods rely on the powerful feature extraction capabilities of pre-trained models, which enable the teacher to rapidly guide the student toward convergence. However, this reliance limits the flexibility of the teacher network architecture and significantly hinders the development of knowledge distillation models for anomaly detection. Another limitation is that the teacher and student networks often share similar or identical structures, which leads to poor generalization.

To address these issues, this paper proposes a flexibly designed heterogeneous teacher-student model. Our view is that teacher-student knowledge distillation fundamentally depends on a pronounced difference between the teacher's representations of normal and anomalous samples, while the student's objective is to produce a stable, normal representation for both. These requirements can therefore be cast as a multi-objective optimization problem:

$$ \max \; \mathrm{Diff}\big(\mathrm{NFTN}(I),\, \mathrm{NFTN}(I^\prime)\big) $$

$$ \min \; \mathrm{Diff}\big(\mathrm{DSN}(I),\, \mathrm{DSN}(I^\prime)\big) $$

where $I$ denotes a normal sample and $I^\prime$ an anomalous sample.
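The two objectives above can be illustrated with a toy NumPy sketch. The identity and constant maps below are deliberately extreme stand-ins for the teacher (NFTN) and student (DSN); `diff` is assumed here to be a mean-squared difference, which the paper does not specify:

```python
import numpy as np

def diff(a, b):
    # Mean-squared difference between two feature representations.
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
normal = rng.normal(size=(64,))                         # features of a normal sample I
anomalous = normal + rng.normal(0.5, 0.2, size=(64,))   # simulated anomalous sample I'

# Extreme stand-ins for the two networks: the teacher should separate
# I and I', while the student should map both to a stable normal code.
teacher = lambda x: x                  # identity: preserves the I vs I' gap
student = lambda x: np.zeros_like(x)   # constant: collapses the gap entirely

teacher_gap = diff(teacher(normal), teacher(anomalous))  # to be maximized
student_gap = diff(student(normal), student(anomalous))  # to be minimized
print(teacher_gap > student_gap)  # True
```

In DHKD the teacher achieves a large `teacher_gap` via flow-based density modeling, while the student's denoising reconstruction drives `student_gap` toward zero.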

Results

Image-level anomaly detection AUROC (%) on the MVTec AD dataset:

| Category | ARNet | DRAEM | DifferNet | PaDiM | CFLOW | CS-Flow | STPM | RKD | Ours |
|------------|-------|-------|-----------|-------|-------|---------|------|------|------|
| Grid | 88.3 | 99.9 | 84.0 | 97.3 | 99.6 | 99.0 | 100 | 100 | 99.8 |
| Leather | 86.2 | 100 | 97.1 | 99.2 | 100 | 99.9 | 100 | 100 | 100 |
| Tile | 73.5 | 99.6 | 99.4 | 94.1 | 99.9 | 100 | 95.5 | 99.3 | 99.3 |
| Carpet | 70.6 | 97.8 | 92.9 | 99.1 | 98.7 | 100 | 98.9 | 98.9 | 99.0 |
| Wood | 92.3 | 99.1 | 99.8 | 94.9 | 99.1 | 100 | 99.2 | 99.3 | 100 |
| Bottle | 94.1 | 99.2 | 99.0 | 98.3 | 100 | 99.8 | 100 | 100 | 99.7 |
| Capsule | 68.1 | 98.5 | 86.9 | 98.5 | 97.7 | 97.1 | 88.0 | 96.3 | 98.2 |
| Pill | 78.6 | 98.9 | 88.8 | 95.7 | 96.8 | 98.6 | 93.8 | 96.6 | 98.6 |
| Transistor | 84.3 | 93.1 | 91.1 | 97.5 | 95.2 | 99.3 | 93.7 | 96.7 | 99.3 |
| Zipper | 87.6 | 100 | 95.1 | 98.5 | 98.5 | 99.7 | 93.6 | 98.5 | 98.9 |
| Cable | 83.2 | 91.8 | 95.9 | 96.7 | 97.6 | 99.1 | 92.3 | 95.0 | 98.7 |
| Hazelnut | 85.5 | 100 | 99.3 | 98.2 | 100 | 99.6 | 100 | 99.9 | 100 |
| Metal nut | 66.7 | 98.7 | 96.1 | 97.2 | 99.3 | 99.1 | 100 | 100 | 100 |
| Screw | 100 | 93.9 | 96.3 | 98.5 | 91.9 | 97.6 | 88.2 | 97.0 | 98.8 |
| Toothbrush | 100 | 100 | 98.6 | 98.8 | 99.7 | 91.9 | 87.8 | 99.5 | 99.6 |
| Average | 83.9 | 98.0 | 94.7 | 97.5 | 98.3 | 98.7 | 95.4 | 98.2 | 99.3 |
Pixel-level anomaly localization AUROC (%) on the MVTec AD dataset:

| Category | MKD | SPADE | PaDiM | RIAD | CutPaste | IKD | RKD | Ours |
|------------|------|-------|-------|------|----------|------|------|------|
| Grid | 91.8 | 93.7 | 97.3 | 99.8 | 97.5 | 97.0 | 99.3 | 98.1 |
| Leather | 98.1 | 97.6 | 99.2 | 94.4 | 99.5 | 98.5 | 99.4 | 99.9 |
| Tile | 82.8 | 87.4 | 94.1 | 89.1 | 90.5 | 95.7 | 95.6 | 97.4 |
| Carpet | 95.6 | 97.5 | 99.1 | 96.3 | 98.3 | 98.7 | 98.9 | 98.3 |
| Wood | 84.8 | 88.5 | 94.9 | 85.8 | 95.5 | 93.9 | 95.3 | 97.5 |
| Bottle | 96.3 | 98.4 | 98.3 | 98.4 | 97.6 | 98.9 | 98.7 | 99.1 |
| Capsule | 95.9 | 99.0 | 98.5 | 92.8 | 92.8 | 98.5 | 98.7 | 98.2 |
| Pill | 89.6 | 96.5 | 95.7 | 95.7 | 95.8 | 98.8 | 98.2 | 97.9 |
| Transistor | 76.5 | 94.1 | 97.5 | 87.7 | 95.5 | 97.1 | 92.5 | 99.3 |
| Zipper | 93.9 | 96.5 | 98.5 | 97.8 | 99.3 | 97.6 | 98.2 | 99.2 |
| Cable | 82.4 | 97.2 | 96.7 | 84.2 | 84.2 | 98.0 | 97.4 | 98.9 |
| Hazelnut | 94.6 | 99.1 | 98.2 | 96.1 | 99.6 | 98.7 | 99.6 | 99.5 |
| Metal nut | 86.4 | 98.1 | 97.2 | 92.5 | 93.1 | 98.3 | 97.3 | 99.5 |
| Screw | 96.0 | 98.9 | 98.5 | 98.8 | 96.7 | 98.6 | 99.6 | 98.4 |
| Toothbrush | 96.1 | 97.9 | 98.8 | 99.7 | 98.1 | 98.6 | 99.1 | 98.9 |
| Average | 90.7 | 97.5 | 97.5 | 94.2 | 96.0 | 97.8 | 97.8 | 98.6 |


Visualization

*(Figures: qualitative anomaly localization results)*

Citation

@article{tong2023two,
  title={Two-stage reverse knowledge distillation incorporated and Self-Supervised Masking strategy for industrial anomaly detection},
  author={Tong, Guoxiang and Li, Quanquan and Song, Yan},
  journal={Knowledge-Based Systems},
  volume={273},
  pages={110611},
  year={2023},
  publisher={Elsevier}
}
@article{tong2024enhanced,
  title={Enhanced multi-scale features mutual mapping fusion based on reverse knowledge distillation for industrial anomaly detection and localization},
  author={Tong, Guoxiang and Li, Quanquan and Song, Yan},
  journal={IEEE Transactions on Big Data},
  year={2024},
  publisher={IEEE}
}

License

This project is licensed under the Apache-2.0 License.
