Necolizer / Anomaly-Detection-Using-KolektorSDD-Dataset

Reorganize the KolektorSDD dataset into the MVTecAD dataset format and report the results of SOTA anomaly detection models on KolektorSDD.

1. Purpose

I wrote this code during an internship. The project runs a selection of SOTA industrial anomaly detection models (mostly taken from the MVTecAD leaderboard) on the KolektorSDD dataset. Due to various restrictions, only the KolektorSDD preprocessing code and the results I reproduced are published here; the modified training/testing code for those models is NOT uploaded. The preprocessing script converts KolektorSDD into the MVTecAD dataset format, so the open-source SOTA code or Anomalib can be applied with their MVTecAD configurations and only slight modifications. With the preprocessed dataset, a full reproduction should not be difficult. The results I reproduced are reported below for comparison.

2. Usage

  1. Download the KolektorSDD dataset with fine annotations from https://www.vicos.si/resources/kolektorsdd/
    • Please cite it according to their requirements
  2. Unzip the KolektorSDD file
  3. Git clone this repo
  4. Create a new conda virtual environment from anomalib_env.yaml
conda env create -f anomalib_env.yaml
  5. Modify the paths in KolektorSDD_Preprocess.py
# Args that you need to change:
# @ read_base : Path to the downloaded KolektorSDD dataset
# @ save_base : Path to the directory where the reorganized dataset will be saved
read_base = r'.\KolektorSDD'
save_base = r'.\KolektorSDD1'
  6. Run the script
python KolektorSDD_Preprocess.py
  7. The final directory tree of the reorganized KolektorSDD should look like this:
save_base
└── metal
    ├── ground_truth
    |    └── defect
    |         ├── 000_mask.png
    |         ├── 001_mask.png
    |         ├── ...
    ├── test
    |    ├── defect
    |    |    ├── 000.png
    |    |    ├── 001.png
    |    |    ├── ...
    |    └── good
    |         ├── 000.png
    |         ├── 001.png
    |         ├── ...
    └── train
         └── good
             ├── 000.png
             ├── 001.png
             ├── ...
  8. Modify the code configurations and run your training and testing scripts
    • See Section 4 for the open-source code of the SOTA models used in my experiments
    • With the preprocessed KolektorSDD dataset, the results can be reproduced with only slight modifications to the SOTA code or Anomalib, using their MVTecAD configurations.
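
Before pointing a training script at the reorganized dataset, it can help to confirm that the folder really matches the tree shown above. The following is a minimal sketch and not part of this repository: the function name `check_mvtec_layout` is hypothetical, and the example path assumes the default `save_base` of `KolektorSDD1` with the `metal` category folder.

```python
from pathlib import Path


def check_mvtec_layout(category_dir):
    """Return a list of problems found in an MVTecAD-style category folder.

    Hypothetical helper for illustration; it only checks the layout
    shown in the directory tree of this README.
    """
    root = Path(category_dir)
    problems = []

    # The four sub-directories that appear in the expected tree.
    for sub in ("train/good", "test/good", "test/defect", "ground_truth/defect"):
        if not (root / sub).is_dir():
            problems.append(f"missing directory: {sub}")

    # Every defective test image NNN.png should have a mask NNN_mask.png.
    for img in sorted((root / "test" / "defect").glob("*.png")):
        mask = root / "ground_truth" / "defect" / f"{img.stem}_mask.png"
        if not mask.is_file():
            problems.append(f"missing mask for {img.name}")

    return problems


if __name__ == "__main__":
    # Assumed location: default save_base plus the 'metal' category.
    for problem in check_mvtec_layout(Path("KolektorSDD1") / "metal"):
        print(problem)
```

An empty result means the layout matches the tree; any returned message names the directory or mask that is missing.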

3. Illustrations of KolektorSDD Preprocessing

Overall idea: to use SOTA code or Anomalib with MVTecAD configurations, the KolektorSDD dataset is reorganized into the MVTecAD dataset format.

Steps:

  1. Convert JPG/BMP to PNG
  2. Resize all images to the same size (500x1240)
  3. Train-test split
    • Train
      • Flawless samples for training: 295
    • Test
      • Flawless samples for testing: 52
      • Anomalies for testing: 52

ATTENTION:

  • I resize all images directly to the same size (500x1240), which may distort the masks of some small defects. If time permits, this could be replaced by cropping a square region around each defect.
  • Because of the small number of samples, I do NOT reserve a validation set; the data is split into training and testing only. The script also fixes the number of flawless test samples to equal the number of anomalies. Modify the code if you need a different split.
  • The seed for random.shuffle() is NOT fixed, so each run of the preprocessing script assigns different samples to the training and test sets.
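
If a repeatable split is needed, the shuffle can be driven by a seeded generator. The sketch below is an illustration under stated assumptions, not the code in KolektorSDD_Preprocess.py: the function name `split_good_samples`, the `seed` parameter, and the string sample identifiers are all hypothetical. The counts follow the split above (295 + 52 = 347 flawless samples, 52 anomalies).

```python
import random


def split_good_samples(good_ids, n_defect, seed=0):
    """Split flawless samples into (train_good, test_good).

    As many flawless samples as there are anomalies are held out for
    testing; the rest are used for training. Hypothetical helper.
    """
    rng = random.Random(seed)    # private generator, global random state is untouched
    shuffled = sorted(good_ids)  # sort first so the input order cannot change the result
    rng.shuffle(shuffled)
    return shuffled[n_defect:], shuffled[:n_defect]


# Placeholder identifiers standing in for the 347 flawless images.
good_ids = [f"good_{i:03d}" for i in range(347)]
train_good, test_good = split_good_samples(good_ids, n_defect=52, seed=0)
print(len(train_good), len(test_good))  # 295 52
```

Using a dedicated `random.Random(seed)` instance keeps the split stable across runs without affecting any other code that relies on the global `random` state.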

4. Experimental Results

SOTA models chosen for training and testing (this list also serves as acknowledgements):

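| Version | Method | Backbone | Avg DET AUC (image ROCAUC) | Avg SEG AUC (pixel ROCAUC) | Pixel PROAUC |
|---|---|---|---|---|---|
| Official Code | PatchCore | wide_resnet50_2 | 0.909 | 0.941 | / |
| Official Code | CFA | wrn50_2 | 0.939 | 0.939 | 0.823 |
| Official Code | Cflow-AD | wide_resnet50_2 | 0.801 | 0.891 | 0.497 |
| Unofficial Code | FastFlow | cait_m48_448 | 0.955 | 0.960 | / |
| Anomalib | PatchCore | wide_resnet50_2 | 0.863 | 0.840 | / |
| Anomalib | FastFlow | resnet18 | 0.807 | 0.883 | / |
| Anomalib | FastFlow | cait_m48_448 | 0.914 | 0.963 | / |
| Anomalib | Cflow-AD | wide_resnet50_2 | 0.850 | 0.847 | / |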
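
For readers unfamiliar with the metrics: image ROCAUC uses one anomaly score per test image with the good/defect label, while pixel ROCAUC treats every pixel as a sample and uses the ground-truth mask as the label. The sketch below shows how such an AUC can be computed through the rank-sum (Mann-Whitney U) identity. It is a pure-Python illustration with a hypothetical function name `roc_auc`, not the evaluation code of any of the models listed; in practice the codebases typically rely on a library routine such as scikit-learn's `roc_auc_score`.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum identity.

    labels: 1 for anomalous, 0 for good.
    scores: higher means more anomalous.
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    i = 0
    while i < len(pairs):
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Image-level example: four images, two of them anomalous.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

For the pixel-level variant, the same function would be applied to the flattened anomaly map and the flattened binary mask of all test images.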

5. Sample Visualizations

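  • PatchCore (wide_resnet50_2)

[Figure: PatchCore sample visualization]

  • FastFlow (cait_m48_448)

[Figure: FastFlow sample visualization]

  • Cflow-AD (wide_resnet50_2)

[Figure: Cflow-AD sample visualization]

  • CFA (wrn50_2)

[Figure: CFA sample visualization]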

6. Change Log

  • [2022/12/29] Provide conda virtual environment dependency list in anomalib_env.yaml.
  • [2022/12/28] Create repository and release preprocessing script.

7. License

MIT
