Multiple Instance Learning (MIL) methods are mainstream approaches for pathological image classification and analysis (a minimal sketch of an attention-based MIL method is given after the list of issues below).
The CAMELYON-16/17 datasets are commonly used to evaluate MIL methods.
However, they have the following issues:
- The CAMELYON-16/17 datasets contain a number of problematic slides.
- The pixel-level annotations of the CAMELYON-16/17 test data are not accurate enough.
- Different MIL methods do not share a unified dataset split or a common set of evaluation metrics on the CAMELYON datasets.

To conclude, there is no BENCHMARK for MIL methods.
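As background on the methods being benchmarked, here is a minimal sketch of an attention-based MIL classifier in the spirit of ABMIL: each slide is treated as a bag of pre-extracted patch features, attention weights pool them into a slide-level representation, and a linear head predicts the slide label. The feature dimension, layer sizes, and class count below are illustrative assumptions, not the configuration of any specific benchmarked method.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Minimal attention-based MIL head: patch features of one slide -> slide-level logits."""

    def __init__(self, feat_dim=1024, hidden_dim=256, n_classes=2):  # illustrative sizes
        super().__init__()
        self.attention = nn.Sequential(      # scores each patch feature
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, bag):                  # bag: (num_patches, feat_dim)
        attn = torch.softmax(self.attention(bag), dim=0)  # (num_patches, 1), sums to 1
        slide_feat = (attn * bag).sum(dim=0)              # attention-weighted pooling
        return self.classifier(slide_feat), attn          # slide logits + patch weights

# Example: one slide represented by 500 pre-extracted 1024-d patch features.
logits, attn = AttentionMIL()(torch.randn(500, 1024))
```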
**What do we do in this work?**
We do the following to establish the CAMELYON+ BENCHMARK:
- Remove the problematic slides.
- Correct the problematic annotations.
- Merge the corrected versions of the **CAMELYON-16/17** datasets into the CAMELYON+ dataset.
- Evaluate mainstream MIL methods on the CAMELYON+ dataset.
- Evaluate mainstream feature extractors on the CAMELYON+ dataset (a patch-feature-extraction sketch is given at the end of this section).
- Use a more comprehensive set of evaluation metrics to assess the different methods, as shown in the sketch after this list.
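To make the broader metric set concrete, the sketch below scores binary slide-level predictions with several complementary metrics via scikit-learn. The specific metric list and the fixed 0.5 threshold are illustrative assumptions and may not match the benchmark's exact protocol.

```python
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             cohen_kappa_score, f1_score, roc_auc_score)

def evaluate_slide_predictions(y_true, y_prob, threshold=0.5):
    """Score binary slide-level predictions with a set of complementary metrics."""
    y_pred = [int(p >= threshold) for p in y_prob]
    return {
        "auc": roc_auc_score(y_true, y_prob),               # threshold-free ranking quality
        "accuracy": accuracy_score(y_true, y_pred),
        "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),  # robust to class imbalance
        "f1": f1_score(y_true, y_pred),
        "cohen_kappa": cohen_kappa_score(y_true, y_pred),   # agreement beyond chance
    }

# Example with dummy labels and predicted probabilities.
print(evaluate_slide_predictions([0, 1, 1, 0], [0.2, 0.8, 0.6, 0.4]))
```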
In summary, we establish a new CAMELYON+ BENCHMARK.
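For reference, the sketch below shows one way per-slide patch features could be extracted before MIL training, using an ImageNet-pretrained ResNet-50 from torchvision with its classification head removed. The backbone choice, preprocessing, and the `extract_patch_features` helper are illustrative assumptions; the feature extractors actually benchmarked (including pathology-specific pretrained models) are configured differently.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# ImageNet-pretrained ResNet-50 with the classifier removed -> 2048-d patch embeddings.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # standard ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_patch_features(patch_paths):
    """Stack embeddings of one slide's patch images into a (num_patches, 2048) tensor."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in patch_paths])
    return backbone(batch)
```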