
DOI: [10.1007/978-3-031-17976-1_4](https://doi.org/10.1007/978-3-031-17976-1_4)

# cRedAnno 🤏

Considerably Reducing Annotation Need in Self-Explanatory Models


## Method illustration

*(Figure: cRedAnno_plus_Intro_dark — method overview)*

## Performance overview

- Comparison between cRedAnno+ and cRedAnno:

  *(Figure: anno_reduce)*
- Prediction accuracy (%) of nodule attributes and malignancy
  (nodule attributes: Sub = subtlety, Cal = calcification, Sph = sphericity, Mar = margin, Lob = lobulation, Spi = spiculation, Tex = texture):

  |                        | Sub         | Cal         | Sph         | Mar         | Lob         | Spi         | Tex         | Malignancy  |
  |------------------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
  | **Full annotation**    |             |             |             |             |             |             |             |             |
  | cRedAnno+              | 96.32 ±0.61 | 95.88 ±0.15 | 97.23 ±0.20 | 96.23 ±0.23 | 93.93 ±0.87 | 94.06 ±0.60 | 97.01 ±0.26 | 87.56 ±0.61 |
  | **Partial annotation** |             |             |             |             |             |             |             |             |
  | cRedAnno (10%)         | 96.06 ±2.02 | 93.76 ±0.85 | 95.97 ±0.69 | 94.37 ±0.79 | 93.06 ±0.27 | 93.15 ±0.33 | 95.49 ±0.85 | 86.65 ±1.39 |
  | cRedAnno+ (10%)        | 96.23 ±0.45 | 92.72 ±1.66 | 95.71 ±0.47 | 90.03 ±3.68 | 93.89 ±1.41 | 93.67 ±0.64 | 92.41 ±1.05 | 87.86 ±1.99 |
  | cRedAnno (1%)          | 93.98 ±2.09 | 89.68 ±3.52 | 94.02 ±2.30 | 91.94 ±1.17 | 91.03 ±1.72 | 90.81 ±1.56 | 93.63 ±0.47 | 80.02 ±8.56 |
  | cRedAnno+ (1%)         | 95.84 ±0.34 | 92.67 ±1.24 | 95.97 ±0.45 | 91.03 ±4.65 | 93.54 ±0.87 | 92.72 ±1.19 | 92.67 ±1.50 | 86.22 ±2.51 |

## Usage instructions

### Dependencies

Create an environment from the `environment.yml` file:

```bash
conda env create -f environment.yml
```

Then install [pylidc](https://pylidc.github.io/) for dataset pre-processing.
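
pylidc reads the location of the downloaded LIDC-IDRI DICOM data from a small configuration file (`~/.pylidcrc` on Linux/macOS, `pylidc.conf` in the user folder on Windows). A minimal example, where the path is a placeholder for your local copy:

```ini
# ~/.pylidcrc — tells pylidc where the LIDC-IDRI DICOM folders live
[dicom]
path = /path/to/LIDC-IDRI
warn = True
```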

### Data pre-processing

Use `extract_LIDC_IDRI_nodules.py` to extract nodule slices.
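
For orientation, the sketch below (illustrative, not part of the extraction script) shows roughly how the pylidc API exposes the seven nodule attributes and the malignancy label reported in the table above:

```python
# Illustrative sketch of the pylidc API (not part of this repo).
# Requires ~/.pylidcrc to point at the LIDC-IDRI data (see above).
import pylidc as pl

scan = pl.query(pl.Scan).filter(pl.Scan.patient_id == "LIDC-IDRI-0001").first()

# Group the up-to-four radiologists' annotations per physical nodule.
for nodule_anns in scan.cluster_annotations():
    for ann in nodule_anns:
        # Each attribute is an ordinal radiologist rating (e.g. malignancy in 1..5).
        print(ann.subtlety, ann.calcification, ann.sphericity, ann.margin,
              ann.lobulation, ann.spiculation, ann.texture, ann.malignancy)
```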

### Training

#### 1. Unsupervised feature extraction

Following [DINO](https://github.com/facebookresearch/dino), train on the extracted nodules:

```bash
python -m torch.distributed.launch --nproc_per_node=2 main_dino.py \
  --arch vit_small \
  --data_path /path_to_extracted_dir/Image/train \
  --output_dir ./logs/vits16_pretrain_full_2d_ann \
  --epochs 300
```

The reported results start from the ImageNet-pretrained full ViT-S/16 weights provided by DINO, which should be placed under `./logs/vits16_pretrain_full_2d_ann/`.
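
Assuming the intended starting point is the full ViT-S/16 checkpoint published in the DINO repository (backbone, head, and optimizer state), it can be fetched as follows; the target filename is an assumption based on the resume behaviour of `main_dino.py` and the commands below:

```bash
# Assumption: DINO's full ViT-S/16 checkpoint is the intended starting point.
mkdir -p ./logs/vits16_pretrain_full_2d_ann
wget -O ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
  https://dl.fbaipublicfiles.com/dino/dino_deitsmall16_pretrain/dino_deitsmall16_pretrain_full_checkpoint.pth
```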

#### 2. Semi-supervised prediction

- Sparse seeding:

  ```bash
  python eval_linear_joint_recycle.py \
    --pretrained_weights ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
    --data_path /path_to_extracted_dir \
    --output_dir ./logs/vits16_pretrain_full_2d_ann \
    --label_frac 0.01 --lr 0.0005 --seed 42 --mode seed
  ```

- Semi-supervised active learning:

  ```bash
  python eval_linear_joint_recycle.py \
    --pretrained_weights ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
    --data_path /path_to_extracted_dir \
    --output_dir ./logs/vits16_pretrain_full_2d_ann \
    --label_frac 0.1 --lr 0.0005 --seed 42 --mode boost
  ```
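
Both commands take `--label_frac`, the fraction of training annotations made available to the model. As a rough mental model only (the actual sampling logic lives in `eval_linear_joint_recycle.py` and may differ, e.g. in stratification), this is the kind of subsampling it implies:

```python
# Illustrative only: one plausible reading of --label_frac.
# The repo's actual sampling strategy may differ.
import random

def subsample_annotations(indices, label_frac, seed=42):
    """Keep a random fraction of the annotated training samples."""
    rng = random.Random(seed)
    k = max(1, int(len(indices) * label_frac))
    return rng.sample(indices, k)

train_indices = list(range(10000))  # hypothetical number of training nodules
labelled = subsample_annotations(train_indices, label_frac=0.01)
print(len(labelled))  # -> 100 annotated samples used
```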
<details>
<summary>v1 (click to expand)</summary>

To train the predictors:

```bash
python eval_linear_joint.py \
  --pretrained_weights ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
  --data_path /path_to_extracted_dir \
  --output_dir ./logs/vits16_pretrain_full_2d_ann \
  --label_frac 0.01
```

or use the k-NN classifiers:

```bash
python eval_knn_joint.py \
  --pretrained_weights ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
  --data_path /path_to_extracted_dir \
  --output_dir ./logs/vits16_pretrain_full_2d_ann \
  --label_frac 0.01
```

In both cases, `--label_frac` controls the fraction of annotations used.

The results are saved as `pred_results_*.csv` files under the specified `--output_dir`.

</details>
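
The performance overview above reports mean ± standard deviation over repeated runs. Assuming one `pred_results_*.csv` per run with one accuracy column per prediction target (the file layout and column names here are hypothetical, not the repo's documented format), such a summary could be computed with:

```python
# Sketch: aggregate repeated runs into mean/std, as in the table above.
# The file layout and column names are assumptions.
import glob
import pandas as pd

files = glob.glob("./logs/vits16_pretrain_full_2d_ann/pred_results_*.csv")
runs = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

# Mean and standard deviation over runs for each target column.
print(runs.agg(["mean", "std"]).round(2))
```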

## Code reference

Our code is adapted from [DINO](https://github.com/facebookresearch/dino).

## License

Apache License 2.0

