
DOI: [10.1007/978-3-031-17976-1_4](https://doi.org/10.1007/978-3-031-17976-1_4)

# cRedAnno 🤏

Considerably Reducing Annotation Need in Self-Explanatory Models


## Method illustration

*(Figure: cRedAnno_plus_Intro_dark — method overview)*

## Performance overview

- Comparison between cRedAnno+ and cRedAnno:

  *(Figure: anno_reduce)*
- Prediction accuracy (%) of nodule attributes and malignancy
  (nodule attributes: Sub = subtlety, Cal = calcification, Sph = sphericity, Mar = margin, Lob = lobulation, Spi = spiculation, Tex = texture):

  |                        | Sub         | Cal         | Sph         | Mar         | Lob         | Spi         | Tex         | Malignancy  |
  |------------------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
  | **Full annotation**    |             |             |             |             |             |             |             |             |
  | cRedAnno+              | 96.32 ±0.61 | 95.88 ±0.15 | 97.23 ±0.20 | 96.23 ±0.23 | 93.93 ±0.87 | 94.06 ±0.60 | 97.01 ±0.26 | 87.56 ±0.61 |
  | **Partial annotation** |             |             |             |             |             |             |             |             |
  | cRedAnno (10%)         | 96.06 ±2.02 | 93.76 ±0.85 | 95.97 ±0.69 | 94.37 ±0.79 | 93.06 ±0.27 | 93.15 ±0.33 | 95.49 ±0.85 | 86.65 ±1.39 |
  | cRedAnno+ (10%)        | 96.23 ±0.45 | 92.72 ±1.66 | 95.71 ±0.47 | 90.03 ±3.68 | 93.89 ±1.41 | 93.67 ±0.64 | 92.41 ±1.05 | 87.86 ±1.99 |
  | cRedAnno (1%)          | 93.98 ±2.09 | 89.68 ±3.52 | 94.02 ±2.30 | 91.94 ±1.17 | 91.03 ±1.72 | 90.81 ±1.56 | 93.63 ±0.47 | 80.02 ±8.56 |
  | cRedAnno+ (1%)         | 95.84 ±0.34 | 92.67 ±1.24 | 95.97 ±0.45 | 91.03 ±4.65 | 93.54 ±0.87 | 92.72 ±1.19 | 92.67 ±1.50 | 86.22 ±2.51 |

## Usage instructions

### Dependencies

Create an environment from the `environment.yml` file:

```bash
conda env create -f environment.yml
```

Then install [pylidc](https://pylidc.github.io/) for dataset pre-processing.
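
pylidc reads the location of the downloaded LIDC-IDRI DICOM data from a small configuration file (`~/.pylidcrc` on Linux/macOS, `pylidc.conf` in the user folder on Windows). A minimal example, where the path is a placeholder for your local copy:

```ini
# ~/.pylidcrc — tells pylidc where the LIDC-IDRI DICOM folders live
[dicom]
path = /path/to/LIDC-IDRI
warn = True
```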

### Data pre-processing

Use `extract_LIDC_IDRI_nodules.py` to extract nodule slices.
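
For orientation, the sketch below (illustrative, not part of the extraction script) shows roughly how the pylidc API exposes the seven nodule attributes and the malignancy label reported in the table above:

```python
# Illustrative sketch of the pylidc API (not part of this repo).
# Requires ~/.pylidcrc to point at the LIDC-IDRI data (see above).
import pylidc as pl

scan = pl.query(pl.Scan).filter(pl.Scan.patient_id == "LIDC-IDRI-0001").first()

# Group the up-to-four radiologists' annotations per physical nodule.
for nodule_anns in scan.cluster_annotations():
    for ann in nodule_anns:
        # Each attribute is an ordinal radiologist rating (e.g. malignancy in 1..5).
        print(ann.subtlety, ann.calcification, ann.sphericity, ann.margin,
              ann.lobulation, ann.spiculation, ann.texture, ann.malignancy)
```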

### Training

#### 1. Unsupervised feature extraction

Following [DINO](https://github.com/facebookresearch/dino), train on the extracted nodules:

```bash
python -m torch.distributed.launch --nproc_per_node=2 main_dino.py \
  --arch vit_small \
  --data_path /path_to_extracted_dir/Image/train \
  --output_dir ./logs/vits16_pretrain_full_2d_ann \
  --epochs 300
```

The reported results start from the ImageNet-pretrained full ViT-S/16 weights provided by DINO, which should be placed under `./logs/vits16_pretrain_full_2d_ann/`.
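
Assuming the intended starting point is the full ViT-S/16 checkpoint published in the DINO repository (backbone, head, and optimizer state), it can be fetched as follows; the target filename is an assumption based on the resume behaviour of `main_dino.py` and the commands below:

```bash
# Assumption: DINO's full ViT-S/16 checkpoint is the intended starting point.
mkdir -p ./logs/vits16_pretrain_full_2d_ann
wget -O ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
  https://dl.fbaipublicfiles.com/dino/dino_deitsmall16_pretrain/dino_deitsmall16_pretrain_full_checkpoint.pth
```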

#### 2. Semi-supervised prediction

- Sparse seeding:

  ```bash
  python eval_linear_joint_recycle.py \
    --pretrained_weights ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
    --data_path /path_to_extracted_dir \
    --output_dir ./logs/vits16_pretrain_full_2d_ann \
    --label_frac 0.01 --lr 0.0005 --seed 42 --mode seed
  ```

- Semi-supervised active learning:

  ```bash
  python eval_linear_joint_recycle.py \
    --pretrained_weights ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
    --data_path /path_to_extracted_dir \
    --output_dir ./logs/vits16_pretrain_full_2d_ann \
    --label_frac 0.1 --lr 0.0005 --seed 42 --mode boost
  ```
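
Both commands take `--label_frac`, the fraction of training annotations made available to the model. As a rough mental model only (the actual sampling logic lives in `eval_linear_joint_recycle.py` and may differ, e.g. in stratification), this is the kind of subsampling it implies:

```python
# Illustrative only: one plausible reading of --label_frac.
# The repo's actual sampling strategy may differ.
import random

def subsample_annotations(indices, label_frac, seed=42):
    """Keep a random fraction of the annotated training samples."""
    rng = random.Random(seed)
    k = max(1, int(len(indices) * label_frac))
    return rng.sample(indices, k)

train_indices = list(range(10000))  # hypothetical number of training nodules
labelled = subsample_annotations(train_indices, label_frac=0.01)
print(len(labelled))  # -> 100 annotated samples used
```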
<details>
<summary>v1 (click to expand)</summary>

To train the predictors:

```bash
python eval_linear_joint.py \
  --pretrained_weights ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
  --data_path /path_to_extracted_dir \
  --output_dir ./logs/vits16_pretrain_full_2d_ann \
  --label_frac 0.01
```

or use the k-NN classifiers:

```bash
python eval_knn_joint.py \
  --pretrained_weights ./logs/vits16_pretrain_full_2d_ann/checkpoint.pth \
  --data_path /path_to_extracted_dir \
  --output_dir ./logs/vits16_pretrain_full_2d_ann \
  --label_frac 0.01
```

In both cases, `--label_frac` controls the fraction of annotations used.

The results are saved as `pred_results_*.csv` files under the specified `--output_dir`.

</details>
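
The performance overview above reports mean ± standard deviation over repeated runs. Assuming one `pred_results_*.csv` per run with one accuracy column per prediction target (the file layout and column names here are hypothetical, not the repo's documented format), such a summary could be computed with:

```python
# Sketch: aggregate repeated runs into mean/std, as in the table above.
# The file layout and column names are assumptions.
import glob
import pandas as pd

files = glob.glob("./logs/vits16_pretrain_full_2d_ann/pred_results_*.csv")
runs = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

# Mean and standard deviation over runs for each target column.
print(runs.agg(["mean", "std"]).round(2))
```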

## Code reference

Our code is adapted from [DINO](https://github.com/facebookresearch/dino).

## License

Apache License 2.0

