AKASH2907 / pi-consistency-activity-detection

End-to-End Semi-Supervised Learning for Video Action Detection [CVPR 2022]

This is the official implementation of our paper, End-to-End Semi-Supervised Learning for Video Action Detection, published at CVPR 2022.

[Figure: framework]

Train instructions

Use the following commands to train with variance maps and gradient maps, respectively:

python main.py --epochs 100 --bs 8 --loc_loss dice --lr 1e-4 \
 --pkl_file_label train_annots_20_labeled.pkl \
 --pkl_file_unlabel train_annots_80_unlabeled.pkl \
 --wt_loc 1 --wt_cls 1 --wt_cons 0.1 \
 --const_loss l2 \
 --bv --n_frames 5 --thresh_epoch 11 \
 --exp_id cyclic_variance_maps

python main.py --epochs 100 --bs 8 --loc_loss dice --lr 1e-4 \
 --pkl_file_label train_annots_20_labeled.pkl \
 --pkl_file_unlabel train_annots_80_unlabeled.pkl \
 --wt_loc 1 --wt_cls 1 --wt_cons 0.1 \
 --const_loss l2 \
 --gv \
 --exp_id gradient_maps

Parameter explanations:

  • bv - Temporal Variance Attentive Mask
  • gv - Gradient Smoothness Attentive Mask
  • wt_loc - Weight for localization loss
  • wt_cls - Weight for classification loss
  • wt_cons - Weight for consistency loss
  • exp_id - Experiment id to set the folder name for saving checkpoints
  • pkl_file_label - Labeled subset
  • pkl_file_unlabel - Unlabeled subset
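To make the role of the three weights concrete, here is a minimal sketch of how a weighted objective of this kind could be assembled. It is not the repository's code: the function names (dice_loss, l2_consistency, total_loss) are hypothetical, it uses plain Python lists instead of tensors, and it assumes the consistency term compares predictions for two views of the same clip. The default weights simply mirror the values passed in the training commands.

```python
# Illustrative sketch only: plain-Python stand-ins for the loss terms named by
# the command-line flags. All function names here are hypothetical.


def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between two flat sequences of values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def l2_consistency(pred_a, pred_b):
    """Mean squared difference between two predictions of equal length.

    Assumption: pred_a and pred_b are predictions for two views of one clip.
    """
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)


def total_loss(loc, cls, cons, wt_loc=1.0, wt_cls=1.0, wt_cons=0.1):
    """Weighted sum of localization, classification, and consistency terms."""
    return wt_loc * loc + wt_cls * cls + wt_cons * cons


if __name__ == "__main__":
    # Toy values, chosen only to exercise the functions.
    mask_pred = [0.9, 0.1, 0.8, 0.7]
    mask_true = [1.0, 0.0, 1.0, 1.0]
    mask_pred_view2 = [0.8, 0.2, 0.9, 0.6]

    loc = dice_loss(mask_pred, mask_true)
    cons = l2_consistency(mask_pred, mask_pred_view2)
    cls = 0.5  # placeholder classification loss
    print(total_loss(loc, cls, cons))
```

With wt_cons set to 0.1, the consistency term contributes a tenth as much per unit as the other two terms, so the supervised terms dominate when all three losses have similar magnitudes.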

Evaluation

python evaluate.py --ckpt exp_id_folder

Pre-trained weights

Link to download I3D pre-trained weights:

https://github.com/piergiaj/pytorch-i3d/tree/master/models

We used rgb_charades.pt in our experiments.

Datasets

UCF101-24 splits: Pickle files

JHMDB-21 splits: Text files

Set the data path for the UCF101 videos in ucf_dataloader.py, inside the datasets directory.
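Before training, it can help to check that the labeled and unlabeled pickle splits load and have the expected proportions. The sketch below is an assumption-laden helper, not part of the repository: load_split and summarize_splits are hypothetical names, and the code only assumes that each pickle file holds an object supporting len() (for example a list of annotations). The actual annotation structure is not described here, so it is left untouched.

```python
import pickle
from pathlib import Path


def load_split(path):
    """Load one annotation split from a pickle file and return it unchanged."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def summarize_splits(labeled_path, unlabeled_path):
    """Return (num_labeled, num_unlabeled, labeled_fraction) for two splits.

    Assumption: both pickled objects support len().
    """
    labeled = load_split(labeled_path)
    unlabeled = load_split(unlabeled_path)
    total = len(labeled) + len(unlabeled)
    fraction = len(labeled) / total if total else 0.0
    return len(labeled), len(unlabeled), fraction
```

For example, summarize_splits("train_annots_20_labeled.pkl", "train_annots_80_unlabeled.pkl") would report the sizes of the two files used in the training commands. Only unpickle files from a source you trust, since pickle can execute arbitrary code on load.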

Results

[Figure: main results]

Citation

If you find this work useful, please consider citing the following paper:

@InProceedings{Kumar_2022_CVPR,
    author    = {Kumar, Akash and Rawat, Yogesh Singh},
    title     = {End-to-End Semi-Supervised Learning for Video Action Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {14700-14710}
}

License: MIT License

