dingfengshi / tridetplus

Code for the paper, Temporal Action Localization with Enhanced Instant Discriminability


Temporal Action Localization with Enhanced Instant Discriminability

Overview

This repository contains the code for the paper Temporal Action Localization with Enhanced Instant Discriminability. The code extends the TriDet codebase.

Installation

  1. Ensure that PyTorch and CUDA are installed. This code requires PyTorch >= 1.11; we used version 1.11.0 in our experiments.

  2. Install the required packages by running the following command:

     pip install -r requirements.txt

  3. Install NMS:

     cd ./libs/utils
     python setup.py install --user
     cd ../..
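
Before building the NMS extension, it can help to confirm that the environment meets the PyTorch requirement from step 1. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `meets_minimum` are hypothetical, and the check only compares the major and minor version numbers.

```python
import re

MIN_TORCH = (1, 11)  # minimum PyTorch version stated in the installation steps


def parse_version(version):
    """Return (major, minor) from a string such as '1.11.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=MIN_TORCH):
    """True if the (major, minor) part of `version` is at least `minimum`."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "OK" if meets_minimum(torch.__version__) else "too old"
        print("PyTorch", torch.__version__, status)
        print("CUDA available:", torch.cuda.is_available())
```

If the script reports that CUDA is unavailable, the installed PyTorch build and the local CUDA setup are worth checking before running the NMS install step.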

Data Preparation

VideoMAEv2 features on the HACS dataset

The pre-extracted features can be downloaded from this Link (Password: fqsy). They are extracted with window size 16 and stride 8.
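
To make the window and stride settings concrete, the sketch below maps a feature index to the frame range it would cover. It assumes that windows start at frame 0 and that no padding is applied, neither of which is stated here; the function name `feature_frame_span` is hypothetical.

```python
def feature_frame_span(index, window=16, stride=8):
    """Return the half-open frame range [start, end) covered by one feature.

    Assumes the first window starts at frame 0 and no padding is used;
    the actual extraction pipeline may differ.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    start = index * stride
    return start, start + window


if __name__ == "__main__":
    for i in range(3):
        print(i, feature_frame_span(i))
```

Under these assumptions, consecutive features overlap by 8 frames, because each window is 16 frames long and the next one starts 8 frames later.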

Note: Because of the large number of videos, extracting the features takes about a week on 12 V100 GPUs. To reduce extraction time, we extracted some of the features using VideoMAEv2 with half-precision weights, so those features are stored as float16. Convert the features to float32 before use:

feats = np.load(feature_path).astype(np.float32)
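
The conversion can be wrapped in a small loader so that every feature file reaches the model with the same dtype. This is a sketch rather than the repository's data loader: `load_features` is a hypothetical helper, and the file name and array shape in the demo are made up.

```python
import os
import tempfile

import numpy as np


def load_features(feature_path):
    """Load a .npy feature file and return it as float32."""
    feats = np.load(feature_path)
    # Some features were saved as float16; upcast so downstream code sees
    # a single dtype. copy=False avoids a copy if the data is already float32.
    return feats.astype(np.float32, copy=False)


if __name__ == "__main__":
    # Hypothetical stand-in for a downloaded feature file.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example_video.npy")
        np.save(path, np.ones((5, 8), dtype=np.float16))
        feats = load_features(path)
        print(feats.dtype, feats.shape)
```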

I3D features on the Charades dataset

The pre-extracted I3D features for the Charades dataset can be downloaded from this Link (Password: ixuq). They are extracted with window size 16 and stride 4.
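
As a rough sanity check on a downloaded file, the number of feature vectors can be estimated from the video's frame count. The sketch assumes a plain sliding window with no padding and no partial final window, which may not match the actual extraction; `num_features` is a hypothetical helper.

```python
def num_features(num_frames, window=16, stride=4):
    """Estimate how many sliding windows fit in a video of `num_frames` frames.

    Assumes no padding and that partial windows at the end are dropped.
    """
    if num_frames < window:
        return 0
    return (num_frames - window) // stride + 1


if __name__ == "__main__":
    # The same 160-frame clip under the Charades (stride 4) and HACS (stride 8) settings.
    print(num_features(160, window=16, stride=4))
    print(num_features(160, window=16, stride=8))
```

A feature file whose length differs from this estimate by a few entries is not necessarily wrong, since padding or frame-rate resampling during extraction would change the count.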

Note: We provide the JSON files for Charades and MultiTHUMOS in the data folder.

References

If you find this work helpful, please consider citing our papers:

@inproceedings{shi2023tridet,
  title={TriDet: Temporal Action Detection with Relative Boundary Modeling},
  author={Shi, Dingfeng and Zhong, Yujie and Cao, Qiong and Ma, Lin and Li, Jia and Tao, Dacheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={18857--18866},
  year={2023}
}
@article{shi2023temporal,
  title={Temporal Action Localization with Enhanced Instant Discriminability},
  author={Shi, Dingfeng and Cao, Qiong and Zhong, Yujie and An, Shan and Cheng, Jian and Zhu, Haogang and Tao, Dacheng},
  journal={arXiv preprint arXiv:2309.05590},
  year={2023}
}

License: MIT License