CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos
We release the code of our paper CoTeRe-Net (ECCV 2020 Spotlight).
```BibTeX
@inproceedings{shi2020cotere,
  title={CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos},
  author={Shi, Zhensheng and Guan, Cheng and Cao, Liangjie and Li, Qianqian and Liang, Ju and Gu, Zhaorui and Zheng, Haiyong and Zheng, Bing},
  booktitle={European Conference on Computer Vision},
  pages={379--396},
  year={2020},
  organization={Springer}
}
```
Introduction
CoTeRe-Net is a novel relation model that discovers relations of both implicit and explicit cues as well as their collaboration in videos. It concerns Collaborative Ternary Relations (CoTeRe), where the ternary relation involves channel (C, for implicit), temporal (T, for implicit), and spatial (S, for explicit) relation (R).
This code is based on the PySlowFast codebase. The core implementation of CoTeRe-Net is in `lib/models/cotere_builder.py` and `lib/models/ctsr_helper.py`. We devise a flexible and effective CTSR module that collaborates ternary relations for 3D-CNNs, and then construct CoTeRe-Nets for action recognition.
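For intuition only, here is a parameter-free NumPy sketch of how gates over the channel (C), temporal (T), and spatial (S) axes of a video feature map can be combined. This is an illustration of the general idea, not the paper's CTSR module: the real implementation in `lib/models/ctsr_helper.py` uses learned layers, and the gating scheme below is our simplification.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ternary_gates(feat):
    """Toy channel/temporal/spatial gating on a (C, T, H, W) feature map.

    Illustrative only: each "relation" here is just a pooled descriptor
    squashed to (0, 1); the actual CTSR module learns these gates.
    """
    # Channel descriptor: pool over time and space -> (C,)
    ch = sigmoid(feat.mean(axis=(1, 2, 3)))
    # Temporal descriptor: pool over channels and space -> (T,)
    tm = sigmoid(feat.mean(axis=(0, 2, 3)))
    # Spatial descriptor: pool over channels and time -> (H, W)
    sp = sigmoid(feat.mean(axis=(0, 1)))
    # Collaborate the three relations by broadcasting their product
    gate = (ch[:, None, None, None]
            * tm[None, :, None, None]
            * sp[None, None, :, :])
    return feat * gate

x = np.random.rand(8, 4, 7, 7).astype(np.float32)
y = ternary_gates(x)
```

The output keeps the input shape, so such a module can be dropped between the stages of a 3D-CNN backbone.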
Requirements
- Python >= 3.6
- Numpy
- PyTorch 1.3
- fvcore: `pip install 'git+https://github.com/facebookresearch/fvcore'`
- torchvision that matches the PyTorch installation. You can install them together at pytorch.org to make sure of this.
- simplejson: `pip install simplejson`
- GCC >= 4.9
- PyAV: `conda install av -c conda-forge`
- ffmpeg (4.0 is preferred; installed along with PyAV)
- PyYaml (installed along with fvcore)
- tqdm (installed along with fvcore)
- iopath: `pip install -U iopath` or `conda install -c iopath iopath`
- psutil: `pip install psutil`
- OpenCV: `pip install opencv-python`
- torchvision: `pip install torchvision` or `conda install torchvision -c pytorch`
Datasets
Something-Something V1
- Download the dataset and annotations from dataset provider.
- Download the frame list from the following links: (train, val).
- Rename all frame files by prefixing each with its folder name, for example: `1/00001.jpg => 1/1_000001.jpg`, `999/00001.jpg => 999/999_000001.jpg`
- Put all annotation json files and the frame lists in the same folder, and set `DATA.PATH_TO_DATA_DIR` to that path. Set `DATA.PATH_PREFIX` to the path of the folder containing the extracted frames.
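The renaming step above can be sketched with a short Python helper. `rename_frames` is a hypothetical function, not part of this repo; it assumes frames are purely numeric files (e.g. `00001.jpg`) inside per-video folders, and pads frame indices to 6 digits as in the example above.

```python
import os

def rename_frames(root):
    """Prefix each frame file with its folder name and zero-pad the
    frame index to 6 digits, e.g. 1/00001.jpg -> 1/1_000001.jpg.
    Hypothetical helper; adjust to your actual dataset layout."""
    for folder in os.listdir(root):
        fdir = os.path.join(root, folder)
        if not os.path.isdir(fdir):
            continue
        for fname in os.listdir(fdir):
            stem, ext = os.path.splitext(fname)
            if not stem.isdigit():
                continue  # skip already-renamed or unrelated files
            new_name = f"{folder}_{int(stem):06d}{ext}"
            os.rename(os.path.join(fdir, fname),
                      os.path.join(fdir, new_name))
```

Running it a second time is a no-op, since renamed files (containing `_`) are skipped.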
Running
- To train and test a CoTeRe-ResNet-18 model from scratch on Something-Something V1, run the command below. You can build CoTeRe-Net variants by setting `COTERE.TYPE`.

```
python tools/run_net.py \
  --cfg configs/SSv1/R3D_18_COTERE_32x1.yaml \
  DATA.PATH_TO_DATA_DIR path_to_frame_list \
  DATA.PATH_PREFIX path_to_frames \
  COTERE.TYPE CTSR
```
- You can also set the variables (`DATA_PATH`, `FRAME_PATH`, `COTERE_TYPE`) in `scripts/run_ssv1_r3d_18_32x1.sh`, and then run the script:

```
bash scripts/run_ssv1_r3d_18_32x1.sh
```
Models
We will provide the models and results later.
Acknowledgement
We really appreciate the contributors of the following codebases.