CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos

We release the code of our ECCV 2020 Spotlight paper, CoTeRe-Net. If you find this work useful, please cite:

@inproceedings{shi2020cotere,
  title={CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos},
  author={Shi, Zhensheng and Guan, Cheng and Cao, Liangjie and Li, Qianqian and Liang, Ju and Gu, Zhaorui and Zheng, Haiyong and Zheng, Bing},
  booktitle={European Conference on Computer Vision},
  pages={379--396},
  year={2020},
  organization={Springer}
}

Introduction

CoTeRe-Net is a novel relation model that discovers relations of both implicit and explicit cues, as well as their collaboration, in videos. It models Collaborative Ternary Relations (CoTeRe), where the ternary relation (R) combines channel (C, implicit), temporal (T, implicit), and spatial (S, explicit) relations.

This code is based on the PySlowFast codebase. The core implementation of CoTeRe-Net is in lib/models/cotere_builder.py and lib/models/ctsr_helper.py. We devise a flexible and effective CTSR module that collaborates ternary relations for 3D-CNNs, and then construct CoTeRe-Nets for action recognition.
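
To give a rough feel for the idea, the PyTorch sketch below gates an (N, C, T, H, W) feature map with channel, temporal, and spatial attention branches and combines them into one residual gate. This is only an illustration, not the implementation in lib/models/ctsr_helper.py; the class name CTSRSketch, the reduction parameter, and the specific pooling and convolution choices are our assumptions.

    import torch
    import torch.nn as nn

    class CTSRSketch(nn.Module):
        """Minimal sketch of a collaborative channel/temporal/spatial
        relation gate for a 5-D video feature map (N, C, T, H, W).
        NOT the implementation in lib/models/ctsr_helper.py."""

        def __init__(self, channels, reduction=8):
            super().__init__()
            # Channel relation (implicit): squeeze T,H,W, then excite C.
            self.channel_fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
            )
            # Temporal relation (implicit): 1-D conv along the T axis.
            self.temporal_conv = nn.Conv1d(1, 1, kernel_size=3, padding=1)
            # Spatial relation (explicit): 2-D conv over the H,W plane.
            self.spatial_conv = nn.Conv2d(1, 1, kernel_size=7, padding=3)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            n, c, t, h, w = x.shape
            # Channel attention from global average pooling over T,H,W.
            ca = self.channel_fc(x.mean(dim=(2, 3, 4))).view(n, c, 1, 1, 1)
            # Temporal attention from pooling over C,H,W.
            ta = self.temporal_conv(
                x.mean(dim=(3, 4)).mean(dim=1, keepdim=True)).view(n, 1, t, 1, 1)
            # Spatial attention from pooling over C,T.
            sa = self.spatial_conv(
                x.mean(dim=(1, 2)).unsqueeze(1)).view(n, 1, 1, h, w)
            # Collaborate the three relations into one gate, applied residually.
            return x + x * self.sigmoid(ca + ta + sa)

    # Smoke test: the module preserves the input shape.
    y = CTSRSketch(64)(torch.randn(2, 64, 8, 14, 14))
    assert y.shape == (2, 64, 8, 14, 14)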

Requirements

  • Python >= 3.6
  • Numpy
  • PyTorch 1.3
  • fvcore: pip install 'git+https://github.com/facebookresearch/fvcore'
  • torchvision that matches the PyTorch installation: pip install torchvision or conda install torchvision -c pytorch. You can install them together at pytorch.org to make sure of this.
  • simplejson: pip install simplejson
  • GCC >= 4.9
  • PyAV: conda install av -c conda-forge
  • ffmpeg (4.0 is preferred, will be installed along with PyAV)
  • PyYaml: (will be installed along with fvcore)
  • tqdm: (will be installed along with fvcore)
  • iopath: pip install -U iopath or conda install -c iopath iopath
  • psutil: pip install psutil
  • OpenCV: pip install opencv-python
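
As a quick sanity check of the environment (our suggestion, not part of the codebase), you can verify that the key packages import and report their versions:

    import torch, torchvision, av, cv2, simplejson, psutil, fvcore, iopath

    print("PyTorch:", torch.__version__)
    print("torchvision:", torchvision.__version__)
    print("PyAV:", av.__version__)
    print("OpenCV:", cv2.__version__)
    print("CUDA available:", torch.cuda.is_available())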

Datasets

Something-Something V1

  • Download the dataset and annotations from the dataset provider.
  • Download the frame list from the following links: (train, val).
  • Rename all frame files by adding the prefix "<folder>_0", where <folder> is the video's folder ID, for example: 1/00001.jpg => 1/1_000001.jpg, 999/00001.jpg => 999/999_000001.jpg (see the renaming sketch after this list).
  • Put all annotation json files and the frame lists in the same folder, and set DATA.PATH_TO_DATA_DIR to the path. Set DATA.PATH_PREFIX to be the path to the folder containing extracted frames.
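
Since every folder must be renamed consistently, a small script helps. The sketch below is our convenience snippet, not part of the codebase; frames_root is a placeholder for your extracted-frames directory (DATA.PATH_PREFIX), and it should be run only once, since re-running would add the prefix again.

    import os

    frames_root = "path_to_frames"  # placeholder: your extracted-frames dir
    for folder in os.listdir(frames_root):
        folder_path = os.path.join(frames_root, folder)
        if not os.path.isdir(folder_path):
            continue
        for name in sorted(os.listdir(folder_path)):
            # 00001.jpg -> <folder>_000001.jpg (prepend "<folder>_0")
            os.rename(os.path.join(folder_path, name),
                      os.path.join(folder_path, "%s_0%s" % (folder, name)))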

Running

  • To train and test a CoTeRe-ResNet-18 model from scratch on Something-Something V1, run the command below. You can build CoTeRe-Net variants by setting COTERE.TYPE.

    python tools/run_net.py \
      --cfg configs/SSv1/R3D_18_COTERE_32x1.yaml \
      DATA.PATH_TO_DATA_DIR path_to_frame_list \
      DATA.PATH_PREFIX path_to_frames \
      COTERE.TYPE CTSR
    

    You can also set the variables (DATA_PATH, FRAME_PATH, COTERE_TYPE) in scripts/run_ssv1_r3d_18_32x1.sh and then run the script:

    bash scripts/run_ssv1_r3d_18_32x1.sh
    

Models

Pretrained models and results will be provided later.

Acknowledgement

We sincerely appreciate the contributors of the codebases this project builds on, especially PySlowFast.


License

Apache License 2.0

