xdshang/VidVRD-helper

dataset evaluation video-analysis visual-relationship-detection object-detection action-recognition spatio-temporal

Video Visual Relation Detection Helper

This repository contains helper functions for working with the ImageNet-VidVRD and VidOR datasets. It also contains scripts for evaluating several related tasks: video object detection, action detection, and visual relation detection.
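As an illustration of the kind of computation such evaluation involves, the sketch below computes a volumetric IoU (vIoU) between two object trajectories, a criterion commonly used to match predicted and ground-truth trajectories in video detection tasks. It is a minimal sketch and not this repository's implementation: the function names, the trajectory format (a dict mapping frame index to an `(x1, y1, x2, y2)` box), and the continuous-coordinate area convention are all assumptions made for the example.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box in continuous coordinates (assumed convention)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def box_intersection(a, b):
    """Area of the overlap between two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def viou(traj_a, traj_b):
    """Volumetric IoU of two trajectories given as {frame_index: box} dicts.

    Hypothetical helper for illustration: box areas are summed over frames
    to form "volumes", and intersections are taken only on shared frames.
    """
    vol_a = sum(box_area(b) for b in traj_a.values())
    vol_b = sum(box_area(b) for b in traj_b.values())
    shared_frames = traj_a.keys() & traj_b.keys()
    inter = sum(box_intersection(traj_a[f], traj_b[f]) for f in shared_frames)
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0


# Two 2-frame trajectories with identical boxes that share only frame 1.
pred = {0: (0, 0, 10, 10), 1: (0, 0, 10, 10)}
gt = {1: (0, 0, 10, 10), 2: (0, 0, 10, 10)}
score = viou(pred, gt)  # intersection 100 / union 300
```

A prediction would typically be counted as a match when this score exceeds a chosen threshold; the threshold and the matching policy depend on the task and are not specified here.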

Please note that the enclosed baseline only works for ImageNet-VidVRD. A generalized and improved baseline can be found here.

Please cite the following papers if the datasets help your research:

@inproceedings{shang2017video,
    author={Shang, Xindi and Ren, Tongwei and Guo, Jingfan and Zhang, Hanwang and Chua, Tat-Seng},
    title={Video Visual Relation Detection},
    booktitle={ACM International Conference on Multimedia},
    address={Mountain View, CA, USA},
    month={October},
    year={2017}
}

@inproceedings{shang2019annotating,
    author={Shang, Xindi and Di, Donglin and Xiao, Junbin and Cao, Yu and Yang, Xun and Chua, Tat-Seng},
    title={Annotating Objects and Relations in User-Generated Videos},
    booktitle={ACM International Conference on Multimedia Retrieval},
    address={Ottawa, ON, Canada},
    month={June},
    year={2019}
}

About

To keep up to date with the VRU Grand Challenge, please use https://github.com/NExTplusplus/VidVRD-helper

MIT License
