# VideoLanguageFuturePrediction

VLEP dataset for video-and-language future event prediction.

**What is More Likely to Happen Next? Video-and-Language Future Event Prediction**

Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
## Table of Contents

- VLEP Dataset
- Evaluation and CodaLab Submission
- Related Work
- Citation
- Contact
## VLEP Dataset
The dataset is released in the data directory; please see data/README.md for details.
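As a rough illustration of working with the release (the actual file names and record fields are documented in data/README.md; the `example_id`, `events`, and `answer` fields below are assumptions for this sketch), annotations in this style of dataset are commonly distributed as JSON Lines, one example per line:

```python
import json

# Hypothetical records mimicking a two-choice future-event annotation file.
# The real schema is documented in data/README.md.
sample_lines = [
    '{"example_id": 0, "events": ["He opens the door.", "He sits down."], "answer": 0}',
    '{"example_id": 1, "events": ["She laughs.", "She leaves the room."], "answer": 1}',
]

def load_jsonl(lines):
    """Parse JSON Lines input: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]

examples = load_jsonl(sample_lines)
print(len(examples))          # number of examples loaded
print(examples[0]["events"])  # the two candidate future events
```

With a real file, the same loader would read `load_jsonl(open(path))`.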
## Evaluation and CodaLab Submission
We only release ground-truth answers for the train and dev splits. To get results on the test split, please submit your predictions to our CodaLab evaluation server, following the instructions in standalone_eval/README.md.
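For local sanity checks on the dev split, accuracy is simply the fraction of examples whose predicted event index matches the ground-truth answer. A minimal sketch (the exact prediction format expected by the server is specified in standalone_eval/README.md; the dict-of-indices format here is an assumption):

```python
def accuracy(predictions, ground_truth):
    """Fraction of examples whose predicted event index matches the answer.

    predictions / ground_truth: dicts mapping example_id -> event index (0 or 1).
    Missing predictions count as incorrect.
    """
    correct = sum(
        1 for ex_id, ans in ground_truth.items()
        if predictions.get(ex_id) == ans
    )
    return correct / len(ground_truth)

# Toy usage: 2 of 3 predictions agree with the ground truth.
preds = {"ex1": 0, "ex2": 1, "ex3": 1}
gold = {"ex1": 0, "ex2": 0, "ex3": 1}
print(accuracy(preds, gold))
```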
## Related Work
- TVC (Video+Dialogue Captioning)
- TVR (Video+Dialogue Retrieval)
- TVQA (Localized Video QA)
- TVQA+ (Spatio-Temporal Video QA)
- recurrent-transformer (Coherent Video Paragraph Captioning)
## Citation
If you find this code useful for your research, please cite our paper:
```bibtex
@inproceedings{lei2020vlep,
  title={What is More Likely to Happen Next? Video-and-Language Future Event Prediction},
  author={Lei, Jie and Yu, Licheng and Berg, Tamara L and Bansal, Mohit},
  booktitle={EMNLP},
  year={2020}
}
```
## Contact
Jie Lei, jielei@cs.unc.edu