There are 28 repositories under the video-understanding topic.
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
A curated list of action recognition and related area resources
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
An open-source toolbox for action understanding based on PyTorch
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton-based action recognition models, and practical applications for video tagging and sport action detection.
Temporal Segment Networks (TSN) in PyTorch
Video Foundation Models & Data for Multimodal Understanding
awesome grounding: A curated list of research papers in visual grounding
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Temporal action detection with SSN
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
A collection of recent video understanding datasets, under construction!
Temporal Segments LSTM and Temporal-Inception for Activity Recognition
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Official code for MiniGPT4-video
Tools for movie and video research
Dataset, code and model for the CVPR'20 paper "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction", and for the ECCV'20 SimAug paper.
ActionVLAD for video action classification (CVPR 2017)
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, video representation learning and temporal detection.
The 2nd-place solution to the YouTube-8M Video Understanding Challenge by Team Monkeytyping (based on TensorFlow)
PyTorch implementation of "Object level Visual Reasoning in Videos", F. Baradel, N. Neverova, C. Wolf, J. Mille, G. Mori, ECCV 2018
Paper list of activity prediction and related area
SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap (CVPR24 - CVSports workshop)
[CVPR 2020] Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation (PyTorch)