Multimedia Computing Group, Nanjing University (MCG-NJU)

Location: Nanjing

Home Page: mcg.nju.edu.cn

Repositories of the Multimedia Computing Group, Nanjing University

VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Language: Python | License: NOASSERTION | Stargazers: 1306 | Issues: 16 | Issues: 119

MixFormer

[CVPR 2022 Oral & TPAMI 2024] MixFormer: End-to-End Tracking with Iterative Mixed Attention

Language: Python | License: MIT | Stargazers: 445 | Issues: 7 | Issues: 105

SparseBEV

[ICCV 2023] SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos

Language: Python | License: MIT | Stargazers: 324 | Issues: 9 | Issues: 80

CamLiFlow

[CVPR 2022 Oral & TPAMI 2023] Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion

SparseOcc

[ECCV 2024] Fully Sparse 3D Occupancy Prediction & RayIoU Evaluation Metric

Language: Python | License: Apache-2.0 | Stargazers: 196 | Issues: 5 | Issues: 38

MeMOTR

[ICCV 2023] MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking

Language: Python | License: MIT | Stargazers: 137 | Issues: 5 | Issues: 19

MixFormerV2

[NeurIPS 2023] MixFormerV2: Efficient Fully Transformer Tracking

Language: Python | License: MIT | Stargazers: 134 | Issues: 10 | Issues: 39

LinK

[CVPR 2023] LinK: Linear Kernel for LiDAR-based 3D Perception

Language: Python | License: MIT | Stargazers: 81 | Issues: 7 | Issues: 7

MOTIP

Multiple Object Tracking as ID Prediction

Language: Python | License: Apache-2.0 | Stargazers: 72 | Issues: 6 | Issues: 24

SGM-VFI

[CVPR 2024] Sparse Global Matching for Video Frame Interpolation with Large Motion

BIVDiff

[CVPR 2024] BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

PointTAD

[NeurIPS 2022] PointTAD: Multi-Label Temporal Action Detection with Learnable Query Points

Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 4 | Issues: 6

CoMAE

[AAAI 2023 Oral] CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets

VFIMamba

VFIMamba: Video Frame Interpolation with State Space Models

Language: Python | License: Apache-2.0 | Stargazers: 27 | Issues: 1 | Issues: 0

DEQDet

[ICCV 2023] Deep Equilibrium Object Detection

Language: Jupyter Notebook | Stargazers: 21 | Issues: 3 | Issues: 2

EVAD

[ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement

Language: Python | License: NOASSERTION | Stargazers: 20 | Issues: 2 | Issues: 4

MGMAE

[ICCV 2023] MGMAE: Motion Guided Masking for Video Masked Autoencoding

Language: Python | License: MIT | Stargazers: 19 | Issues: 2 | Issues: 2

SPLAM

[ECCV 2024 Oral] SPLAM: Accelerating Image Generation with Sub-path Linear Approximation Model

Language: Python | License: MIT | Stargazers: 13 | Issues: 0 | Issues: 0

SportsHHI

[CVPR 2024] SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos

Language: Python | Stargazers: 11 | Issues: 0 | Issues: 0

ZeroI2V

[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video

Language: Python | License: Apache-2.0 | Stargazers: 11 | Issues: 0 | Issues: 0

AMD

[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models

Dynamic-MDETR

[TPAMI 2024] Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding

Language: Python | Stargazers: 10 | Issues: 0 | Issues: 0

StageInteractor

[ICCV 2023] StageInteractor: Query-based Object Detector with Cross-stage Interaction

Language: Python | License: Apache-2.0 | Stargazers: 9 | Issues: 2 | Issues: 0

VLG

VLG: General Video Recognition with Web Textual Knowledge (https://arxiv.org/abs/2212.01638)

Language: Python | Stargazers: 8 | Issues: 1 | Issues: 0

DGN

[IJCV 2023] Dual Graph Networks for Pose Estimation in Crowded Scenes

Language: Python | Stargazers: 7 | Issues: 2 | Issues: 0

ViT-TAD

[CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos

Language: Python | Stargazers: 7 | Issues: 0 | Issues: 0

VideoEval

VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model

Language: Python | Stargazers: 6 | Issues: 0 | Issues: 0

PRVG

[CVIU 2024] End-to-end dense video grounding via parallel regression

Language: Python | Stargazers: 5 | Issues: 0 | Issues: 0

LogN

[IJCV 2024] Logit Normalization for Long-Tail Object Detection

Language: Python | License: Apache-2.0 | Stargazers: 4 | Issues: 1 | Issues: 0

ProVP

[IJCV] Progressive Visual Prompt Learning with Contrastive Feature Re-formation

Language: Python | Stargazers: 3 | Issues: 0 | Issues: 0