simon3dv

Company: SenseTime

Location: Shanghai

Home Page: http://simon3dv.github.io/

simon3dv's starred repositories

VideoGPT-plus

Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

Language: Python | License: CC-BY-4.0 | Stargazers: 148 | Issues: 0

VTG-LLM

[Preprint] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Language: Python | License: Apache-2.0 | Stargazers: 32 | Issues: 0

UVCOM

[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

Language: Python | License: MIT | Stargazers: 56 | Issues: 0

Awesome-Temporal-Action-Detection-Temporal-Action-Proposal-Generation

Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation

Stargazers: 406 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 121 | Issues: 0

CVPR2024-Papers-with-Code

A collection of CVPR 2024 papers and open-source projects

Stargazers: 17268 | Issues: 0

VTimeLLM

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

Language: Python | License: NOASSERTION | Stargazers: 180 | Issues: 0

edit-one-for-all

✏️ Edit One for All: Interactive Batch Image Editing (CVPR 2024)

Language: Python | Stargazers: 42 | Issues: 0

ShareGPT4Video

An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Language: Python | Stargazers: 1160 | Issues: 0

awesome-cvpr-2024

🤩 An AWESOME Curated List of Papers, Workshops, Datasets, and Challenges from CVPR 2024

Language: Python | License: CC0-1.0 | Stargazers: 119 | Issues: 0

InternLM

Official release of InternLM2.5 7B base and chat models. 1M context support

Language: Python | License: Apache-2.0 | Stargazers: 5847 | Issues: 0

ShareGPT4V

[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions

Language: Python | Stargazers: 63 | Issues: 0

T3AL

Official Pytorch implementation of "Test-Time Zero-Shot Temporal Action Localization", CVPR 2024

Language: Python | Stargazers: 35 | Issues: 0

AiOS

[CVPR 2024] Official Code for "AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation"

Language: Python | License: NOASSERTION | Stargazers: 173 | Issues: 0

OpenTAD

OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.

Language: Python | License: Apache-2.0 | Stargazers: 109 | Issues: 0

MotionDiffuse

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model

Language: Python | License: NOASSERTION | Stargazers: 811 | Issues: 0

Stargazers: 46 | Issues: 0

PoseAnything

A Graph-Based Approach for Category-Agnostic Pose Estimation [ECCV 2024]

Language: Python | License: Apache-2.0 | Stargazers: 283 | Issues: 0

VideoMAE-Action-Detection

[NeurIPS 2022 Spotlight] VideoMAE for Action Detection

Language: Python | License: NOASSERTION | Stargazers: 47 | Issues: 0

AlphAction

Spatio-Temporal Action Localization System

Language: Python | Stargazers: 396 | Issues: 0

hiera

Hiera: A fast, powerful, and simple hierarchical vision transformer.

Language: Python | License: Apache-2.0 | Stargazers: 723 | Issues: 0

AlphaPose

Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System

Language: Python | License: NOASSERTION | Stargazers: 7856 | Issues: 0

STALE

[ECCV 2022] Official PyTorch implementation of the paper "Zero-Shot Temporal Action Detection via Vision-Language Prompting"

Language: Python | Stargazers: 97 | Issues: 0

Language: Jupyter Notebook | Stargazers: 8 | Issues: 0

BIKE

【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

Language: Python | License: MIT | Stargazers: 153 | Issues: 0

CaFo

[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

Language: Python | License: MIT | Stargazers: 335 | Issues: 0

VideoMamba

[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding

Language: Python | License: Apache-2.0 | Stargazers: 708 | Issues: 0

InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model approaching GPT-4V performance.

Language: Python | License: MIT | Stargazers: 4247 | Issues: 0

Long-CLIP

[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"

Language: Python | Stargazers: 483 | Issues: 0

adapt-image-models

[ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition

Language: Python | License: Apache-2.0 | Stargazers: 257 | Issues: 0