Lee's starred repositories

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 17414 | Issues: 0

video2dataset

Easily create large video datasets from video URLs

Language: Python | License: MIT | Stargazers: 478 | Issues: 0

VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

Language: Python | License: Apache-2.0 | Stargazers: 341 | Issues: 0

PLLaVA

Official repository for the paper PLLaVA

Language: Python | Stargazers: 417 | Issues: 0

MAD

MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions

Language: Python | License: MIT | Stargazers: 138 | Issues: 0

AVE

This is the official repository for our ECCV 2022 paper titled, "The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing"

Language: Python | Stargazers: 39 | Issues: 0

Awesome-Opus-Long-Video-Understanding

Awesome research works specifically focused on long-form video understanding.

License: MIT | Stargazers: 6 | Issues: 0

movienet-tools

Tools for movie and video research

Language: C++ | Stargazers: 257 | Issues: 0

ChatVID

Chat about anything on any video!

Language: Python | License: MIT | Stargazers: 33 | Issues: 0

DynMoE

[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

Language: Python | License: Apache-2.0 | Stargazers: 18 | Issues: 0

zju-learning-assistant

Helps you quickly download all your course materials 😋

Language: Rust | License: MIT | Stargazers: 193 | Issues: 0

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python | License: CC-BY-4.0 | Stargazers: 1021 | Issues: 0

MM-VID

Open source implementation of the paper "MM-Vid: Advancing Video Understanding with GPT-4V(ision)".

Language: Python | License: MIT | Stargazers: 6 | Issues: 0

MovieChat

[CVPR 2024] 🎬💭 Chat with over 10K frames of video!

Language: Python | License: BSD-3-Clause | Stargazers: 435 | Issues: 0

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

Stargazers: 820 | Issues: 0

VLog

Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.

Language: Python | License: MIT | Stargazers: 502 | Issues: 0

trx

Temporal-Relational CrossTransformers (CVPR 2021)

Language: Python | Stargazers: 105 | Issues: 0

AliceMind

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Language: Python | License: Apache-2.0 | Stargazers: 1958 | Issues: 0

kinetics-downloader

Download DeepMind's Kinetics dataset.

Language: Python | License: MIT | Stargazers: 258 | Issues: 0

InternVideo

Video Foundation Models & Data for Multimodal Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1048 | Issues: 0

all-in-one

[CVPR2023] All in One: Exploring Unified Video-Language Pre-training

Language: Python | Stargazers: 274 | Issues: 0

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 9046 | Issues: 0

Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Language: Python | Stargazers: 415 | Issues: 0

XPretrain

Multi-modality pre-training

Language: Python | License: NOASSERTION | Stargazers: 448 | Issues: 0

VideoX

VideoX: a collection of video cross-modal models

Language: Python | License: NOASSERTION | Stargazers: 937 | Issues: 0

PromptSRC

[ICCV'23 Main Track, WECIA'23 Oral] Official repository of paper titled "Self-regulating Prompts: Foundational Model Adaptation without Forgetting".

Language: Python | License: MIT | Stargazers: 192 | Issues: 0

CVPR2024-Papers-with-Code

A collection of CVPR 2024 papers and open-source projects

Stargazers: 16731 | Issues: 0

Awesome-Mamba-Collection

A curated collection of papers, tutorials, videos, and other valuable resources related to Mamba.

License: MIT | Stargazers: 190 | Issues: 0

Interview-for-Algorithm-Engineer

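[Three Years of Interviews, Five Years of Mock Exams] An algorithm engineer's handbook. Shared experience from interviews and written tests across the AI industry, covering AIGC, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, image processing, the metaverse, SLAM, and more.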

License: GPL-3.0 | Stargazers: 187 | Issues: 0

PhotoMaker

PhotoMaker

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 8529 | Issues: 0