Yuchong Sun 孙宇冲 (ycsun1972)

Company: Renmin University of China

Location: Beijing, China

Yuchong Sun 孙宇冲's starred repositories

Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Stargazers: 321, Issues: 0
Language: Jupyter Notebook, Stargazers: 3, Issues: 0

Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

Stargazers: 2909, Issues: 0

Awesome-LLMs-for-Video-Understanding

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

Stargazers: 1017, Issues: 0
Language: Python, License: NOASSERTION, Stargazers: 8231, Issues: 0
Stargazers: 1056, Issues: 0

Visual-Instruction-Tuning

SVIT: Scaling up Visual Instruction Tuning

Language: Python, License: MIT, Stargazers: 154, Issues: 0
Language: Python, License: NOASSERTION, Stargazers: 711, Issues: 0

ceval

Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]

Language: Python, License: MIT, Stargazers: 1560, Issues: 0

Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Language: Python, License: MIT, Stargazers: 3526, Issues: 0

Valley

The official repository of "Video assistant towards large language model makes everything easy"

Language: Python, Stargazers: 193, Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

Stargazers: 10864, Issues: 0

VisionLLM

VisionLLM Series

Language: Python, License: Apache-2.0, Stargazers: 764, Issues: 0

esper

ESPER

Language: Python, Stargazers: 23, Issues: 0

TestOfTime

Official code for our CVPR 2023 paper: Test of Time: Instilling Video-Language Models with a Sense of Time

Language: Python, License: MIT, Stargazers: 45, Issues: 0

LLM-in-Vision

Recent LLM-based CV and related works. Welcome to comment/contribute!

Stargazers: 802, Issues: 0

Text2Poster-ICASSP-22

Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"

Language: Python, License: MIT, Stargazers: 201, Issues: 0

FrozenBiLM

[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models

Language: Python, License: Apache-2.0, Stargazers: 151, Issues: 0

MMTG

[ACM MM 2022]: Multi-Modal Experience Inspired AI Creation

Language: Python, Stargazers: 18, Issues: 0

awesome-multimodal-dialogue

Paper, dataset and code list for multimodal dialogue.

License: MIT, Stargazers: 18, Issues: 0

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook, License: BSD-3-Clause, Stargazers: 9298, Issues: 0

language-guided-animation

Language-Guided Face Animation by Recurrent StyleGAN-based Generator

Stargazers: 18, Issues: 0

Awesome-Computer-Vision-Paper-List

This repository contains all the papers accepted in top conference of computer vision, with convenience to search related papers.

Language: Python, License: MIT, Stargazers: 634, Issues: 0

TextBox

TextBox 2.0 is a text generation library with pre-trained language models

Language: Python, License: MIT, Stargazers: 1067, Issues: 0

awesome-audiovisual-learning

A curated list of audio-visual learning methods and datasets.

Stargazers: 214, Issues: 0

XPretrain

Multi-modality pre-training

Language: Python, License: NOASSERTION, Stargazers: 456, Issues: 0

awesome-embodied-vision

Reading list for research topics in embodied vision

License: MIT, Stargazers: 462, Issues: 0

CVPR2024-Paper-Code-Interpretation

cvpr2024/cvpr2023/cvpr2022/cvpr2021/cvpr2020/cvpr2019/cvpr2018/cvpr2017 collection of papers, code, interpretations, and livestreams, compiled by the 极市 team

Stargazers: 12376, Issues: 0

Transformer-in-Vision

Recent Transformer-based CV and related works.

Stargazers: 1307, Issues: 0