VincentDENG's starred repositories

Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Language: Python | License: MIT | Stargazers: 2871 | Issues: 0

Video-Bench

A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!

Language: Python | Stargazers: 108 | Issues: 0

MLVU

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

Language: Python | Stargazers: 89 | Issues: 0

LongVA

Long Context Transfer from Language to Vision

Language: Python | License: Apache-2.0 | Stargazers: 195 | Issues: 0

open_clip

An open source implementation of CLIP.

Language: Python | License: NOASSERTION | Stargazers: 9160 | Issues: 0

Kolors

Kolors Team

Language: Python | License: Apache-2.0 | Stargazers: 1824 | Issues: 0

HallusionBench

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Language: Python | License: BSD-3-Clause | Stargazers: 205 | Issues: 0

MathVista

MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts

Language: Jupyter Notebook | License: CC-BY-SA-4.0 | Stargazers: 203 | Issues: 0

MM-Vet

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)

Language: Python | License: Apache-2.0 | Stargazers: 204 | Issues: 0

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Language: Python | Stargazers: 2190 | Issues: 0

GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs

Language: Python | License: Apache-2.0 | Stargazers: 3600 | Issues: 0

Video-Streaming-Research-Papers

Research materials about multimedia network and system, including paper list, tools, etc.

Stargazers: 63 | Issues: 0

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | Stargazers: 54205 | Issues: 0

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python | License: MIT | Stargazers: 11091 | Issues: 0

Step-DPO

Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"

Language: Python | Stargazers: 143 | Issues: 0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | License: Apache-2.0 | Stargazers: 1501 | Issues: 0

LLM101n

LLM101n: Let's build a Storyteller

Stargazers: 15323 | Issues: 0

fms-fsdp

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.

Language: Python | License: Apache-2.0 | Stargazers: 118 | Issues: 0

Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Stargazers: 300 | Issues: 0

litdata

Transform datasets at scale. Optimize datasets for fast AI model training.

Language: Python | License: Apache-2.0 | Stargazers: 255 | Issues: 0

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python | License: Apache-2.0 | Stargazers: 1732 | Issues: 0

MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

Stargazers: 93 | Issues: 0

DynamiCrafter

[ECCV 2024] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

Language: Python | License: Apache-2.0 | Stargazers: 2091 | Issues: 0

minisora

MiniSora: A community that aims to explore the implementation path and future development direction of Sora.

Language: Python | License: Apache-2.0 | Stargazers: 1112 | Issues: 0

OpenMoE

A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

Language: Python | Stargazers: 1300 | Issues: 0

bolei_awesome_posters

CVPR and NeurIPS poster examples and templates. May we have an in-person poster session soon!

Stargazers: 1318 | Issues: 0

LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Language: Python | License: MIT | Stargazers: 1000 | Issues: 0

Language: Python | License: BSD-3-Clause | Stargazers: 91 | Issues: 0

MixEval

The official evaluation suite and dynamic data release for MixEval.

Language: Python | Stargazers: 167 | Issues: 0

Qwen2

Qwen2 is the large language model series developed by the Qwen team, Alibaba Cloud.

Language: Shell | Stargazers: 6082 | Issues: 0