linyueqian

Yueqian Lin's starred repositories

VideoLISA

[NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos

600

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | 2938700

LLaVA-PruMerge

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models

Language: Python | License: Apache-2.0 | 9300

CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Language: Python | License: Apache-2.0 | 518700

Bay-CAT

[ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios

Language: Python | License: Apache-2.0 | 3600

NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Language: Python | License: BSD-3-Clause | 322300

LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Language: Python | License: Apache-2.0 | 208900

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | 2768100

Call-for-Reviewers

This project aims to collect the latest "call for reviewers" links from various top CS/ML/AI conferences/journals

License: MIT | 34300

LinFusion

Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"

Language: Python | License: Apache-2.0 | 21000

VCD

[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding

Language: Python | License: Apache-2.0 | 18300

SpeculativeDecodingPapers

📰 Must-read papers and blogs on Speculative Decoding ⚡️

License: Apache-2.0 | 37400

UTMOS22

UT-Sarulab MOS prediction system using SSL models

Language: Python | License: MIT | 16900

WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

Language: Python | License: MIT | 68100

Show-o

Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Language: Python | License: Apache-2.0 | 87800

Awesome-Unified-Multimodal-Models

📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.

14800

RoFormer_pytorch

RoFormer V1 & V2 pytorch

Language: Python | License: Apache-2.0 | 46700

llama-recipes

Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama for WhatsApp & Messenger.

Language: Jupyter Notebook | 1193400

VideoLISA

stanford_alpaca

LLaVA-PruMerge

CosyVoice

moshi

Bay-CAT

unified-io-2

NExT-GPT

LLaMA-Omni

vllm

Call-for-Reviewers

LinFusion

VCD

SpeculativeDecodingPapers

UTMOS22

WavTokenizer

Show-o

Awesome-Unified-Multimodal-Models

RoFormer_pytorch

llama-recipes

llama-stack

llama-models

modin

kedro

AI-System-School

LLaMA-Factory

homeassistant-smartrent

icloud_photos_downloader

lmms-finetune

Awesome-OOD-VLM