Rongjie Yi's starred repositories

SpeculativeDecodingPapers

📰 Must-read papers and blogs on Speculative Decoding ⚡️

License: Apache-2.0 | Stargazers: 280 | Issues: 0

awesome-mixture-of-experts

A collection of AWESOME things about mixture-of-experts

Stargazers: 859 | Issues: 0

Awesome-LLM-Inference

📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

License: GPL-3.0 | Stargazers: 2041 | Issues: 0

once-for-all

[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment

Language: Python | License: MIT | Stargazers: 1854 | Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: Apache-2.0 | Stargazers: 10946 | Issues: 0

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 11855 | Issues: 0

MobiLlama

MobiLlama: Small Language Model tailored for edge devices

Language: Python | License: Apache-2.0 | Stargazers: 565 | Issues: 0

llm_interview_note

Notes on the knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

Language: HTML | Stargazers: 1612 | Issues: 0

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 5186 | Issues: 0

FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

Language: C++ | License: Apache-2.0 | Stargazers: 1605 | Issues: 0

AIOS

AIOS: LLM Agent Operating System

Language: Python | License: MIT | Stargazers: 3058 | Issues: 0

weblinx

WebLINX is a benchmark for building web navigation agents with conversational capabilities

Language: Python | License: Apache-2.0 | Stargazers: 103 | Issues: 0

Mind2Web

[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web"

Language: Jupyter Notebook | License: MIT | Stargazers: 621 | Issues: 0

Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Language: Python | License: MIT | Stargazers: 2892 | Issues: 0

Awesome-Diffusion-Model-Based-Image-Editing-Methods

Diffusion Model-Based Image Editing: A Survey (arXiv)

License: MIT | Stargazers: 371 | Issues: 0

smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Language: Python | License: MIT | Stargazers: 1115 | Issues: 0

MobileVLM

Strong and Open Vision Language Assistant for Mobile Devices

Language: Python | License: Apache-2.0 | Stargazers: 898 | Issues: 0

stable-diffusion.cpp

Stable Diffusion in pure C/C++

Language: C++ | License: MIT | Stargazers: 2942 | Issues: 0

Efficient_Foundation_Model_Survey

Survey Paper List - Efficient LLM and Foundation Models

Stargazers: 176 | Issues: 0

Personal_LLM_Agents_Survey

Paper list for Personal LLM Agents

Stargazers: 280 | Issues: 0

Awesome-Quantization-Papers

List of papers related to neural network quantization in recent AI conferences and journals.

License: MIT | Stargazers: 373 | Issues: 0

MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Language: C++ | Stargazers: 8482 | Issues: 0

wenet_mnn

Speech recognition model converted from PyTorch to ONNX to MNN, with deployment implemented in C++

Language: Python | Stargazers: 34 | Issues: 0

rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

Language: C++ | License: Apache-2.0 | Stargazers: 456 | Issues: 0

crabml

A fast cross-platform AI inference engine 🤖 using Rust 🦀 and WebGPU 🎮

Language: Rust | License: Apache-2.0 | Stargazers: 382 | Issues: 0

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | License: Apache-2.0 | Stargazers: 1576 | Issues: 0

mllm

Fast Multimodal LLM on Mobile Devices

Language: C++ | License: MIT | Stargazers: 279 | Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Stargazers: 10789 | Issues: 0

chroma

the AI-native open-source embedding database

Language: Rust | License: Apache-2.0 | Stargazers: 13743 | Issues: 0

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ | License: MIT | Stargazers: 7676 | Issues: 0