Mingcan Xiang (MitchellX)

Company: University of Massachusetts Amherst

Location: Amherst, MA, USA

Home Page: https://mitchellx.github.io/

Organizations
UTSASRG

Mingcan Xiang's starred repositories

glake

GLake: optimizing GPU memory management and IO transmission.

Language: C++ · License: Apache-2.0 · Stargazers: 301 · Issues: 0

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 1937 · Issues: 0

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ · License: MIT · Stargazers: 7052 · Issues: 0

Awesome-LLM-Inference

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

License: GPL-3.0 · Stargazers: 1632 · Issues: 0

flash-llm

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Language: Cuda · License: Apache-2.0 · Stargazers: 150 · Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python · License: BSD-3-Clause · Stargazers: 11407 · Issues: 0
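
For context, a minimal sketch of the library's functional entry point, flash_attn_func; this assumes a CUDA GPU, fp16 tensors, and the documented (batch, seqlen, nheads, headdim) layout:

```python
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 8, 64
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Exact attention computed tile-by-tile in SRAM, without materializing
# the full (seqlen x seqlen) score matrix.
out = flash_attn_func(q, k, v, causal=True)  # (batch, seqlen, nheads, headdim)
```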

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 20440 · Issues: 0
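
As a rough illustration, vLLM's offline batch API in a minimal sketch (the model name is just an example); the engine applies continuous batching and PagedAttention under the hood:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")          # example checkpoint
params = SamplingParams(temperature=0.8, max_tokens=64)

# Prompts are batched and scheduled automatically by the engine.
outputs = llm.generate(["The capital of France is"], params)
for o in outputs:
    print(o.outputs[0].text)
```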

autoLiterature

autoLiterature is a Python-based command-line tool for automatic literature management.

Language: Python · Stargazers: 340 · Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest papers and datasets on multimodal large language models, and their evaluation.

Stargazers: 9799 · Issues: 0

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python · License: Apache-2.0 · Stargazers: 1053 · Issues: 0

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Language: Jupyter Notebook · License: MIT · Stargazers: 8212 · Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python · License: Apache-2.0 · Stargazers: 127094 · Issues: 0
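
A minimal sketch of the library's pipeline API (model choice illustrative):

```python
from transformers import pipeline

# Downloads the checkpoint on first use and wires up tokenizer + model.
generator = pipeline("text-generation", model="gpt2")
print(generator("Hello, world", max_new_tokens=20)[0]["generated_text"])
```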

donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Language: Python · License: MIT · Stargazers: 5430 · Issues: 0
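
Donut is also usable through its Hugging Face integration; a hedged sketch, assuming a DocVQA-finetuned checkpoint from the model hub and a local "document.png" (both illustrative):

```python
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

ckpt = "naver-clova-ix/donut-base-finetuned-docvqa"  # example checkpoint
processor = DonutProcessor.from_pretrained(ckpt)
model = VisionEncoderDecoderModel.from_pretrained(ckpt)

image = Image.open("document.png").convert("RGB")     # hypothetical input file
pixel_values = processor(image, return_tensors="pt").pixel_values

# Donut is OCR-free: the task and question are encoded as a decoder prompt.
prompt = "<s_docvqa><s_question>What is the total?</s_question><s_answer>"
decoder_input_ids = processor.tokenizer(
    prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(pixel_values, decoder_input_ids=decoder_input_ids,
                         max_length=512)
print(processor.batch_decode(outputs)[0])
```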

text-to-text-transfer-transformer

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Language: Python · License: Apache-2.0 · Stargazers: 5951 · Issues: 0
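
The text-to-text framing is easy to see through the Hugging Face port of T5; a minimal sketch (checkpoint name illustrative):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is cast as text in, text out; the prefix names the task.
ids = tok("translate English to German: Hello, world", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```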

awesome-mixture-of-experts

A collection of AWESOME things about mixture-of-experts

Stargazers: 773 · Issues: 0

Awesome-Mixture-of-Experts-Papers

A curated reading list of research in Mixture-of-Experts (MoE).

License: Apache-2.0 · Stargazers: 477 · Issues: 0

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.

Language: Python · License: Apache-2.0 · Stargazers: 11820 · Issues: 0
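
To make the RNN-style inference concrete, a simplified, numerically naive sketch of the RWKV-4 "WKV" recurrence (the repo's actual kernel is numerically stabilized and CUDA-accelerated; decay w and bonus u are learned per-channel parameters in the real model):

```python
import torch

def wkv_recurrence(k, v, w, u):
    """k, v: (T, C) keys/values; w: (C,) per-channel decay; u: (C,) current-token bonus."""
    T, C = k.shape
    num = torch.zeros(C)  # running sum of exp(k_i) * v_i, decayed by exp(-w) each step
    den = torch.zeros(C)  # running sum of exp(k_i), decayed the same way
    out = []
    for t in range(T):
        # The current token receives an extra bonus u before joining the state,
        # so wkv_t is a decayed weighted average over all tokens seen so far.
        e_cur = torch.exp(u + k[t])
        out.append((num + e_cur * v[t]) / (den + e_cur))
        num = torch.exp(-w) * num + torch.exp(k[t]) * v[t]
        den = torch.exp(-w) * den + torch.exp(k[t])
    return torch.stack(out)  # (T, C), constant memory per step at inference
```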

flex-dm

Towards Flexible Multi-modal Document Models [Inoue+, CVPR 2023]

Language: Python · License: Apache-2.0 · Stargazers: 53 · Issues: 0

flops-counter.pytorch

FLOPs counter for convolutional networks in the PyTorch framework.

Language: Python · License: MIT · Stargazers: 2710 · Issues: 0
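
A minimal usage sketch via the package's documented entry point (model and input resolution are illustrative):

```python
import torchvision.models as models
from ptflops import get_model_complexity_info

net = models.resnet18()
# Counts multiply-accumulate operations and parameters for one forward pass.
macs, params = get_model_complexity_info(net, (3, 224, 224), as_strings=True,
                                         print_per_layer_stat=False)
print(f"MACs: {macs}, Params: {params}")
```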

M-FAC

Efficient reference implementations of the static & dynamic M-FAC algorithms (for pruning and optimization)

Language: Python · License: MIT · Stargazers: 16 · Issues: 0

rigl

End-to-end training of sparse deep neural networks with little-to-no performance loss.

Language: Python · License: Apache-2.0 · Stargazers: 314 · Issues: 0

snip

PyTorch implementation of the paper "SNIP: Single-shot Network Pruning based on Connection Sensitivity" by Lee et al.

Language: Python · License: MIT · Stargazers: 99 · Issues: 0
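
The core idea fits in a few lines; an independent sketch of SNIP's connection-sensitivity score (not this repo's exact code): on a single mini-batch, each weight's saliency is |∂L/∂w · w|, and the lowest-scoring connections are pruned before training starts.

```python
import torch
import torch.nn.functional as F

def snip_scores(model, x, y):
    """One mini-batch (x, y) -> per-weight saliency tensors."""
    weights = [p for p in model.parameters() if p.dim() > 1]  # prunable tensors
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, weights)
    # |g * w| approximates the loss change from zeroing each connection;
    # keep the globally top-k scoring connections, prune the rest.
    return [(g * w).abs() for g, w in zip(grads, weights)]
```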

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python · License: Apache-2.0 · Stargazers: 38115 · Issues: 0

STR

Soft Threshold Weight Reparameterization for Learnable Sparsity

Language: Python · License: Apache-2.0 · Stargazers: 84 · Issues: 0
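
A simplified sketch of the reparameterization, written independently of this repo: weights pass through sign(w) · ReLU(|w| − g(s)) with a learnable threshold parameter s, so sparsity is learned jointly with the weights (g here is a sigmoid; initialization and per-layer details are simplified):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STRLinear(nn.Linear):
    """Linear layer with a soft-threshold-reparameterized weight."""
    def __init__(self, in_features, out_features, s_init=-10.0):
        super().__init__(in_features, out_features)
        self.s = nn.Parameter(torch.tensor(s_init))  # learnable threshold logit

    def forward(self, x):
        thr = torch.sigmoid(self.s)
        # Weights with magnitude below the threshold are exactly zero,
        # so the learned threshold directly controls layer sparsity.
        w = torch.sign(self.weight) * F.relu(self.weight.abs() - thr)
        return F.linear(x, w, self.bias)
```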

Efficient-Deep-Learning

Collection of recent methods on (deep) neural network compression and acceleration.

License: MIT · Stargazers: 909 · Issues: 0

hydra

Code and checkpoints of compressed networks for the paper titled "HYDRA: Pruning Adversarially Robust Neural Networks" (NeurIPS 2020) (https://arxiv.org/abs/2002.10509).

Language: Python · Stargazers: 88 · Issues: 0