Mingcan Xiang (MitchellX)

Company: University of Massachusetts Amherst

Location: Amherst, MA, USA

Home Page: https://mitchellx.github.io/

Organizations
UTSASRG

Mingcan Xiang's starred repositories

glake

GLake: optimizing GPU memory management and IO transmission.

Language: C++ · License: Apache-2.0 · Stargazers: 301 · Issues: 0

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 1937 · Issues: 0

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ · License: MIT · Stargazers: 7052 · Issues: 0

Awesome-LLM-Inference

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

License: GPL-3.0 · Stargazers: 1632 · Issues: 0

flash-llm

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Language: Cuda · License: Apache-2.0 · Stargazers: 150 · Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python · License: BSD-3-Clause · Stargazers: 11407 · Issues: 0
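
For context, a minimal sketch of the library's functional entry point, flash_attn_func; this assumes a CUDA GPU, fp16 tensors, and the documented (batch, seqlen, nheads, headdim) layout:

```python
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 8, 64
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Exact attention computed tile-by-tile in SRAM, without materializing
# the full (seqlen x seqlen) score matrix.
out = flash_attn_func(q, k, v, causal=True)  # (batch, seqlen, nheads, headdim)
```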

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 20440 · Issues: 0
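
As a rough illustration, vLLM's offline batch API in a minimal sketch (the model name is just an example); the engine applies continuous batching and PagedAttention under the hood:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")          # example checkpoint
params = SamplingParams(temperature=0.8, max_tokens=64)

# Prompts are batched and scheduled automatically by the engine.
outputs = llm.generate(["The capital of France is"], params)
for o in outputs:
    print(o.outputs[0].text)
```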

autoLiterature

autoLiterature is a Python-based command-line tool for automatic literature management.

Language: Python · Stargazers: 340 · Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest papers and datasets on multimodal large language models, and their evaluation.

Stargazers: 9799 · Issues: 0

mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Language: Python · License: Apache-2.0 · Stargazers: 1053 · Issues: 0

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Language: Jupyter Notebook · License: MIT · Stargazers: 8212 · Issues: 0

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python · License: Apache-2.0 · Stargazers: 127094 · Issues: 0
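
A minimal sketch of the library's pipeline API (model choice illustrative):

```python
from transformers import pipeline

# Downloads the checkpoint on first use and wires up tokenizer + model.
generator = pipeline("text-generation", model="gpt2")
print(generator("Hello, world", max_new_tokens=20)[0]["generated_text"])
```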

donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Language: Python · License: MIT · Stargazers: 5430 · Issues: 0
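
Donut is also usable through its Hugging Face integration; a hedged sketch, assuming a DocVQA-finetuned checkpoint from the model hub and a local "document.png" (both illustrative):

```python
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

ckpt = "naver-clova-ix/donut-base-finetuned-docvqa"  # example checkpoint
processor = DonutProcessor.from_pretrained(ckpt)
model = VisionEncoderDecoderModel.from_pretrained(ckpt)

image = Image.open("document.png").convert("RGB")     # hypothetical input file
pixel_values = processor(image, return_tensors="pt").pixel_values

# Donut is OCR-free: the task and question are encoded as a decoder prompt.
prompt = "<s_docvqa><s_question>What is the total?</s_question><s_answer>"
decoder_input_ids = processor.tokenizer(
    prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(pixel_values, decoder_input_ids=decoder_input_ids,
                         max_length=512)
print(processor.batch_decode(outputs)[0])
```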

text-to-text-transfer-transformer

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Language: Python · License: Apache-2.0 · Stargazers: 5951 · Issues: 0
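
The text-to-text framing is easy to see through the Hugging Face port of T5; a minimal sketch (checkpoint name illustrative):

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is cast as text in, text out; the prefix names the task.
ids = tok("translate English to German: Hello, world", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```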

awesome-mixture-of-experts

A collection of AWESOME things about mixture-of-experts

Stargazers: 773 · Issues: 0

Awesome-Mixture-of-Experts-Papers

A curated reading list of research in Mixture-of-Experts (MoE).

License: Apache-2.0 · Stargazers: 477 · Issues: 0

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.

Language: Python · License: Apache-2.0 · Stargazers: 11820 · Issues: 0
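
To make the RNN-style inference concrete, a simplified, numerically naive sketch of the RWKV-4 "WKV" recurrence (the repo's actual kernel is numerically stabilized and CUDA-accelerated; decay w and bonus u are learned per-channel parameters in the real model):

```python
import torch

def wkv_recurrence(k, v, w, u):
    """k, v: (T, C) keys/values; w: (C,) per-channel decay; u: (C,) current-token bonus."""
    T, C = k.shape
    num = torch.zeros(C)  # running sum of exp(k_i) * v_i, decayed by exp(-w) each step
    den = torch.zeros(C)  # running sum of exp(k_i), decayed the same way
    out = []
    for t in range(T):
        # The current token receives an extra bonus u before joining the state,
        # so wkv_t is a decayed weighted average over all tokens seen so far.
        e_cur = torch.exp(u + k[t])
        out.append((num + e_cur * v[t]) / (den + e_cur))
        num = torch.exp(-w) * num + torch.exp(k[t]) * v[t]
        den = torch.exp(-w) * den + torch.exp(k[t])
    return torch.stack(out)  # (T, C), constant memory per step at inference
```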

flex-dm

Towards Flexible Multi-modal Document Models [Inoue+, CVPR 2023]

Language: Python · License: Apache-2.0 · Stargazers: 53 · Issues: 0

flops-counter.pytorch

FLOPs counter for convolutional networks in the PyTorch framework.

Language: Python · License: MIT · Stargazers: 2710 · Issues: 0
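
A minimal usage sketch via the package's documented entry point (model and input resolution are illustrative):

```python
import torchvision.models as models
from ptflops import get_model_complexity_info

net = models.resnet18()
# Counts multiply-accumulate operations and parameters for one forward pass.
macs, params = get_model_complexity_info(net, (3, 224, 224), as_strings=True,
                                         print_per_layer_stat=False)
print(f"MACs: {macs}, Params: {params}")
```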

M-FAC

Efficient reference implementations of the static & dynamic M-FAC algorithms (for pruning and optimization)

Language: Python · License: MIT · Stargazers: 16 · Issues: 0

rigl

End-to-end training of sparse deep neural networks with little-to-no performance loss.

Language: Python · License: Apache-2.0 · Stargazers: 314 · Issues: 0

snip

PyTorch implementation of the paper "SNIP: Single-shot Network Pruning based on Connection Sensitivity" by Lee et al.

Language: Python · License: MIT · Stargazers: 99 · Issues: 0
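
The core idea fits in a few lines; an independent sketch of SNIP's connection-sensitivity score (not this repo's exact code): on a single mini-batch, each weight's saliency is |∂L/∂w · w|, and the lowest-scoring connections are pruned before training starts.

```python
import torch
import torch.nn.functional as F

def snip_scores(model, x, y):
    """One mini-batch (x, y) -> per-weight saliency tensors."""
    weights = [p for p in model.parameters() if p.dim() > 1]  # prunable tensors
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, weights)
    # |g * w| approximates the loss change from zeroing each connection;
    # keep the globally top-k scoring connections, prune the rest.
    return [(g * w).abs() for g, w in zip(grads, weights)]
```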

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python · License: Apache-2.0 · Stargazers: 38115 · Issues: 0

STR

Soft Threshold Weight Reparameterization for Learnable Sparsity

Language: Python · License: Apache-2.0 · Stargazers: 84 · Issues: 0
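
A simplified sketch of the reparameterization, written independently of this repo: weights pass through sign(w) · ReLU(|w| − g(s)) with a learnable threshold parameter s, so sparsity is learned jointly with the weights (g here is a sigmoid; initialization and per-layer details are simplified):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STRLinear(nn.Linear):
    """Linear layer with a soft-threshold-reparameterized weight."""
    def __init__(self, in_features, out_features, s_init=-10.0):
        super().__init__(in_features, out_features)
        self.s = nn.Parameter(torch.tensor(s_init))  # learnable threshold logit

    def forward(self, x):
        thr = torch.sigmoid(self.s)
        # Weights with magnitude below the threshold are exactly zero,
        # so the learned threshold directly controls layer sparsity.
        w = torch.sign(self.weight) * F.relu(self.weight.abs() - thr)
        return F.linear(x, w, self.bias)
```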

Efficient-Deep-Learning

Collection of recent methods on (deep) neural network compression and acceleration.

License: MIT · Stargazers: 909 · Issues: 0

hydra

Code and checkpoints of compressed networks for the paper titled "HYDRA: Pruning Adversarially Robust Neural Networks" (NeurIPS 2020) (https://arxiv.org/abs/2002.10509).

Language: Python · Stargazers: 88 · Issues: 0