hanrui1sensetime

Company: OpenMedLab

Location: Shanghai, China

hanrui1sensetime's starred repositories

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1945 | Issues: 0

llmc

This is the official PyTorch implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models", and also an efficient LLM compression tool with various advanced compression methods, supporting multiple inference backends.

Language: Python | License: Apache-2.0 | Stargazers: 104 | Issues: 0

llm_interview_note

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers.

Language: HTML | Stargazers: 1014 | Issues: 0

qserve

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Language: Python | License: Apache-2.0 | Stargazers: 278 | Issues: 0

Atom

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Language: Cuda | Stargazers: 195 | Issues: 0

GPTQ-for-PULSE

4-bit quantization of PULSE models using GPTQ

Language: Python | License: Apache-2.0 | Stargazers: 2 | Issues: 0

QUIK

Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference

Language: C++ | License: Apache-2.0 | Stargazers: 159 | Issues: 0

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 10940 | Issues: 0

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Language: Python | License: MIT | Stargazers: 1336 | Issues: 0

RETFound_MAE

RETFound - A foundation model for retinal images

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 60 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 18 | Issues: 0

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | License: Apache-2.0 | Stargazers: 1940 | Issues: 0

awesome-lm-system

Summary of system papers, frameworks, code, and tools for training or serving large models

License: Apache-2.0 | Stargazers: 56 | Issues: 0

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | License: Apache-2.0 | Stargazers: 8196 | Issues: 0
Language: C++ | License: Apache-2.0 | Stargazers: 119 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 20762 | Issues: 0

nndeploy

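nndeploy is an end-to-end model deployment framework. Built on multi-backend inference and directed acyclic graph (DAG) based model deployment, it aims to give users a cross-platform, easy-to-use, high-performance model deployment experience.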

Language: C++ | License: Apache-2.0 | Stargazers: 501 | Issues: 0

mlc-llm

Universal LLM Deployment Engine with ML Compilation

Language: Python | License: Apache-2.0 | Stargazers: 17480 | Issues: 0

OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Language: Python | License: MIT | Stargazers: 604 | Issues: 0

FlashAttention20Triton

Triton implementation of Flash Attention 2.0

Language: Python | License: MIT | Stargazers: 15 | Issues: 0

RPTQ-for-LLaMA

Efficient 3-bit/4-bit quantization of LLaMA models

Language: Python | Stargazers: 18 | Issues: 0

FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.

Language: Python | License: Apache-2.0 | Stargazers: 9049 | Issues: 0

Awesome-Efficient-LLM

A curated list for Efficient Large Language Models

Language: Python | Stargazers: 894 | Issues: 0

RPTQ4LLM

Reorder-based post-training quantization for large language models

Language: Python | License: MIT | Stargazers: 175 | Issues: 0
Stargazers: 107 | Issues: 0
Language: Python | Stargazers: 124 | Issues: 0

STU-Net

The largest pre-trained medical image segmentation model (1.4B parameters) based on the largest public dataset (>100k annotations) to date.

License: Apache-2.0 | Stargazers: 111 | Issues: 0

MIU-VL

Repository for the ICLR 2023 paper "Medical Image Understanding with Pretrained Vision Language Models: A Comprehensive Study".

Stargazers: 129 | Issues: 0

CITE

[MICCAI'23] Text-guided Foundation Model Adaptation for Pathological Image Classification

Stargazers: 144 | Issues: 0
Language: Python | Stargazers: 165 | Issues: 0