Lyu Han's starred repositories
Chinese-LLaMA-Alpaca
中文LLaMA&Alpaca大语言模型+本地CPU/GPU训练部署 (Chinese LLaMA & Alpaca LLMs)
chatgpt-mirai-qq-bot
🚀 一键部署!真正的 AI 聊天机器人!支持ChatGPT、文心一言、讯飞星火、Bing、Bard、ChatGLM、POE,多账号,人设调教,虚拟女仆、图片渲染、语音发送 | 支持 QQ、Telegram、Discord、微信 等平台
mistral-src
Reference implementation of Mistral AI 7B v0.1 model.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2,GPT-4,LLaMa2, Qwen,GLM, Claude, etc) over 100+ datasets.
DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
HuixiangDou
HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance
smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
torchsparse
[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.