Lyu Han (lvhan028)

Location: China

Lyu Han's starred repositories

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook | License: MIT | Stargazers: 93715 | Issues: 686 | Issues: 7751
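
A minimal sketch of composing a chain with LangChain's expression language (LCEL); it assumes the langchain-core and langchain-openai packages are installed and an OpenAI API key is configured, and the model name is only an example.

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# prompt -> chat model -> string parser, composed with the | operator
prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
chain = prompt | ChatOpenAI(model="gpt-3.5-turbo") | StrOutputParser()

print(chain.invoke({"text": "LangChain composes prompts, models, and parsers into chains."}))
```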

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 35149 | Issues: 343 | Issues: 2768
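
A hedged sketch of wrapping a PyTorch model with deepspeed.initialize; the model and config values are placeholders, and in practice the script is started with the deepspeed launcher so the distributed environment is set up.

```python
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)
ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # ZeRO stage 2: shard optimizer states and gradients
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# returns (engine, optimizer, dataloader, lr_scheduler); unused slots are None
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
# a training step then uses engine(x), engine.backward(loss), engine.step()
```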

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 28792 | Issues: 241 | Issues: 4911
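
A minimal offline-inference sketch with vLLM's Python API; the model name and sampling values are illustrative.

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # downloads weights and builds the engine
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.prompt, out.outputs[0].text)
```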

mlc-llm

Universal LLM Deployment Engine with ML Compilation

Language: Python | License: Apache-2.0 | Stargazers: 18988 | Issues: 173 | Issues: 1379

Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Language: Python | License: Apache-2.0 | Stargazers: 18299 | Issues: 185 | Issues: 731

chatgpt-mirai-qq-bot

🚀 One-click deployment! A real AI chatbot! Supports ChatGPT, Wenxin Yiyan (ERNIE Bot), iFlytek Spark, Bing, Bard, ChatGLM, and POE; multiple accounts, persona tuning, virtual-maid mode, image rendering, and voice messages | Supports QQ, Telegram, Discord, WeChat, and other platforms

Language: Python | License: AGPL-3.0 | Stargazers: 13149 | Issues: 72 | Issues: 1050

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 12936 | Issues: 100 | Issues: 516
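
A small sketch of running a single Mamba block from the mamba-ssm package; the dimensions are illustrative, and a CUDA device is assumed since the fused kernels are GPU-only.

```python
import torch
from mamba_ssm import Mamba

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim, device="cuda")

block = Mamba(d_model=dim, d_state=16, d_conv=4, expand=2).to("cuda")
y = block(x)  # output has the same shape as the input
assert y.shape == x.shape
```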

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python | License: MIT | Stargazers: 12167 | Issues: 168 | Issues: 236
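
A minimal sketch of typical tiktoken usage; the encoding name is one of the standard ones shipped with the library.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # BPE encoding used by several OpenAI models
tokens = enc.encode("tiktoken is a fast BPE tokeniser")

assert enc.decode(tokens) == "tiktoken is a fast BPE tokeniser"
print(len(tokens))  # number of tokens the text would consume
```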

MOSS

An open-source tool-augmented conversational language model from Fudan University

Language: Python | License: Apache-2.0 | Stargazers: 11741 | Issues: 123 | Issues: 352

mistral-src

Reference implementation of the Mistral AI 7B v0.1 model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 8772 | Issues: 116 | Issues: 115

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 8449 | Issues: 92 | Issues: 1891

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ | License: MIT | Stargazers: 7928 | Issues: 78 | Issues: 163

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6612 | Issues: 65 | Issues: 82

InternLM

Official release of InternLM2.5 base and chat models, with 1M context support.

Language: Python | License: Apache-2.0 | Stargazers: 6325 | Issues: 55 | Issues: 332

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 5658 | Issues: 56 | Issues: 571
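
A hedged sketch of SGLang's frontend DSL; it assumes an SGLang server has already been started separately and is listening on localhost:30000, and the question and max_tokens are illustrative.

```python
import sglang as sgl

@sgl.function
def qa(s, question):
    # build the conversation state, then ask the backend to generate the "answer" slot
    s += sgl.user(question)
    s += sgl.assistant(sgl.gen("answer", max_tokens=64))

# point the frontend at the running SGLang server
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = qa.run(question="What is RadixAttention?")
print(state["answer"])
```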

ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Language: Python | License: Apache-2.0 | Stargazers: 4791 | Issues: 49 | Issues: 291

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 4459 | Issues: 37 | Issues: 1432
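
A minimal sketch of LMDeploy's pipeline API for batched inference; the model name is only an example, and weights are pulled from the Hugging Face hub on first use.

```python
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")  # builds the inference engine for this model
responses = pipe(["Hi, please introduce yourself", "Shanghai is"])

for r in responses:
    print(r.text)
```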

opencompass

OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) across 100+ datasets.

Language: Python | License: Apache-2.0 | Stargazers: 3934 | Issues: 24 | Issues: 533

xtuner

An efficient, flexible, and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Language: Python | License: Apache-2.0 | Stargazers: 3859 | Issues: 34 | Issues: 526

fastllm

A pure C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.

Language: C++ | License: Apache-2.0 | Stargazers: 3301 | Issues: 41 | Issues: 364

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | License: Apache-2.0 | Stargazers: 2522 | Issues: 23 | Issues: 181

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2261 | Issues: 32 | Issues: 88

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language: Python | License: Apache-2.0 | Stargazers: 1869 | Issues: 41 | Issues: 302
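
A hedged sketch of MII's non-persistent pipeline API; the model name and generation arguments are examples, not a recommendation.

```python
import mii

pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")  # loads the model behind a DeepSpeed engine
responses = pipe(["DeepSpeed is"], max_new_tokens=64)

print(responses[0].generated_text)
```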

lagent

A lightweight framework for building LLM-based agents

Language: Python | License: Apache-2.0 | Stargazers: 1800 | Issues: 17 | Issues: 63

HuixiangDou

HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance

Language: Python | License: BSD-3-Clause | Stargazers: 1481 | Issues: 23 | Issues: 36

smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Language: Python | License: MIT | Stargazers: 1218 | Issues: 21 | Issues: 87

torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Language: Cuda | License: MIT | Stargazers: 1204 | Issues: 16 | Issues: 257

GPT4RoI

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

Language: Python | License: NOASSERTION | Stargazers: 503 | Issues: 8 | Issues: 47

OpenAOE

LLM Group Chat Framework: chat with multiple LLMs at the same time.

Language: TypeScript | License: Apache-2.0 | Stargazers: 242 | Issues: 6 | Issues: 8