zzxxxl's starred repositories

MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Language: Python | License: MIT | Stargazers: 2739 | Issues: 0

refusal_direction

Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".

Language: Python | License: Apache-2.0 | Stargazers: 80 | Issues: 0
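The paper's core claim is that refusal behaviour is mediated by a single direction in the residual stream. Below is a minimal NumPy sketch of directional ablation (projecting that direction out of a hidden state); the vectors are random stand-ins, not the repository's actual weights or API.

```python
import numpy as np

def ablate_direction(hidden: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component of `hidden` along `direction` (both 1-D, model width)."""
    d_hat = direction / np.linalg.norm(direction)   # unit "refusal direction"
    return hidden - np.dot(hidden, d_hat) * d_hat   # keep only the orthogonal part

# Toy stand-ins: in the paper the direction is estimated from model activations.
h = np.random.randn(4096)   # residual-stream activation (hypothetical width)
d = np.random.randn(4096)   # refusal direction (hypothetical)
h_ablated = ablate_direction(h, d)
assert abs(np.dot(h_ablated, d / np.linalg.norm(d))) < 1e-6  # component is gone
```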

einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Language: Python | License: MIT | Stargazers: 8388 | Issues: 0
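A minimal usage sketch of the einops API with PyTorch tensors; the shapes are arbitrary examples.

```python
import torch
from einops import rearrange, reduce

x = torch.randn(8, 3, 32, 32)                # batch, channels, height, width

# Flatten each image into a feature vector: (b, c, h, w) -> (b, c*h*w)
flat = rearrange(x, 'b c h w -> b (c h w)')  # shape (8, 3072)

# Global average pooling written as a named reduction: (b, c, h, w) -> (b, c)
pooled = reduce(x, 'b c h w -> b c', 'mean')  # shape (8, 3)

# Split the spatial grid into 2x2 patches and flatten each patch into a token
patches = rearrange(x, 'b c (h p1) (w p2) -> b (h w) (c p1 p2)', p1=2, p2=2)
print(flat.shape, pooled.shape, patches.shape)  # (8, 3072) (8, 3) (8, 256, 12)
```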

mllm

Fast Multimodal LLM on Mobile Devices

Language: C++ | License: MIT | Stargazers: 419 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 4289 | Issues: 0
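A minimal sketch following the `pipeline` pattern from LMDeploy's documentation; the model identifier is only an example, and the snippet assumes the weights can be downloaded and fit in local memory.

```python
# Sketch only: assumes `lmdeploy` is installed and the example checkpoint is accessible.
from lmdeploy import pipeline

pipe = pipeline('internlm/internlm2-chat-7b')     # example model, not a requirement
responses = pipe(['Hello, introduce yourself.'])  # batch of prompts in, responses out
print(responses[0].text)
```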

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ | License: MIT | Stargazers: 7895 | Issues: 0

mnn-llm

An LLM deployment project based on MNN.

Language: C++ | License: Apache-2.0 | Stargazers: 1429 | Issues: 0

maid

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

Language: Dart | License: MIT | Stargazers: 1287 | Issues: 0

InferLLM

A lightweight LLM inference framework.

Language: C++ | License: Apache-2.0 | Stargazers: 682 | Issues: 0

llm-applications

A comprehensive guide to building RAG-based LLM applications for production.

Language: Jupyter Notebook | License: CC-BY-4.0 | Stargazers: 1677 | Issues: 0

TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library

Language: C++ | License: MIT | Stargazers: 708 | Issues: 0

mlc-llm

Universal LLM Deployment Engine with ML Compilation

Language: Python | License: Apache-2.0 | Stargazers: 18720 | Issues: 0

MergeLM

Codebase for Merging Language Models (ICML 2024)

Language: Python | Stargazers: 749 | Issues: 0
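MergeLM implements model-merging methods from the ICML 2024 paper. As a point of reference only, the sketch below shows the simplest possible baseline, plain weight averaging of matching parameters; it is not the repository's own algorithm, and the checkpoint paths are hypothetical.

```python
import torch

def average_state_dicts(state_dicts, weights=None):
    """Naive weight-space merge: a weighted average of parameters with the same name.
    MergeLM's methods (e.g. drop-and-rescale) are more involved than this baseline."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    return {
        name: sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
        for name in state_dicts[0]
    }

# Hypothetical usage with two checkpoints of the same architecture:
# merged = average_state_dicts([torch.load('math.pt'), torch.load('code.pt')])
```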

llama3-from-scratch

A llama3 implementation, one matrix multiplication at a time.

Language: Jupyter Notebook | License: MIT | Stargazers: 13176 | Issues: 0

micrograd

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Language: Jupyter Notebook | License: MIT | Stargazers: 10096 | Issues: 0
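A short example of the micrograd `Value` API, assuming the package is installed: build a tiny expression graph and backpropagate through it.

```python
from micrograd.engine import Value

# Build a small computation graph and run reverse-mode autodiff on it.
a = Value(2.0)
b = Value(-3.0)
c = a * b + a.relu()      # c = (2 * -3) + relu(2) = -4
c.backward()              # populates .grad on every node in the graph
print(c.data)             # -4.0
print(a.grad, b.grad)     # dc/da = b + 1 = -2.0 (since a > 0), dc/db = a = 2.0
```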

fastllm

A pure C++ LLM acceleration library for all platforms, with Python bindings; ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.

Language: C++ | License: Apache-2.0 | Stargazers: 3290 | Issues: 0

MiniCPM

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6936 | Issues: 0

Qwen2.5

Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.

Language: Shell | Stargazers: 8499 | Issues: 0
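Qwen2.5 checkpoints load through Hugging Face `transformers`; a minimal sketch is below. The 0.5B instruct checkpoint is just a small example, and `device_map="auto"` assumes the `accelerate` package is available.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"   # example checkpoint; larger variants exist
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # device_map needs `accelerate`
)

messages = [{"role": "user", "content": "Summarize speculative decoding in one line."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```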

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 5302 | Issues: 0

dpo_toxic

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.

Language: Jupyter Notebook | License: MIT | Stargazers: 45 | Issues: 0

ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Language: Go | License: MIT | Stargazers: 90941 | Issues: 0
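A minimal sketch using the official `ollama` Python client, assuming the Ollama server is running locally and the model has already been pulled (e.g. `ollama pull llama3.2`).

```python
import ollama

response = ollama.chat(
    model="llama3.2",   # any locally pulled model tag works here
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```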

MambaOut

MambaOut: Do We Really Need Mamba for Vision?

Language: Python | License: Apache-2.0 | Stargazers: 1975 | Issues: 0

FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

Language: C++ | License: Apache-2.0 | Stargazers: 1656 | Issues: 0

LLMSpeculativeSampling

Fast inference from large language models via speculative decoding.

Language: Python | License: Apache-2.0 | Stargazers: 521 | Issues: 0
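Speculative decoding drafts several tokens with a small model and verifies them against the large model's distributions. The toy NumPy sketch below illustrates the accept/reject scheme with stand-in distributions rather than real models; it is not the repository's code.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 16

def toy_dist(seq, temperature):
    """Stand-in for a language model: a deterministic softmax over a toy vocabulary."""
    logits = np.cos(np.arange(VOCAB) * (len(seq) + 1)) / temperature
    e = np.exp(logits - logits.max())
    return e / e.sum()

target_prob = lambda seq: toy_dist(seq, 1.0)   # "large" model p
draft_prob = lambda seq: toy_dist(seq, 2.0)    # "small" draft model q

def speculative_round(prefix, gamma=4):
    """One round: draft gamma tokens with q, then verify them with p."""
    seq, drafted, q = list(prefix), [], []
    for _ in range(gamma):                     # 1) cheap autoregressive drafting
        dist = draft_prob(seq)
        tok = rng.choice(VOCAB, p=dist)
        drafted.append(tok); q.append(dist); seq.append(tok)

    # 2) target distributions at every drafted position (one parallel pass in practice)
    p = [target_prob(list(prefix) + drafted[:i]) for i in range(gamma + 1)]

    out = []
    for i, tok in enumerate(drafted):          # 3) accept each token with prob min(1, p/q)
        if rng.random() < min(1.0, p[i][tok] / q[i][tok]):
            out.append(tok)
        else:                                  # rejected: resample from max(0, p - q)
            residual = np.maximum(p[i] - q[i], 0.0)
            out.append(rng.choice(VOCAB, p=residual / residual.sum()))
            return out
    out.append(rng.choice(VOCAB, p=p[gamma]))  # 4) bonus token when all were accepted
    return out

print(speculative_round([1, 2, 3]))
```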

gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Language: Python | License: Apache-2.0 | Stargazers: 1884 | Issues: 0

gpt-fast

Simple and efficient PyTorch-native transformer text generation in under 1000 lines of Python.

Language: Python | License: BSD-3-Clause | Stargazers: 5549 | Issues: 0

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | License: Apache-2.0 | Stargazers: 2313 | Issues: 0