piano_123's starred repositories

rwkv.cpp

INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model

Language: C++ · License: MIT · Stars: 1132

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" context length, and free sentence embeddings.

Language: Python · License: Apache-2.0 · Stars: 12067
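What lets RWKV run as an RNN at inference time is its WKV recurrence: each output is an exponentially weighted average of past values, maintained as running state instead of attending over the whole history. Below is a minimal single-channel sketch of that recurrence; the decay `w` and bonus `u` are illustrative constants, not values from the repo.

```python
import math

def wkv_recurrence(ks, vs, w=0.5, u=0.3):
    """Simplified scalar WKV recurrence (one channel, illustrative constants).

    Each output is a weighted average of the values seen so far, keyed by k,
    carried forward as O(1) running state -- the RNN-style trick that avoids
    quadratic attention at inference time.
    """
    a, b = 0.0, 0.0          # running numerator / denominator state
    outputs = []
    for k, v in zip(ks, vs):
        # the current token gets a bonus weight u; past tokens have decayed
        num = a + math.exp(u + k) * v
        den = b + math.exp(u + k)
        outputs.append(num / den)
        # fold the current token into the state; older state decays by exp(-w)
        a = math.exp(-w) * a + math.exp(k) * v
        b = math.exp(-w) * b + math.exp(k)
    return outputs
```

Because the weights are positive and normalized, each output stays within the range of the values seen so far, and the first output is exactly the first value.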

gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Language: Python · License: NOASSERTION · Stars: 13035

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python · License: MIT · Stars: 4188

mlc-llm

Universal LLM Deployment Engine with ML Compilation

Language: Python · License: Apache-2.0 · Stars: 18090

Qwen

The official repository of Qwen (通义千问), the chat and pretrained large language model developed by Alibaba Cloud.

Language: Python · License: Apache-2.0 · Stars: 12885

Qwen-VL

The official repository of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model developed by Alibaba Cloud.

Language: Python · License: NOASSERTION · Stars: 4480

llama.cpp

LLM inference in C/C++

Language: C++ · License: MIT · Stars: 62928

stable-diffusion-webui

Stable Diffusion web UI

Language: Python · License: AGPL-3.0 · Stars: 137170

triton

Development repository for the Triton language and compiler

Language: C++ · License: MIT · Stars: 12146

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python · License: NOASSERTION · Stars: 8148

x-transformers

A simple but complete full-attention transformer with a set of promising experimental features from various papers

Language: Python · License: MIT · Stars: 4448
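The "full attention" these building blocks are variations on reduces to scaled dot-product attention. Here is a pure-Python sketch of that core operation (single head, no batching); it illustrates the mechanism only and is not x-transformers' actual API.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]   # subtract max for numerical stability
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Single-head scaled dot-product attention over lists of vectors."""
    d = len(Q[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        # output is the weight-averaged mix of value vectors
        out.append([sum(wj * v[i] for wj, v in zip(w, V)) for i in range(len(V[0]))])
    return out
```

Since the softmax weights sum to one, each output row is a convex combination of the value vectors, with more weight on values whose keys match the query.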

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python · License: Apache-2.0 · Stars: 3660

Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python · License: BSD-3-Clause · Stars: 2630

tensorrtllm_backend

The Triton TensorRT-LLM Backend

Language: Python · License: Apache-2.0 · Stars: 622

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ · License: Apache-2.0 · Stars: 7779

CUDALibrarySamples

CUDA Library Samples

Language: Cuda · License: NOASSERTION · Stars: 1451

GPTQ-for-LLaMa

4-bit quantization of LLaMA using GPTQ

Language: Python · License: Apache-2.0 · Stars: 2958

gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Language: Python · License: Apache-2.0 · Stars: 1813
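For orientation: GPTQ uses second-order (activation) information to choose how weights round. The naive baseline it improves on is plain round-to-nearest with a single scale per weight group, sketched below; this is an illustration of the baseline, not the paper's algorithm.

```python
def quantize_rtn(weights, bits=4):
    """Naive symmetric round-to-nearest quantization of one weight row.

    Maps floats onto 2**bits signed integer levels with a single scale.
    GPTQ improves on this by adjusting the rounding of each weight to
    compensate error on the layer's activations, which RTN ignores.
    """
    qmax = 2 ** (bits - 1) - 1                       # e.g. 7 for 4-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    deq = [qi * scale for qi in q]                   # reconstructed floats
    return q, scale, deq
```

Per-element reconstruction error is bounded by half the scale, which is exactly the slack GPTQ spends more cleverly.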

examples

A set of examples around MegEngine

Language: Python · License: Apache-2.0 · Stars: 29

llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python · License: MIT · Stars: 2201

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stars: 24103
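vLLM's memory efficiency comes from PagedAttention: the KV cache is split into fixed-size blocks, and each sequence keeps a block table mapping logical positions to physical blocks, much like OS virtual memory. A toy allocator sketch of the idea (class and method names are illustrative, not vLLM's API):

```python
class PagedKVCache:
    """Toy block allocator illustrating the PagedAttention idea:
    each sequence grows its KV cache one fixed-size block at a time,
    so memory is claimed on demand instead of reserved up front."""

    def __init__(self, num_blocks, block_size=16):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))    # physical block ids
        self.block_tables = {}                        # seq_id -> [physical ids]
        self.lengths = {}                             # seq_id -> tokens stored

    def append_token(self, seq_id):
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length == len(table) * self.block_size:    # current block is full
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted")
            table.append(self.free_blocks.pop())      # map a new physical block
        self.lengths[seq_id] = length + 1

    def free(self, seq_id):
        # finished sequences return their blocks to the pool immediately
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)
```

The payoff is that wasted memory per sequence is bounded by one partial block, instead of a worst-case-length contiguous reservation.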

ChatGLM-Efficient-Tuning

Fine-tuning ChatGLM-6B with PEFT (efficient ChatGLM fine-tuning based on PEFT)

Language: Python · License: Apache-2.0 · Stars: 3639

Baichuan2

A series of large language models developed by Baichuan Intelligent Technology

Language: Python · License: Apache-2.0 · Stars: 4048

LangChain-Chinese-Getting-Started-Guide

A Chinese-language getting-started tutorial for LangChain

Stars: 7183

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Language: Python · License: MIT · Stars: 5845

ppl.nn

A primitive library for neural networks

Language: C++ · License: Apache-2.0 · Stars: 1256