Tiantian Han's repositories
ColossalAI
Making large AI models cheaper, faster and more accessible
AISystem
AISystem covers the full AI system stack, including AI chips, AI compilers, and AI inference and training frameworks
bitsandbytes
8-bit CUDA functions for PyTorch
deepsparse
Sparsity-aware deep learning inference runtime for CPUs
Efficient-LLMs-Survey
Efficient Large Language Models: A Survey
float8_experimental
This repository contains the experimental PyTorch native float8 training UX
fp6_llm
Efficient GPU support for LLM inference with 6-bit quantization (FP6).
ggml
Tensor library for machine learning
lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
LLM-FP4
The official implementation of the EMNLP 2023 paper LLM-FP4
llm_interview_note
LLM interview questions and answers; standard LLM interview prep material
lm-evaluation-harness
A framework for few-shot evaluation of language models.
LSQuantization
A PyTorch implementation of Learned Step Size Quantization (LSQ) from ICLR 2020 (unofficial)
Megatron-LM
Ongoing research training transformer models at scale
microxcaling
PyTorch emulation library for Microscaling (MX)-compatible data formats
ml_dtypes
A stand-alone implementation of several NumPy dtype extensions used in machine learning.
onnx2torch
Convert ONNX models to PyTorch.
peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
qa-lora
Official PyTorch implementation of QA-LoRA
QAQ-KVCacheQuantization
QAQ: Quality Adaptive Quantization for LLM KV Cache
QuaRot
Code for QuaRot, end-to-end 4-bit inference for large language models.
serverchan-demo
Multi-language examples for calling the ServerChan (Server酱) API
tiny-asic-4bit-matrix-mul
Tiny matrix multiplication ASIC with 4-bit math
UltraEval
An open source framework for evaluating foundation models.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
what-is
Important concepts in numerical linear algebra and related areas