Antti Puurula's starred repositories

linux

Linux kernel source tree

Language: C · License: NOASSERTION · Stargazers: 175,312 · Watchers: 7,959 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 23,646 · Watchers: 219 · Issues: 3,619
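
For orientation, a minimal offline-inference sketch using vLLM's Python API; the model name and sampling settings are illustrative, not taken from this listing:

```python
# Minimal vLLM offline-inference sketch (model name is illustrative).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")            # any HF causal LM that vLLM supports
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```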

LocalAI

:robot: The free, open-source OpenAI alternative. Self-hosted, community-driven, and local-first. A drop-in replacement for OpenAI that runs on consumer-grade hardware; no GPU required. Runs gguf, transformers, diffusers, and many more model architectures. Generates text, audio, video, and images, with voice-cloning capabilities.
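
Because LocalAI exposes an OpenAI-compatible endpoint, a typical client-side sketch just points the stock openai package at the local server; the port and model name below are assumptions about a default local setup:

```python
# Sketch: querying a LocalAI server through its OpenAI-compatible API.
# The URL, port, and model name are assumptions; adjust to your deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="gpt-4",  # LocalAI maps this name to whichever local model you configured
    messages=[{"role": "user", "content": "Hello from LocalAI!"}],
)
print(resp.choices[0].message.content)
```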

FinGPT

FinGPT: open-source financial large language models 🔥 We release the trained models on HuggingFace.

Language: Jupyter Notebook · License: MIT · Stargazers: 12,729 · Watchers: 243 · Issues: 104

qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Language: Jupyter Notebook · License: MIT · Stargazers: 9,745 · Watchers: 84 · Issues: 247
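
The core QLoRA recipe (a 4-bit NF4-quantized base model plus trainable LoRA adapters) can be sketched with the Hugging Face transformers/peft/bitsandbytes stack; this is not the repository's own training script, and the model name and hyperparameters are illustrative:

```python
# Sketch of the QLoRA setup: 4-bit NF4 quantized base model + trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_use_double_quant=True,         # double quantization of the quant constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                  # illustrative base model
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the LoRA adapters are trainable
model.print_trainable_parameters()
```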

WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

text-generation-inference

Large Language Model Text Generation Inference

Language: Python · License: Apache-2.0 · Stargazers: 8,472 · Watchers: 99 · Issues: 1,227
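
TGI runs as a standalone server; a client-side sketch with huggingface_hub is below, assuming a server has already been launched separately and listens on localhost:8080:

```python
# Sketch: calling a running text-generation-inference server.
# Assumes the server was launched separately and listens on localhost:8080.
from huggingface_hub import InferenceClient

client = InferenceClient("http://localhost:8080")
print(client.text_generation("What is deep learning?", max_new_tokens=64))
```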

axolotl

Go ahead and axolotl questions

Language: Python · License: Apache-2.0 · Stargazers: 6,856 · Watchers: 50 · Issues: 597

skypilot

SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.

Language: Python · License: Apache-2.0 · Stargazers: 6,324 · Watchers: 71 · Issues: 1,649

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python · License: Apache-2.0 · Stargazers: 3,513 · Watchers: 33 · Issues: 1,133

CTranslate2

Fast inference engine for Transformer models

Language: C++ · License: MIT · Stargazers: 3,090 · Watchers: 57 · Issues: 663
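
A generation sketch with CTranslate2's Python bindings, assuming the checkpoint was first converted with the ct2-transformers-converter tool; the model and paths are placeholders:

```python
# Sketch: generation with a CTranslate2-converted GPT-2 model.
# Assumes a prior conversion step, e.g.:
#   ct2-transformers-converter --model gpt2 --output_dir gpt2_ct2
import ctranslate2
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained("gpt2")
generator = ctranslate2.Generator("gpt2_ct2")   # converted model directory

tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, my name is"))
results = generator.generate_batch([tokens], max_length=32)
print(tokenizer.decode(results[0].sequences_ids[0]))
```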

optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Language: Python · License: Apache-2.0 · Stargazers: 2,342 · Watchers: 59 · Issues: 708
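
A small sketch of what Optimum's tooling looks like in practice, exporting a Transformers checkpoint to ONNX Runtime and running it through a standard pipeline; the model name is illustrative:

```python
# Sketch: exporting a Transformers model to ONNX Runtime via Optimum and
# running it through a normal pipeline. Model name is illustrative.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

clf = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(clf("Optimum makes ONNX export straightforward."))
```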

llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python · License: MIT · Stargazers: 2,169 · Watchers: 24 · Issues: 159

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python · License: Apache-2.0 · Stargazers: 2,118 · Watchers: 24 · Issues: 174

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language: Python · License: Apache-2.0 · Stargazers: 1,781 · Watchers: 41 · Issues: 288

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Language: Python · License: MIT · Stargazers: 1,489 · Watchers: 12 · Issues: 343
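
A quantization sketch following the AutoAWQ README as I recall it; the model path, output path, and quant_config keys are assumptions and should be checked against the project's documentation:

```python
# Sketch: AWQ 4-bit quantization with AutoAWQ (paths and config are assumptions).
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mistral-7B-v0.1"     # illustrative source model
quant_path = "mistral-7b-awq"                # output directory

quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

model.quantize(tokenizer, quant_config=quant_config)   # activation-aware calibration
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```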

evalplus

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023

Language: Python · License: Apache-2.0 · Stargazers: 1,071 · Watchers: 7 · Issues: 163

llmtools

Finetuning Large Language Models on One Consumer GPU in Under 4 Bits

attention_sinks

Extend existing LLMs way beyond the original training length with constant memory usage, without retraining

Language: Python · License: Apache-2.0 · Stargazers: 649 · Watchers: 12 · Issues: 29
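
The package is designed as a drop-in replacement for the transformers Auto classes; a loading sketch is below, with the sink-related keyword arguments named from memory of the README (treat them as assumptions):

```python
# Sketch: loading a model with attention sinks enabled. The sink-related kwargs
# are recalled from the project README and may differ; check the docs.
from attention_sinks import AutoModelForCausalLM
from transformers import AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"        # illustrative
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    attention_sink_size=4,                   # initial "sink" tokens kept permanently
    attention_sink_window_size=1020,         # sliding window of recent tokens
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```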

Wordbatch

Python library for distributed AI processing pipelines, using swappable scheduler backends.

Language: Python · License: GPL-2.0 · Stargazers: 413 · Watchers: 11 · Issues: 31

langstream

Build robust LLM applications with true composability 🔗

Language: Python · License: MIT · Stargazers: 401 · Watchers: 7 · Issues: 7

speculative-decoding

Explorations into some recent techniques surrounding speculative decoding

Language: Python · License: MIT · Stargazers: 176 · Watchers: 8 · Issues: 2
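
To make the underlying idea concrete, here is a toy, library-free sketch of the standard speculative-decoding acceptance rule (the draft model proposes, the target model verifies, and a rejected token is resampled from the positive residual); this is not code from the repository, and the stand-in distributions are purely illustrative:

```python
# Toy sketch of speculative decoding with stand-in "models" over an 8-token vocab.
import random

VOCAB = list(range(8))

def draft_dist(ctx):                  # small/fast draft model: uniform distribution
    return [1.0 / len(VOCAB)] * len(VOCAB)

def target_dist(ctx):                 # large/slow target model: favours token 0
    return [0.3] + [0.1] * (len(VOCAB) - 1)

def speculative_step(ctx, k=4):
    # 1) the draft model proposes k tokens autoregressively
    proposed, q = [], []
    for _ in range(k):
        dist = draft_dist(ctx + proposed)
        proposed.append(random.choices(VOCAB, weights=dist)[0])
        q.append(dist)
    # 2) the target model scores the same positions (one batched pass in practice)
    p = [target_dist(ctx + proposed[:i]) for i in range(k)]
    # 3) accept each proposal with prob min(1, p/q); on rejection, resample and stop
    out = []
    for i, tok in enumerate(proposed):
        if random.random() < min(1.0, p[i][tok] / q[i][tok]):
            out.append(tok)
        else:
            residual = [max(p[i][v] - q[i][v], 0.0) for v in VOCAB]
            z = sum(residual)
            out.append(random.choices(VOCAB, weights=residual if z > 0 else p[i])[0])
            break
    return out

print(speculative_step(ctx=[], k=4))
```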

text-generation-inference

IBM development fork of https://github.com/huggingface/text-generation-inference

Language: Python · License: Apache-2.0 · Stargazers: 47 · Watchers: 15 · Issues: 15

hf-hub-ctranslate2

Connecting Transformers on HuggingFace Hub with CTranslate2

Language: Python · License: MIT · Stargazers: 32 · Watchers: 2 · Issues: 12

Pytorch_Merge

Merge LLMs that are split into parts

Language: Python · License: GPL-3.0 · Stargazers: 23 · Watchers: 0 · Issues: 0

text-generation-inference

Large Language Model Text Generation Inference

Language: Python · License: Apache-2.0 · Stargazers: 9 · Watchers: 2 · Issues: 0

PromptGPT

Prompt Engineer Agent + GPT-4 = PromptGPT. Autonomous, self-supporting & objective-agnostic agents.

Language: Jupyter Notebook · License: MIT · Stargazers: 7 · Watchers: 1 · Issues: 0

open-text-generation-inference

Open Large Language Model Text Generation Inference - will remain Apache-2.0

Language: Python · License: Apache-2.0 · Stargazers: 4 · Watchers: 1 · Issues: 0