Justin Reppert's starred repositories
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
CTranslate2
Fast inference engine for Transformer models
datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
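Of the sketches datasketch provides, MinHash is the core building block: a set is summarized by the minimum value it attains under many seeded hash functions, and the fraction of matching slots between two signatures estimates their Jaccard similarity. Below is a minimal pure-Python sketch of that idea (not datasketch's API, which wraps this with optimized hashing and LSH indexes):

```python
import hashlib

def minhash_signature(tokens, num_perm=64):
    """MinHash sketch: for each of num_perm seeded hash functions,
    keep the minimum hash value seen over the token set."""
    sig = []
    for seed in range(num_perm):
        sig.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{t}".encode()).digest()[:8], "big")
            for t in tokens))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing slots is an unbiased estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = {"the", "quick", "brown", "fox"}
b = {"the", "quick", "brown", "dog"}
est = estimated_jaccard(minhash_signature(a), minhash_signature(b))
# true Jaccard is |a ∩ b| / |a ∪ b| = 3/5; est approximates it
```

The library's LSH structures then band these signatures so that similar sets collide in hash buckets, turning pairwise comparison into sublinear lookup.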
schedule_free
Schedule-Free Optimization in PyTorch
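The idea behind schedule-free optimization (Defazio et al.) is to drop the learning-rate schedule: gradients are evaluated at an interpolation between a fast iterate and a running average of the iterates, and the average is returned as the answer. A toy sketch on a scalar quadratic, assuming plain SGD as the base optimizer (the repository's actual API is a PyTorch optimizer class, not this function):

```python
def schedule_free_sgd(grad, z0, lr=0.1, beta=0.9, steps=100):
    """Toy schedule-free SGD on a scalar objective:
    z is the fast iterate, x the running average, and the gradient
    is taken at y, an interpolation of the two. No LR schedule."""
    z = x = z0
    for t in range(1, steps + 1):
        y = (1 - beta) * z + beta * x          # gradient evaluation point
        z = z - lr * grad(y)                   # base-optimizer step on z
        x = (1 - 1.0 / t) * x + (1.0 / t) * z  # equal-weight running average
    return x

# toy quadratic f(w) = (w - 3)^2 with gradient 2(w - 3); minimum at 3.0
x_star = schedule_free_sgd(lambda w: 2.0 * (w - 3.0), z0=0.0)
```

After 100 steps the returned average sits near the minimizer despite the constant step size, which is the property the PyTorch implementation exposes for real models.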
Long-Context
This repository contains code and tooling for the Abacus.AI LLM Context Expansion project, along with benchmark tasks and scripts for evaluating a model's information-retrieval capabilities under context expansion. Key experimental results and instructions for reproducing and building on them are also included.
tensorizer
Module, Model, and Tensor Serialization/Deserialization
scirepeval
SciRepEval benchmark training and evaluation scripts
nccl-tests
NVIDIA NCCL Tests for Distributed Training
learned-sparse-retrieval
Unified Learned Sparse Retrieval Framework
qdrant-lib
Extract core logic from qdrant and make it available as a library.
spark-on-k8s-images
Driver/Executor images for spark-operator