Justin Reppert's starred repositories
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
CTranslate2
Fast inference engine for Transformer models
datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
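Of the sketches datasketch provides, MinHash is the core building block: a set is summarized by the minimum value it attains under many seeded hash functions, and the fraction of matching slots between two signatures estimates their Jaccard similarity. Below is a minimal pure-Python sketch of that idea (not datasketch's API, which wraps this with optimized hashing and LSH indexes):

```python
import hashlib

def minhash_signature(tokens, num_perm=64):
    """MinHash sketch: for each of num_perm seeded hash functions,
    keep the minimum hash value seen over the token set."""
    sig = []
    for seed in range(num_perm):
        sig.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{t}".encode()).digest()[:8], "big")
            for t in tokens))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing slots is an unbiased estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = {"the", "quick", "brown", "fox"}
b = {"the", "quick", "brown", "dog"}
est = estimated_jaccard(minhash_signature(a), minhash_signature(b))
# true Jaccard is |a ∩ b| / |a ∪ b| = 3/5; est approximates it
```

The library's LSH structures then band these signatures so that similar sets collide in hash buckets, turning pairwise comparison into sublinear lookup.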
schedule_free
Schedule-Free Optimization in PyTorch
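The idea behind schedule-free optimization (Defazio et al.) is to drop the learning-rate schedule: gradients are evaluated at an interpolation between a fast iterate and a running average of the iterates, and the average is returned as the answer. A toy sketch on a scalar quadratic, assuming plain SGD as the base optimizer (the repository's actual API is a PyTorch optimizer class, not this function):

```python
def schedule_free_sgd(grad, z0, lr=0.1, beta=0.9, steps=100):
    """Toy schedule-free SGD on a scalar objective:
    z is the fast iterate, x the running average, and the gradient
    is taken at y, an interpolation of the two. No LR schedule."""
    z = x = z0
    for t in range(1, steps + 1):
        y = (1 - beta) * z + beta * x          # gradient evaluation point
        z = z - lr * grad(y)                   # base-optimizer step on z
        x = (1 - 1.0 / t) * x + (1.0 / t) * z  # equal-weight running average
    return x

# toy quadratic f(w) = (w - 3)^2 with gradient 2(w - 3); minimum at 3.0
x_star = schedule_free_sgd(lambda w: 2.0 * (w - 3.0), z0=0.0)
```

After 100 steps the returned average sits near the minimizer despite the constant step size, which is the property the PyTorch implementation exposes for real models.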
Long-Context
This repository contains code and tooling for the Abacus.AI LLM Context Expansion project, along with benchmark tasks and scripts for evaluating a model's information-retrieval capabilities under context expansion. Key experimental results and instructions for reproducing and building on them are also included.
tensorizer
Module, Model, and Tensor Serialization/Deserialization
scirepeval
SciRepEval benchmark training and evaluation scripts
nccl-tests
NVIDIA NCCL Tests for Distributed Training
learned-sparse-retrieval
Unified Learned Sparse Retrieval Framework
qdrant-lib
Extract core logic from qdrant and make it available as a library.
spark-on-k8s-images
Driver/Executor images for spark-operator