Robin's repositories
aphrodite-engine
PygmalionAI's large-scale inference engine
Language: Python | License: AGPL-3.0
AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Language: Python | License: MIT
cuda-toolkit
GitHub Action to install CUDA
Language: TypeScript | License: MIT
Language: Nunjucks | License: MIT
GetOldTweets3
A Python 3 library and a corresponding command line utility for accessing old tweets
langfuse
🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
Language: TypeScript | License: NOASSERTION
llama-cpp-python
Python bindings for llama.cpp
Language: Python | License: MIT
onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Language: C++ | License: MIT
text-embeddings-inference
A blazing fast inference solution for text embeddings models
Language: Rust | License: NOASSERTION
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python | License: Apache-2.0