Robin's repositories

aphrodite-engine

PygmalionAI's large-scale inference engine

Language: Python · License: AGPL-3.0 · Stargazers: 0 · Issues: 0 · Issues: 0

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

cuda-toolkit

GitHub Action to install CUDA

Language: TypeScript · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0
Language: Nunjucks · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

GetOldTweets3

A Python 3 library and a corresponding command line utility for accessing old tweets

Language: Python · License: MIT · Stargazers: 0 · Issues: 1 · Issues: 0

langfuse

🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

Language: TypeScript · License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

llama-cpp-python

Python bindings for llama.cpp

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Language: C++ · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

text-embeddings-inference

A blazing fast inference solution for text embeddings models

Language: Rust · License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0