luome

litellm

Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

Language: Python | License: NOASSERTION | Stars: 11550

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ | License: MIT | Stars: 7794

llama.cpp

LLM inference in C/C++

Language: C++ | License: MIT | Stars: 63866

ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.

Language: Go | License: MIT | Stars: 86155

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stars: 7976

aimoneyhunter

The Ultimate Guide to Making Money with AI Side Hustles: Learn how to leverage AI for some cool side gigs and rake in some extra cash. Check out the English version for more insights.

Stars: 12788

llm-inference-benchmark

LLM Inference benchmark

Language: Python | License: MIT | Stars: 315

RRHF

[NIPS2023] RRHF & Wombat

Language: Python | Stars: 786

lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Bedrock / Azure / Mistral / DeepSeek), a knowledge base (file upload / knowledge management / RAG), multi-modal features (Vision/TTS), and a plugin system. One-click FREE deployment of your private ChatGPT/Claude application.

Language: TypeScript | License: NOASSERTION | Stars: 37346

Jie's starred repositories

mistral-inference

grok-1

flash-attention-minimal

algebraic-nnhw

peft

jsonformer

outlines

induced-rationales-markup-tokens

knnlm

LWM

minbpe

GPT-SoVITS

Bert-VITS2

human-eval

chroma

litellm

ml-ferret

GPTs

PowerInfer

llama.cpp

ollama

TensorRT-LLM

aimoneyhunter

llm-inference-benchmark

RRHF

lobe-chat

stanford_alpaca

mlx-examples

direct-preference-optimization

gpt-fast