ChrisDryden / litellm

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

Home Page:https://docs.litellm.ai/docs/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

πŸš… LiteLLM

Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, Cohere, TogetherAI, Azure, OpenAI, etc.]

Evaluate LLMs β†’ OpenAI-Compatible Server

PyPI Version CircleCI Y Combinator W23 Whatsapp Discord

LiteLLM manages

  • Translating inputs to the provider's completion and embedding endpoints
  • Guarantees consistent output, text responses will always be available at ['choices'][0]['message']['content']
  • Exception mapping - common exceptions across providers are mapped to the OpenAI exception types.

10/05/2023: LiteLLM is adopting Semantic Versioning for all commits. Learn more
10/16/2023: Self-hosted OpenAI-proxy server Learn more

Usage (Docs)

Open In Colab
pip install litellm
from litellm import completion
import os

## set ENV variables 
os.environ["OPENAI_API_KEY"] = "your-openai-key" 
os.environ["COHERE_API_KEY"] = "your-cohere-key" 

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)
print(response)

Streaming (Docs)

liteLLM supports streaming the model response back, pass stream=True to get a streaming iterator in response.
Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.)

from litellm import completion
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])

# claude 2
result = completion('claude-2', messages, stream=True)
for chunk in result:
  print(chunk['choices'][0]['delta'])

Reliability - Fallback LLMs

Never fail a request using LiteLLM

from litellm import completion
# if gpt-4 fails, retry the request with gpt-3.5-turbo->command-nightly->claude-instant-1
response = completion(model="gpt-4",messages=messages, fallbacks=["gpt-3.5-turbo", "command-nightly", "claude-instant-1"])

# if azure/gpt-4 fails, retry the request with fallback api_keys/api_base
response = completion(model="azure/gpt-4", messages=messages, api_key=api_key, fallbacks=[{"api_key": "good-key-1"}, {"api_key": "good-key-2", "api_base": "good-api-base-2"}])

Logging Observability - Log LLM Input/Output (Docs)

LiteLLM exposes pre defined callbacks to send data to LLMonitor, Langfuse, Helicone, Promptlayer, Traceloop, Slack

from litellm import completion

## set env variables for logging tools
os.environ["PROMPTLAYER_API_KEY"] = "your-promptlayer-key"
os.environ["LLMONITOR_APP_ID"] = "your-llmonitor-app-id"

os.environ["OPENAI_API_KEY"]

# set callbacks
litellm.success_callback = ["promptlayer", "llmonitor"] # log input/output to promptlayer, llmonitor, supabase

#openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi πŸ‘‹ - i'm openai"}])

Supported Provider (Docs)

Provider Completion Streaming Async Completion Async Streaming
openai βœ… βœ… βœ… βœ…
azure βœ… βœ… βœ… βœ…
aws - sagemaker βœ… βœ… βœ… βœ…
aws - bedrock βœ… βœ… βœ… βœ…
cohere βœ… βœ… βœ… βœ…
anthropic βœ… βœ… βœ… βœ…
huggingface βœ… βœ… βœ… βœ…
replicate βœ… βœ… βœ… βœ…
together_ai βœ… βœ… βœ… βœ…
openrouter βœ… βœ… βœ… βœ…
google - vertex_ai βœ… βœ… βœ… βœ…
google - palm βœ… βœ… βœ… βœ…
ai21 βœ… βœ… βœ… βœ…
baseten βœ… βœ… βœ… βœ…
vllm βœ… βœ… βœ… βœ…
nlp_cloud βœ… βœ… βœ… βœ…
aleph alpha βœ… βœ… βœ… βœ…
petals βœ… βœ… βœ… βœ…
ollama βœ… βœ… βœ… βœ…
deepinfra βœ… βœ… βœ… βœ…
perplexity-ai βœ… βœ… βœ… βœ…
anyscale βœ… βœ… βœ… βœ…

Read the Docs

Contributing

To contribute: Clone the repo locally -> Make a change -> Submit a PR with the change.

Here's how to modify the repo locally: Step 1: Clone the repo

git clone https://github.com/BerriAI/litellm.git

Step 2: Navigate into the project, and install dependencies:

cd litellm
poetry install

Step 3: Test your change:

cd litellm/tests # pwd: Documents/litellm/litellm/tests
pytest .

Step 4: Submit a PR with your changes! πŸš€

  • push your fork to your GitHub repo
  • submit a PR from there

Support / talk with founders

Why did we build this

  • Need for simplicity: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI and Cohere.

Contributors

About

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

https://docs.litellm.ai/docs/

License:MIT License


Languages

Language:Python 99.9%Language:Dockerfile 0.1%Language:Shell 0.0%