Rubeus streamlines API requests to 20+ LLMs. It provides a unified API signature for interacting with all LLMs, along with powerful LLM Gateway features like load balancing, fallbacks, retries, and more.
- Interoperability: Write once, run with any provider. Switch between __ models from __ providers seamlessly.
- Fallback Strategies: Don't let failures stop you. If one provider fails, Rubeus can automatically switch to another.
- Retry Strategies: Temporary issues shouldn't mean manual re-runs. Rubeus can automatically retry failed requests.
- Load Balancing: Distribute load effectively across multiple API keys or providers based on custom weights.
- Unified API Signature: If you've used OpenAI, you already know how to use Rubeus with any other provider.
npm install
npm run dev # To run locally
npm run deploy # To deploy to Cloudflare
Rubeus lets you switch between large language models from different providers without changing the shape of your request. The following example sends a request to `openai`, but you could change the provider name to `cohere`, `anthropic`, or others, and Rubeus will handle the provider-specific details.
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"provider": "openai",
"api_key": "<open-ai-api-key-here>"
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"model": "text-davinci-003",
"user": "jbu3470"
}
}'
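The same request can be built programmatically. The sketch below is a minimal illustration, assuming the local dev URL and the `config`/`params` body shape from the curl example above; `build_request` is a hypothetical helper, not part of Rubeus. It builds the request without sending it, and it omits `model` so that each provider falls back to its default.

```python
import json
import urllib.request

# Local dev URL taken from the curl example; adjust for a deployed gateway.
RUBEUS_URL = "http://127.0.0.1:8787/v1/complete"


def build_request(provider: str, api_key: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request in the shape of the curl example.

    Hypothetical helper for illustration only.
    """
    body = {
        "config": {"provider": provider, "api_key": api_key},
        "params": params,
    }
    return urllib.request.Request(
        RUBEUS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Only the provider name and key change; the params stay the same.
params = {
    "prompt": "What are the top 10 happiest countries in the world?",
    "max_tokens": 50,
}
openai_req = build_request("openai", "<open-ai-api-key-here>", params)
anthropic_req = build_request("anthropic", "<anthropic-api-key-here>", params)
```

With the gateway running locally, either request could be sent with `urllib.request.urlopen(openai_req)`.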
In case one provider fails, Rubeus is designed to automatically switch to another, ensuring uninterrupted service.
# Fallback to anthropic, if openai fails (This API will use the default text-davinci-003 and claude-v1 models)
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"mode": "fallback",
"options": [
{
"provider": "openai",
"api_key": "<open-ai-api-key-here>"
},
{
"provider": "anthropic",
"api_key": "<anthropic-api-key-here>"
}
]
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"user": "jbu3470"
}
}'
# Fallback to gpt-3.5-turbo when gpt-4 fails
curl --location 'http://127.0.0.1:8787/v1/chatComplete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"mode": "fallback",
"options": [
{
"provider": "openai",
"override_params": {"model": "gpt-4"},
"api_key": "<open-ai-api-key-here>"
},
{
"provider": "openai",
"override_params": {"model": "gpt-3.5-turbo"},
"api_key": "<open-ai-api-key-here>"
}
]
},
"params": {
"messages": [{"role": "user", "content": "What are the top 10 happiest countries in the world?"}],
"max_tokens": 50,
"user": "jbu3470"
}
}'
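To make the fallback behavior concrete, here is a minimal sketch of the idea: try each entry in `options` in order and return the first response that succeeds. This is an illustration only, not Rubeus's actual implementation; `call_with_fallback` and `fake_send` are hypothetical names, and `fake_send` stands in for a real provider call.

```python
from typing import Callable, Sequence


def call_with_fallback(options: Sequence[dict], send: Callable[[dict], str]) -> str:
    """Try each option in order and return the first successful response.

    `send` is a hypothetical callable that performs one provider call and
    raises an exception on failure.
    """
    errors = []
    for option in options:
        try:
            return send(option)
        except Exception as exc:  # any failure moves on to the next option
            errors.append(f"{option['provider']}: {exc}")
    raise RuntimeError("all options failed: " + "; ".join(errors))


# Stand-in sender that simulates an OpenAI outage.
def fake_send(option: dict) -> str:
    if option["provider"] == "openai":
        raise ConnectionError("simulated outage")
    return f"response from {option['provider']}"


options = [{"provider": "openai"}, {"provider": "anthropic"}]
result = call_with_fallback(options, fake_send)
```

Here `result` comes from the second option, because the first one raised. If every option fails, the sketch raises a single error that lists each failure.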
Rubeus has a built-in mechanism to retry failed requests, eliminating the need for manual re-runs.
# Add the retry configuration to enable exponential back-off retries
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"mode": "single",
"options": [{
"provider": "openai",
"retry": {
"attempts": 3,
"on_status_codes": [429,500,504,524]
},
"api_key": "<open-ai-api-key-here>"
}]
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"model": "text-davinci-003",
"user": "jbu3470"
}
}'
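The retry settings above can be read as an exponential back-off loop. The sketch below illustrates that idea under two assumptions that the source does not confirm: `attempts` counts retries after the initial call, and the delay starts at a hypothetical `base_delay` and doubles each time. It is not Rubeus's implementation.

```python
import time
from typing import Callable, Collection, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    attempts: int = 3,
    on_status_codes: Collection[int] = (429, 500, 504, 524),
    base_delay: float = 0.5,  # hypothetical starting delay, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send`, retrying with doubling delays on the listed status codes.

    Assumption: `attempts` is the number of retries after the first call.
    """
    status, body = send()
    for retry_number in range(attempts):
        if status not in on_status_codes:
            break  # success, or an error we should not retry
        sleep(base_delay * 2 ** retry_number)
        status, body = send()
    return status, body


# Simulated responses: two retryable failures, then success.
responses = iter([(429, "rate limited"), (500, "server error"), (200, "ok")])
delays = []  # record delays instead of actually sleeping
status, body = call_with_retry(lambda: next(responses), sleep=delays.append)
```

Passing `sleep=delays.append` keeps the example instant while still showing which delays would have been used.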
Manage your workload effectively with Rubeus's custom weight-based distribution across multiple API keys or providers.
# Load balance 50-50 between gpt-3.5-turbo and claude-v1
curl --location 'http://127.0.0.1:8787/v1/chatComplete' \
--header 'Content-Type: application/json' \
--data '{
"config": {
"mode": "loadbalance",
"options": [{
"provider": "openai",
"weight": 0.5,
"override_params": { "model": "gpt-3.5-turbo" },
"api_key": "<open-ai-api-key-here>"
}, {
"provider": "anthropic",
"weight": 0.5,
"override_params": { "model": "claude-v1" },
"api_key": "<anthropic-api-key-here>"
}]
},
"params": {
"messages": [{"role": "user","content":"What are the top 10 happiest countries in the world?"}],
"max_tokens": 50,
"user": "jbu3470"
}
}'
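One common way to implement weight-based distribution is weighted random selection: each request picks an option with probability proportional to its `weight`. The sketch below illustrates that technique using the option shape from the example above; whether Rubeus selects options this way internally is an assumption, and `pick_option` is a hypothetical helper.

```python
import random
from collections import Counter
from typing import Optional, Sequence


def pick_option(options: Sequence[dict], rng: Optional[random.Random] = None) -> dict:
    """Pick one option at random, with probability proportional to its weight."""
    chooser = rng if rng is not None else random
    weights = [option["weight"] for option in options]
    return chooser.choices(options, weights=weights, k=1)[0]


options = [
    {"provider": "openai", "weight": 0.5},
    {"provider": "anthropic", "weight": 0.5},
]

# Simulate many requests with a seeded generator to see the split.
rng = random.Random(0)
counts = Counter(pick_option(options, rng)["provider"] for _ in range(10_000))
```

Over many requests, `counts` should be close to an even split between the two providers. Individual requests are still random, so short runs can be uneven.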
If you're familiar with OpenAI's API, you'll find Rubeus's API easy to use due to its unified signature.
# OpenAI query
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"provider": "openai",
"api_key": "<open-ai-api-key-here>"
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"user": "jbu3470"
}
}'
# Anthropic Query
curl --location 'http://127.0.0.1:8787/v1/complete' \
--header 'Content-Type: application/json' \
--data-raw '{
"config": {
"provider": "anthropic",
"api_key": "<anthropic-api-key-here>"
},
"params": {
"prompt": "What are the top 10 happiest countries in the world?",
"max_tokens": 50,
"user": "jbu3470"
}
}'
Name | Description |
---|---|
Portkey.ai | Full Stack LLMOps |
- Support for more providers, including Google Bard and LocalAI.
- Enhanced load balancing features to optimize resource use across different models and providers.
- More robust fallback and retry strategies to further improve the reliability of requests.
- Increased customizability of the unified API signature to cater to more diverse use cases.
Participate in Roadmap discussions here.
The easiest way to contribute is to pick any issue with the `good first issue` tag. Read the Contributing guidelines here.
Bug Report? File here | Feature Request? File here
Rubeus is licensed under the MIT License. See the LICENSE file for more details.