- Deploy TensorFlow, PyTorch, ONNX, scikit-learn, and other models.
- Define preprocessing and postprocessing steps in Python.
- Configure APIs as realtime or batch.
- Deploy multiple models per API.
- Monitor API performance and track predictions.
- Update APIs with no downtime.
- Stream logs from APIs.
- Perform A/B tests.
- Test locally, scale on your AWS account.
- Autoscale to handle production traffic.
- Reduce cost with spot instances.
Define any realtime or batch inference pipeline as a simple Python API, regardless of framework.
```python
# predictor.py

from transformers import pipeline


class PythonPredictor:
    def __init__(self, config):
        self.model = pipeline(task="text-generation")

    def predict(self, payload):
        return self.model(payload["text"])[0]
```
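The same Predictor interface is where preprocessing and postprocessing live. A minimal sketch, assuming a hypothetical `max_length` config option and substituting a stand-in callable for the real model so the example is self-contained:

```python
# Illustrative sketch of pre/postprocessing inside a Predictor.
# The "max_length" option and the stand-in model are assumptions,
# not part of Cortex itself.

class PythonPredictor:
    def __init__(self, config):
        # config is the dictionary passed in from cortex.yaml
        self.max_length = config.get("max_length", 512)
        # stand-in for a real model, to keep the sketch runnable
        self.model = lambda text: [{"generated_text": text + " ..."}]

    def predict(self, payload):
        # preprocessing: validate and truncate the input text
        text = payload["text"].strip()[: self.max_length]
        # inference
        prediction = self.model(text)[0]
        # postprocessing: return only the field the client cares about
        return {"generated_text": prediction["generated_text"]}
```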
Configure autoscaling, monitoring, compute resources, update strategies, and more.
```yaml
# cortex.yaml

- name: text-generator
  predictor:
    path: predictor.py
  networking:
    api_gateway: public
  compute:
    gpu: 1
  autoscaling:
    min_replicas: 3
```
Handle traffic with request-based autoscaling. Minimize spend with spot instances and multi-model APIs.
```bash
$ cortex get text-generator

endpoint: https://example.com/text-generator

status   last-update   replicas   requests   latency
live     10h           10        100000     100ms
```
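Once the API is live, clients send it JSON over HTTP. A hedged sketch of querying the endpoint from Python using only the standard library; the URL is the example value shown by `cortex get`, and the `{"text": ...}` body matches the predictor above:

```python
# Sketch of a client for the deployed API (endpoint URL is an example value).
import json
from urllib import request


def build_request(endpoint, text):
    # construct the POST request the API expects: a JSON body with a "text" field
    body = json.dumps({"text": text}).encode()
    return request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
    )


def query(endpoint, text):
    # send the request and decode the JSON response
    with request.urlopen(build_request(endpoint, text)) as resp:
        return json.loads(resp.read())

# e.g. query("https://example.com/text-generator", "machine learning is")
```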
Integrate Cortex with any data science platform and CI/CD tooling, without changing your workflow.
```python
# predictor.py

import tensorflow
import torch
import transformers
import mlflow

...
```
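Because a predictor is plain Python, serving multiple models from one API reduces to routing inside `predict`. A minimal sketch of one way to do it, keyed on a `"model"` field in the payload; the model names and stand-in callables are illustrative assumptions:

```python
# Illustrative multi-model routing inside a single Predictor.
# The model names and stand-in callables are assumptions for the sketch.

class PythonPredictor:
    def __init__(self, config):
        # in practice these would be real loaded models
        self.models = {
            "sentiment": lambda text: {"label": "positive"},
            "summarize": lambda text: {"summary": text[:20]},
        }

    def predict(self, payload):
        # route the request to the model named in the payload
        model = self.models[payload["model"]]
        return model(payload["text"])
```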
Run Cortex on your AWS account (GCP support is coming soon), maintaining control over resource utilization and data access.
```yaml
# cluster.yaml

region: us-west-2
instance_type: g4dn.xlarge
spot: true
min_instances: 1
max_instances: 5
```
You don't need to bring your own cluster or containerize your models; Cortex automates your cloud infrastructure.
```bash
$ cortex cluster up

configuring networking ...
configuring logging ...
configuring metrics ...
configuring autoscaling ...

cortex is ready!
```
```bash
bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.20/get-cli.sh)"
```
See our installation guide, then deploy one of our examples or bring your own models to build realtime APIs and batch APIs.
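Batch APIs use a similar predictor class. The sketch below assumes a batch interface with a `job_spec` argument and a `batch_id` parameter; check the signatures against the documentation for your Cortex version, and note the stand-in model and in-memory results list are illustrative only:

```python
# Hedged sketch of a batch predictor; signatures are assumptions to verify
# against the Cortex batch API docs for your version.

class PythonPredictor:
    def __init__(self, config, job_spec):
        self.model = lambda text: text.upper()  # stand-in for a real model
        self.results = []

    def predict(self, payload, batch_id):
        # payload is a list of items belonging to this batch
        self.results.extend(self.model(item["text"]) for item in payload)

    def on_job_complete(self):
        # runs once after all batches finish, e.g. to persist results
        return self.results
```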