RunPod is a cloud platform for deploying and running machine learning models. It provides a simple interface for deploying models and managing the resources they need. In this example, we will deploy a Hugging Face model on RunPod.
For more about RunPod, see its documentation.
This example covers two tasks:
- Package a Hugging Face model into a Docker container.
- Deploy it on RunPod.
# Build Docker image
docker build -t embedding-app .
# Run the container locally
docker run -p 8000:8000 embedding-app
Access the API at http://localhost:8000/docs.
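
To confirm the container is serving before you push the image, a small script can probe the docs page. The following is a minimal sketch using only the Python standard library; the helper names `docs_url` and `is_reachable` are illustrative and not part of the example app.

```python
import urllib.request


def docs_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the URL of the interactive docs page exposed by the container."""
    return f"http://{host}:{port}/docs"


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on `url` succeeds with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, timeouts, and refused connections all derive from OSError.
        return False
```

Calling `is_reachable(docs_url())` while the container is running should return `True`; a `False` result usually means the container has not finished starting or the port mapping is wrong.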
- Tag the image:
  docker tag embedding-app YOUR_DOCKER_USERNAME/embedding-app
- Push to Docker Hub:
  docker push YOUR_DOCKER_USERNAME/embedding-app
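
The build, tag, and push commands above can be chained in a small script so the image name is written only once. This is a sketch under stated assumptions: the function names `docker_commands` and `run_all` are hypothetical, and the script defaults to a dry run that only prints the commands.

```python
import subprocess


def docker_commands(username: str, image: str = "embedding-app", context: str = "."):
    """Return the build, tag, and push commands as argument lists."""
    remote = f"{username}/{image}"
    return [
        ["docker", "build", "-t", image, context],
        ["docker", "tag", image, remote],
        ["docker", "push", remote],
    ]


def run_all(commands, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(command, check=True)
```

For example, `run_all(docker_commands("YOUR_DOCKER_USERNAME"), dry_run=False)` would run the three steps in order, assuming Docker is installed and you have already run `docker login`.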
- Log in to RunPod.
- Select a GPU and create a pod.
- In the container settings:
  - Specify your Docker image (YOUR_DOCKER_USERNAME/embedding-app).
  - Map the application port (e.g., 8000).
- Launch the pod.
- Use the public endpoint provided by RunPod to test the API.
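
As one way to test the deployed API, the sketch below builds a JSON POST request against the public endpoint using only the Python standard library. Several details are assumptions, since the app's routes are not shown here: the `/embed` route, the `{"texts": [...]}` payload shape, and the helper names are all hypothetical. Check the `/docs` page of your deployment for the real route and schema.

```python
import json
import urllib.request


def build_embed_request(base_url: str, texts, route: str = "/embed"):
    """Build a JSON POST request; the route and payload shape are assumed."""
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + route,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout: float = 30.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Pass the public endpoint shown in the RunPod console as `base_url`, then call `send(build_embed_request(base_url, ["hello world"]))`. The same helper works against `http://localhost:8000` for a local container.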