Deploying a Hugging Face Model on RunPod

RunPod is a cloud platform for deploying and running machine learning models. It provides a simple interface for deploying models and managing the compute resources they need. This example deploys a Hugging Face model on RunPod.

For more about RunPod, see the RunPod documentation.

This example covers two tasks:

- Package a Hugging Face model into a Docker container.
- Deploy it on RunPod.

Build and Test Locally

# Build Docker image
docker build -t embedding-app .

# Run the container locally
docker run -p 8000:8000 embedding-app

Access the API at http://localhost:8000/docs.
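
To confirm from the command line that the container is serving, you can poll the docs URL until it responds. The following is a minimal sketch, assuming `curl` is installed and the container from the previous step is running; `wait_for_api` is a hypothetical helper name, not part of this project.

```shell
# Hypothetical helper: poll a URL until it answers without an HTTP error,
# or give up after a number of attempts.
# Usage: wait_for_api URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_api() {
    url=$1
    attempts=${2:-10}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: treat HTTP 4xx/5xx as failure, -s: silent,
        # -o /dev/null: discard the body, --max-time: cap each request
        if curl -fs --max-time 5 -o /dev/null "$url"; then
            echo "API is up at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "API did not respond at $url" >&2
    return 1
}

# Example (requires the container to be running):
# wait_for_api http://localhost:8000/docs
```

Polling is useful here because loading a model at startup can take a while, so the first request may fail even though the container started correctly.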

Deploy on RunPod

Step 1: Push Docker Image to a Registry

Tag the image:

docker tag embedding-app YOUR_DOCKER_USERNAME/embedding-app

Push to Docker Hub:

docker push YOUR_DOCKER_USERNAME/embedding-app
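
The commands above omit a tag, so Docker uses `latest`. If you want each push to be traceable, you could tag with an explicit version instead. The following is a sketch only; `image_ref` and `push_versioned` are hypothetical helper names, and `v1` is a placeholder version.

```shell
# Hypothetical helper: build a full image reference "user/name:tag".
# The tag defaults to "latest", matching Docker's behavior when no tag is given.
image_ref() {
    printf '%s/%s:%s\n' "$1" "$2" "${3:-latest}"
}

# Hypothetical helper: tag the local image and push it under an explicit version.
# Usage: push_versioned DOCKER_USERNAME IMAGE_NAME TAG
push_versioned() {
    ref=$(image_ref "$1" "$2" "$3")
    docker tag "$2" "$ref" && docker push "$ref"
}

# Example (run `docker login` first):
# push_versioned YOUR_DOCKER_USERNAME embedding-app v1
```

With a versioned tag, the image name you enter in the RunPod container settings would include the same tag, for example `YOUR_DOCKER_USERNAME/embedding-app:v1`.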

Step 2: Create a RunPod Instance

- Log in to RunPod.
- Select a GPU and create a pod.
- In the container settings:
  - Specify your Docker image (YOUR_DOCKER_USERNAME/embedding-app).
  - Map the application port (e.g., 8000).

Step 3: Deploy and Test

- Launch the pod.
- Use the public endpoint provided by RunPod to test the API.
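
As a rough sketch of that test, the snippet below builds the endpoint URL and requests the docs page. It assumes RunPod's HTTP proxy URL follows the form `https://<POD_ID>-<PORT>.proxy.runpod.net`; treat that format as an assumption and confirm the actual URL in the pod's connection details. `pod_url` is a hypothetical helper name and `POD_ID` is a placeholder.

```shell
# Hypothetical helper: build the public URL for a pod's exposed HTTP port.
# ASSUMPTION: the proxy URL format below; verify it in the RunPod console.
# Usage: pod_url POD_ID [PORT] [PATH]
pod_url() {
    printf 'https://%s-%s.proxy.runpod.net%s\n' "$1" "${2:-8000}" "${3:-/docs}"
}

# Example: set POD_ID to the ID shown for your pod, then request the docs page.
# curl -fsS -o /dev/null "$(pod_url "$POD_ID" 8000 /docs)" && echo "endpoint reachable"
```

If the request fails, check that port 8000 is exposed in the pod's container settings and that the application has finished starting.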
