abhirockzz / kafka-cosmos-keda-vk-aks-sample

Real-time, horizontally scalable data processing application. Components used: Event Hubs, Cosmos DB, Azure Kubernetes Service (with KEDA and Virtual Kubelet), Azure Container Instances

Real-time, horizontally scalable data processing application, built the serverless way.

Solution overview

A simple order processing application.

Application components:

  • Producer app - a Go app that sends simulated orders to the Azure Event Hubs Kafka endpoint, using the Sarama Kafka client
  • Order processor - a Java app that receives order events from Event Hubs and stores them in Azure Cosmos DB, using the Kafka Java API and the Azure Cosmos DB Java SDK v4. Runs in Kubernetes as a Deployment
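For illustration, a simulated order event might carry a payload like the following (the field names here are hypothetical - the producer app defines the actual schema):

```json
{
  "orderId": "42",
  "customerId": "c-1001",
  "product": "books",
  "quantity": 2
}
```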

Infrastructure components (PaaS)

  • Azure Event Hubs Kafka: serves as the ingestion layer (receives orders)
  • Azure Cosmos DB (SQL API): used for persistence (storing processed order data)
  • Azure Kubernetes Service (AKS): managed Kubernetes offering

Kubernetes components:

  • KEDA: uses the Kafka scaler to auto-scale the Order Processor Deployment. The Kafka scaler scales based on the consumer group's lag for the topic
  • Virtual Kubelet: uses the AKS Virtual Nodes integration (built on Virtual Kubelet) to spin up Order Processor Pods on Azure Container Instances (ACI) rather than on AKS nodes. ACI provides a serverless container platform, and the Virtual Nodes integration makes it available to the cluster

Azure setup

Install KEDA

helm repo add kedacore https://kedacore.github.io/charts
helm repo update

kubectl create namespace keda
helm install keda kedacore/keda --namespace keda

This will install the KEDA Operator and the KEDA Metrics API server (as separate Deployments):

kubectl get deployment -n keda

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
keda-operator                     1/1     1            1           1h
keda-operator-metrics-apiserver   1/1     1            1           1h

To check the KEDA Operator logs:

kubectl logs -f $(kubectl get pod -l=app=keda-operator -o jsonpath='{.items[0].metadata.name}' -n keda) -n keda

Deploy the solution

We will deploy the consumer (order processor) app first, followed by the KEDA specific components. Start by cloning the repo:

git clone https://github.com/abhirockzz/kafka-cosmos-keda-vk-aks-sample
cd kafka-cosmos-keda-vk-aks-sample

Deploy order processing (consumer) app

Replace the placeholders in deploy/1-secret.yaml with base64-encoded values for the Event Hubs connection string and the Cosmos DB key.
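To generate those base64-encoded values, something along these lines works (the connection string and key shown below are placeholders - substitute your own):

```shell
# Base64-encode the Event Hubs connection string (placeholder shown)
printf '%s' 'Endpoint=sb://<NAMESPACE>.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=<KEY>' | base64

# Base64-encode the Cosmos DB key (placeholder shown)
printf '%s' '<COSMOSDB_PRIMARY_KEY>' | base64
```

printf '%s' avoids encoding a trailing newline into the value; with GNU coreutils you may also want base64 -w 0 so long strings are not line-wrapped.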

Create the Secret, followed by the consumer app Deployment:

kubectl apply -f deploy/1-secret.yaml
kubectl apply -f deploy/3-consumer.yaml

# check the consumer app pod
kubectl get pod -l=app=order-processor

The Pod will actually be created in Azure Container Instances (not AKS), thanks to the Virtual Nodes integration.
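That routing happens because the consumer Deployment targets the virtual node. The typical AKS Virtual Nodes scheduling constraints look like this (a sketch - check deploy/3-consumer.yaml for the values this sample actually uses):

```yaml
# Typical Pod scheduling constraints for AKS Virtual Nodes (illustrative --
# see deploy/3-consumer.yaml for the sample's actual manifest)
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/role: agent
        beta.kubernetes.io/os: linux
        type: virtual-kubelet
      tolerations:
        - key: virtual-kubelet.io/provider
          operator: Exists
        - key: azure.com/aci
          effect: NoSchedule
```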

Check the container in ACI, e.g. with az container list in the Azure CLI, or in the Azure portal.

Deploy KEDA components

Replace the Event Hubs, Cosmos DB and other details in the file deploy/4-kafka-scaledobject.yaml.

We need to create the KEDA TriggerAuthentication first and then the ScaledObject:

kubectl apply -f deploy/2-trigger-auth.yaml
kubectl apply -f deploy/4-kafka-scaledobject.yaml
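For reference, the two resources look roughly like this. This is a sketch following KEDA 1.x conventions (API group keda.k8s.io) - the names, Secret reference and trigger metadata below are placeholders, so check the actual manifests in deploy/2-trigger-auth.yaml and deploy/4-kafka-scaledobject.yaml for the exact fields:

```yaml
# Illustrative KEDA 1.x resources -- names and values are placeholders
apiVersion: keda.k8s.io/v1alpha1
kind: TriggerAuthentication
metadata:
  name: eventhub-kafka-trigger-auth
spec:
  secretTargetRef:
    # pulls the SASL credential from the Secret created via 1-secret.yaml
    - parameter: password
      name: app-config                  # placeholder Secret name
      key: eventhubs-connection-string
---
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    deploymentName: order-processor     # the consumer Deployment to scale
  triggers:
    - type: kafka
      metadata:
        brokerList: <EVENT_HUBS_NAMESPACE>.servicebus.windows.net:9093
        consumerGroup: $Default
        topic: <TOPIC_NAME>
        lagThreshold: "10"              # scale out when consumer lag exceeds this
      authenticationRef:
        name: eventhub-kafka-trigger-auth
```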

Check the HPA that KEDA creates - kubectl get hpa

Check the consumer app again (kubectl get deployment order-processor) - KEDA will have scaled it down to zero replicas (the promise of scale to zero), since there are no orders to process yet

Test the solution

Check pods auto-scaling

kubectl get deployment -l=app=order-processor -w

In another terminal, start the producer application - first replace the Event Hubs info in the producer.sh file.

To run the producer: ./producer.sh

Check ACI after a while - you will see container instances spinning up.

To scale down, stop the producer app. Wait for a while and the ACI containers will be terminated as KEDA scales the Deployment back down to zero.

Clean up

To uninstall KEDA:

helm uninstall -n keda keda
kubectl delete -f https://raw.githubusercontent.com/kedacore/keda/master/deploy/crds/keda.k8s.io_scaledobjects_crd.yaml
kubectl delete -f https://raw.githubusercontent.com/kedacore/keda/master/deploy/crds/keda.k8s.io_triggerauthentications_crd.yaml

To delete the Azure services, just delete the resource group:

az group delete --name <AZURE_RESOURCE_GROUP> --yes --no-wait
