barnabas

Kafka as a Service with Docker and Kubernetes/OpenShift

Kafka as a Service

Barnabas provides a way to run an Apache Kafka cluster on Kubernetes or OpenShift in various deployment configurations.

Kafka StatefulSets

This deployment uses the StatefulSets (previously known as "PetSets") feature of Kubernetes/OpenShift. With StatefulSets, the pods receive a unique name and network identity, which makes it easier to identify the individual Kafka broker pods and set their identity (broker ID). Each Kafka broker pod uses its own PersistentVolume. The PersistentVolume is acquired using a PersistentVolumeClaim, which makes the deployment independent of the actual type of the PersistentVolume. For example, it can use HostPath volumes on Minikube or Amazon EBS volumes in Amazon AWS deployments without any changes in the YAML files.
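
As a rough sketch of the mechanism (the resource names, image, and sizes below are placeholders, not taken from the project's YAML), a StatefulSet with a volumeClaimTemplates section gives every replica its own claim, named data-demo-kafka-0, data-demo-kafka-1 and so on:

     # Minimal illustration of per-pod storage via volumeClaimTemplates;
     # all names and the image are placeholders, not Barnabas resources.
     kubectl apply -f - <<'EOF'
     apiVersion: apps/v1
     kind: StatefulSet
     metadata:
       name: demo-kafka
     spec:
       serviceName: demo-kafka-headless
       replicas: 3
       selector:
         matchLabels:
           app: demo-kafka
       template:
         metadata:
           labels:
             app: demo-kafka
         spec:
           containers:
             - name: kafka
               image: my-registry/kafka:latest   # placeholder image
               volumeMounts:
                 - name: data
                   mountPath: /var/lib/kafka
       volumeClaimTemplates:
         - metadata:
             name: data
           spec:
             accessModes: ["ReadWriteOnce"]
             resources:
               requests:
                 storage: 1Gi
     EOF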

Note that this deployment uses both regular and headless services:

  • regular services can be used as bootstrap servers for Kafka clients;
  • headless services are needed to have DNS resolve the pods' IP addresses directly.
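
A quick illustration of the difference, assuming the service names used by this deployment ("kafka" and "kafka-headless") and the default broker port 9092:

     # Clients bootstrap through the regular service; any broker answers:
     bin/kafka-console-producer.sh --broker-list kafka:9092 --topic my-topic

     # The headless service resolves to the individual broker pods, e.g.
     # kafka-0.kafka-headless.<namespace>.svc.cluster.local:
     nslookup kafka-headless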

This deployment is available under the kafka-statefulsets folder and provides the following artifacts:

  • Dockerfile : for building a Docker image with Kafka and Zookeeper already installed
  • config : configuration file templates for running Zookeeper
  • scripts : scripts for starting up Kafka and Zookeeper servers
  • resources : provides all YAML configuration files for setting up volumes, services and deployments

Deploying to OpenShift

  1. With the OpenShift client tools installed and having logged into your cluster, create the provided "barnabas" template by running

     oc create -f kafka-statefulsets/resources/openshift-template.yaml
    

    in your terminal. This template provides the "zookeeper" StatefulSet with a single replica, the "kafka" StatefulSet with 3 replicas, and the "zookeeper", "zookeeper-headless", "kafka" and "kafka-headless" Services.

  2. Create a new app using the "barnabas" template:

     oc new-app barnabas
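
The brokers can take a little while to start; you can watch the pods come up with the standard client tools:

     oc get pods -w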
    

Deploying to Kubernetes

  1. If you don't have a cluster running, start one (e.g. by executing minikube start)

  2. If your cluster doesn't have any default storage class, create the persistent volumes manually (a hand-written example of such a volume is sketched after these steps)

     kubectl apply -f kafka-statefulsets/resources/cluster-volumes.yaml
    
  3. Create the services by running:

     kubectl apply -f kafka-statefulsets/resources/zookeeper.yaml && \
     kubectl apply -f kafka-statefulsets/resources/zookeeper-headless-service.yaml && \
     kubectl apply -f kafka-statefulsets/resources/zookeeper-service.yaml && \
     kubectl apply -f kafka-statefulsets/resources/kafka.yaml && \
     kubectl apply -f kafka-statefulsets/resources/kafka-headless-service.yaml && \
     kubectl apply -f kafka-statefulsets/resources/kafka-service.yaml
    
  4. You can then verify that the services started using

     kubectl describe all
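
For step 2, a hand-written volume might look like the following minimal HostPath PersistentVolume for a Minikube-style cluster; the name, size, and path are illustrative, not the contents of cluster-volumes.yaml:

     # One statically provisioned PersistentVolume is needed per claim;
     # this sketches a single one.
     kubectl apply -f - <<'EOF'
     apiVersion: v1
     kind: PersistentVolume
     metadata:
       name: pv0001
     spec:
       capacity:
         storage: 1Gi
       accessModes:
         - ReadWriteOnce
       hostPath:
         path: /tmp/pv0001
     EOF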
    

Kafka in-memory

The Kafka in-memory deployment is intended only for development and testing purposes, not for production. It is designed the same way as the Kafka StatefulSets deployment. The only difference is that for storing broker information (Zookeeper side) and topics/partitions (Kafka side), an emptyDir is used instead of PersistentVolumeClaims. This means that its content is strictly tied to the pod life cycle: it is deleted when the pod goes down. This makes the in-memory deployment well-suited to development and testing because you don't have to provide persistent volumes.
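
The practical difference is easy to demonstrate with a generic emptyDir pod (a plain Kubernetes illustration, not the project's kafka.yaml): anything written to the volume disappears with the pod:

     kubectl apply -f - <<'EOF'
     apiVersion: v1
     kind: Pod
     metadata:
       name: emptydir-demo
     spec:
       containers:
         - name: demo
           image: busybox
           command: ["sh", "-c", "date > /data/created && sleep 3600"]
           volumeMounts:
             - name: scratch
               mountPath: /data
       volumes:
         - name: scratch
           emptyDir: {}
     EOF

     # Deleting the pod also deletes everything under /data:
     kubectl delete pod emptydir-demo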

This deployment is available under the kafka-inmemory folder and provides the following artifacts:

  • resources : provides all YAML configuration files for setting up services and deployments

Deploying to OpenShift

  1. With the OpenShift client tools installed and having logged into your cluster, register the provided template by running

     oc create -f kafka-inmemory/resources/openshift-template.yaml
    

    in your terminal. This template provides the "zookeeper" and the "kafka" deployments and the "zookeeper-service" and "kafka-service" services.

  2. Create a new app:

     oc new-app barnabas-inmemory
    

Deploying to Kubernetes

  1. If you don't have a cluster running, start one (e.g. by executing minikube start)

  2. Create the deployments and services by running:

     kubectl apply -f kafka-inmemory/resources/zookeeper.yaml && \
     kubectl apply -f kafka-inmemory/resources/zookeeper-service.yaml && \
     kubectl apply -f kafka-inmemory/resources/zookeeper-headless-service.yaml && \
     kubectl apply -f kafka-inmemory/resources/kafka.yaml && \
     kubectl apply -f kafka-inmemory/resources/kafka-service.yaml && \
     kubectl apply -f kafka-inmemory/resources/kafka-headless-service.yaml
    
  3. You can then verify that the services started using

     kubectl describe all
    

Kafka Connect

This deployment adds a Kafka Connect cluster which can be used with either of the Kafka deployments described above. It is implemented as a deployment with a configurable number of workers. The default image currently contains only the connectors distributed with Apache Kafka Connect - FileStreamSinkConnector and FileStreamSourceConnector. The REST interface for managing the Kafka Connect cluster is exposed internally within the Kubernetes/OpenShift cluster as the service kafka-connect on port 8083.
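
For example, once the cluster is running, the standard Kafka Connect REST API can be driven from inside the cluster roughly like this (the connector name, file, and topic are placeholders):

     # List the installed connector plugins and the running connectors:
     curl http://kafka-connect:8083/connector-plugins
     curl http://kafka-connect:8083/connectors

     # Create a FileStreamSourceConnector that feeds a file into a topic:
     curl -X POST -H "Content-Type: application/json" \
          --data '{"name": "file-source", "config": {"connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector", "tasks.max": "1", "file": "/tmp/test.txt", "topic": "connect-test"}}' \
          http://kafka-connect:8083/connectors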

Deploying to OpenShift

  1. Deploy a Kafka broker to your OpenShift cluster using either the in-memory or the StatefulSets deployment described above.

  2. Register the provided template by running

     oc create -f kafka-connect/resources/openshift-template.yaml
    

    in your terminal.

  3. Create a new app:

     oc new-app barnabas-connect
    

Deploying to Kubernetes

  1. Deploy a Kafka broker to your Kubernetes cluster using either the in-memory or the StatefulSets deployment described above.

  2. Start the deployment by running

     kubectl apply -f kafka-connect/resources/kafka-connect.yaml
    

    in your terminal.

  3. Start the Connect service by running

     kubectl apply -f kafka-connect/resources/kafka-connect-service.yaml
    

    in your terminal.

Using Kafka Connect with additional plugins

Our Kafka Connect images contain only the FileStreamSinkConnector and FileStreamSourceConnector connectors by default. If you want to use other connectors, you can:

  • Mount a volume containing the plugins to path /opt/kafka/plugins/
  • Use the enmasseproject/kafka-connect image as a Docker base image, add your connectors to the /opt/kafka/plugins/ directory and use this new image instead of enmasseproject/kafka-connect
  • Use OpenShift build system and our S2I image

Mount a volume containing the plugins

  1. Prepare a PersistentVolume which contains a directory with your plugin(s) and their dependencies and ensure these files are world-readable (chmod -R a+r /path/to/your/directory).
  2. Mount the volume into your Pod at the path /opt/kafka/plugins/
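
A minimal sketch of such a mount, assuming a PersistentVolumeClaim named connect-plugins bound to your prepared volume (the claim name and pod shape are illustrative; the real deployment comes from kafka-connect.yaml):

     kubectl apply -f - <<'EOF'
     apiVersion: v1
     kind: Pod
     metadata:
       name: kafka-connect-with-plugins
     spec:
       containers:
         - name: kafka-connect
           image: enmasseproject/kafka-connect:latest
           volumeMounts:
             - name: plugins
               mountPath: /opt/kafka/plugins/
       volumes:
         - name: plugins
           persistentVolumeClaim:
             claimName: connect-plugins   # assumed claim name
     EOF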

Create a new image based on enmasseproject/kafka-connect

  1. Create a new Dockerfile which uses enmasseproject/kafka-connect as its base image:

     FROM enmasseproject/kafka-connect:latest
     USER root:root
     COPY ./my-plugin/ /opt/kafka/plugins/
     USER kafka:kafka

  2. Build the Docker image and upload it to your Docker repository (the commands are sketched below)
  3. Use your new Docker image in your Kafka Connect deployment
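
For steps 2 and 3, the usual Docker workflow applies; the repository name below is a placeholder for your own:

     docker build -t my-docker-repo/kafka-connect-with-plugins:latest .
     docker push my-docker-repo/kafka-connect-with-plugins:latest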

Using OpenShift Build and S2I image

OpenShift supports Builds, which can be used together with the Source-to-Image (S2I) framework to create new Docker images. An OpenShift Build takes a builder image with S2I support together with source code and/or binaries provided by the user and uses them to build a new Docker image. The newly created Docker image is stored in OpenShift's local Docker repository and can be used in deployments. The Barnabas project provides a Kafka Connect S2I builder image, enmasseproject/kafka-connect-s2i, which takes user-provided binaries (with plugins and connectors) and creates a new Kafka Connect image. This enhanced Kafka Connect image can be used with our Kafka Connect deployment.

To configure the OpenShift Build and create a new Kafka Connect image, follow these steps:

  1. Create the OpenShift build configuration using our OpenShift template:

     oc apply -f kafka-connect/s2i/resources/openshift-template.yaml
     oc new-app barnabas-connect-s2i

  2. Prepare a directory with the Kafka Connect plugins which you want to use. For example:

     $ tree ./my-plugins/
     ./my-plugins/
     └── kafka-connect-jdbc
         ├── kafka-connect-jdbc-3.3.0.jar
         ├── postgresql-9.4-1206-jdbc41.jar
         └── sqlite-jdbc-3.8.11.2.jar

  3. Start a new image build using the prepared directory:

     oc start-build kafka-connect --from-dir ./my-plugins/

  4. Find the address of your local Docker repository using oc get is kafka-connect, then use the new image in the Kafka Connect deployment:

     oc create -f kafka-connect/resources/openshift-template.yaml
     oc new-app -p IMAGE_REPO_NAME=$(oc get is kafka-connect -o=jsonpath={.status.dockerImageRepository} | sed 's/\(.*\)\/.*/\1/') barnabas-connect
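
You can follow the build while it runs and check its result with the standard OpenShift commands:

     oc logs -f bc/kafka-connect
     oc get builds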

License

Apache License 2.0