Log Analytics Pipeline: Pure + Confluent + Elastic
A Helm chart that installs a log analytics cluster on Kubernetes and starts a simple load generator based on flog. Read more about the motivation for the chart in this blog post.
The log analytics pipeline leverages (1) Confluent Tiered Storage and (2) Elastic Searchable Snapshots, both built on the FlashBlade S3 object store.
- Flog, a fake log generator producing Apache weblog-like output
- Confluent operator for Kubernetes, configured to use Confluent Tiered Storage on S3 to simplify operations at scale
- Filebeat to pull data from Kafka into Elasticsearch
- Elasticsearch, with the hot tier on FlashBlade NFS and the frozen tier backed by an S3 snapshot repository
- Prometheus and Grafana dashboards to monitor the FlashBlade and Elasticsearch
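The frozen tier in the storage layout above hinges on an Elasticsearch snapshot repository of type `s3` pointing at a FlashBlade bucket. A minimal sketch of registering one is below; the host, bucket, repository name, and credentials are placeholder assumptions, and the chart wires this up automatically:

```python
# Sketch: register an S3 snapshot repository in Elasticsearch backed by a
# FlashBlade bucket. Host, bucket, and repo names are placeholder assumptions.

def frozen_repo_body(bucket: str, endpoint: str) -> dict:
    """Request body for PUT _snapshot/<repo> using the repository-s3 type."""
    return {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "endpoint": endpoint,       # FlashBlade S3 data VIP
            "protocol": "http",
            "path_style_access": True,  # FlashBlade S3 expects path-style requests
        },
    }

def register_repo(es_url: str, repo: str, bucket: str, endpoint: str) -> None:
    """Register the repository; credentials and TLS handling are illustrative."""
    import requests  # third-party; kept local so the builder above has no deps

    resp = requests.put(
        f"{es_url}/_snapshot/{repo}",
        json=frozen_repo_body(bucket, endpoint),
        auth=("elastic", "changeme"),  # placeholder credentials
        verify=False,                  # self-signed certs; pin them in production
    )
    resp.raise_for_status()
```

Searchable snapshots can then mount indices from this repository into the frozen tier.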
As part of the setup process, this Helm chart creates the necessary S3 accounts, users, keys, and buckets on the target FlashBlade using a separate program called s3manage. This program is a lightweight Python wrapper that makes the FlashBlade REST API easier to use from a Kubernetes Job.
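As a sketch of what such a wrapper does, the snippet below exchanges the API token for a session and creates a bucket. The endpoint paths and field names follow FlashBlade REST 2.x conventions but should be treated as assumptions; consult the FlashBlade REST API reference for the exact routes:

```python
# Sketch of a minimal s3manage-style helper for the FlashBlade REST API.
# Paths and field names are assumptions; verify against the API reference.

def api_url(mgmt_ip: str, resource: str, name: str, version: str = "2.0") -> str:
    """Build a versioned FlashBlade REST URL for a named resource."""
    return f"https://{mgmt_ip}/api/{version}/{resource}?names={name}"

def login(mgmt_ip: str, api_token: str):
    """Exchange the login (API) token for a session auth header."""
    import requests  # third-party; kept local so the pure helper has no deps

    s = requests.Session()
    s.verify = False  # arrays often use self-signed certs; pin them in production
    r = s.post(f"https://{mgmt_ip}/api/login", headers={"api-token": api_token})
    r.raise_for_status()
    s.headers["x-auth-token"] = r.headers["x-auth-token"]  # session token
    return s

def create_bucket(session, mgmt_ip: str, account: str, bucket: str) -> None:
    """Create an object-store bucket under an existing account."""
    r = session.post(api_url(mgmt_ip, "buckets", bucket),
                     json={"account": {"name": account}})
    r.raise_for_status()
```

The same pattern extends to object-store accounts, users, and access keys, which is what the chart's setup Job iterates over.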
Currently, the chart creates NodePort services to expose the Confluent Control Center (port :30921) and Kibana (port :30561) interfaces.
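For reference, such a NodePort service for Kibana might look like the following; the name is hypothetical, and the selector label shown is the one ECK conventionally applies to its pods, so verify it against your deployment:

```yaml
# Illustrative NodePort service for Kibana; selector label per ECK convention.
apiVersion: v1
kind: Service
metadata:
  name: kibana-nodeport        # hypothetical name
spec:
  type: NodePort
  selector:
    common.k8s.elastic.co/type: kibana
  ports:
    - port: 5601               # Kibana's default port
      targetPort: 5601
      nodePort: 30561          # matches the port noted above
```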
Prerequisites
- Kubernetes cluster installed
- Portworx installed and configured correctly.
- Confluent for Kubernetes installed.
- Elastic Cloud on Kubernetes (ECK) installed.
- Elastic license enabled or trial license installed.
Inputs
The required chart inputs in values.yaml are the FlashBlade IPs (management and data) and login token.
To obtain the token, log in to the FlashBlade CLI and either create a new API token or expose an existing one:
pureadmin create --api-token
pureadmin list --api-token --expose
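A minimal values.yaml under these inputs might look like the following; the key names are hypothetical, so check the chart's values.yaml for the actual structure:

```yaml
# Hypothetical key names; consult the chart's values.yaml for the real ones.
flashblade:
  managementIP: 10.0.0.1     # REST/management VIP
  dataIP: 10.0.0.2           # S3/NFS data VIP
  apiToken: "T-xxxxxxxx"     # token from the pureadmin step above
```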
By default, all PersistentVolumes use the FlashBlade (or the cluster's default StorageClass), but this can optionally be changed to any other StorageClass.
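For example, a claim selects a class by setting storageClassName explicitly. The names below are illustrative; pure-file is the class the Pure CSI driver conventionally provides for FlashBlade NFS:

```yaml
# Illustrative PVC; swap storageClassName for any class available in the cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-hot-data            # hypothetical name
spec:
  storageClassName: pure-file  # FlashBlade NFS via the Pure CSI driver
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
```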