Log Analytics Pipeline: Pure + Confluent + Elastic
A Helm chart that installs a log analytics cluster on Kubernetes and starts a simple load generator based on flog. Read more about the motivation for the chart in this blog post.
The log analytics pipeline leverages (1) Confluent Tiered Storage and (2) Elastic Searchable Snapshots, both built on the FlashBlade S3 object store.
- Flog, a fake log generator producing Apache weblog-like output
- Confluent operator for Kubernetes, configured to use Confluent Tiered Storage on S3 to simplify operations at scale
- Filebeat to pull data from Kafka into Elasticsearch
- Elasticsearch, with the hot tier on FlashBlade NFS and the frozen tier backed by an S3 snapshot repository
- Prometheus and Grafana dashboards to monitor the FlashBlade and Elasticsearch
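The frozen tier in the storage layout above hinges on an Elasticsearch snapshot repository of type `s3` pointing at a FlashBlade bucket. A minimal sketch of registering one is below; the host, bucket, repository name, and credentials are placeholder assumptions, and the chart wires this up automatically:

```python
# Sketch: register an S3 snapshot repository in Elasticsearch backed by a
# FlashBlade bucket. Host, bucket, and repo names are placeholder assumptions.

def frozen_repo_body(bucket: str, endpoint: str) -> dict:
    """Request body for PUT _snapshot/<repo> using the repository-s3 type."""
    return {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "endpoint": endpoint,       # FlashBlade S3 data VIP
            "protocol": "http",
            "path_style_access": True,  # FlashBlade S3 expects path-style requests
        },
    }

def register_repo(es_url: str, repo: str, bucket: str, endpoint: str) -> None:
    """Register the repository; credentials and TLS handling are illustrative."""
    import requests  # third-party; kept local so the builder above has no deps

    resp = requests.put(
        f"{es_url}/_snapshot/{repo}",
        json=frozen_repo_body(bucket, endpoint),
        auth=("elastic", "changeme"),  # placeholder credentials
        verify=False,                  # self-signed certs; pin them in production
    )
    resp.raise_for_status()
```

Searchable snapshots can then mount indices from this repository into the frozen tier.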
As part of the setup process, this Helm chart creates the necessary S3 accounts, users, keys, and buckets on the target FlashBlade using a separate program called s3manage. This program is a lightweight Python wrapper that makes the FlashBlade REST API easier to use from a Kubernetes Job.
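As a sketch of what such a wrapper does, the snippet below exchanges the API token for a session and creates a bucket. The endpoint paths and field names follow FlashBlade REST 2.x conventions but should be treated as assumptions; consult the FlashBlade REST API reference for the exact routes:

```python
# Sketch of a minimal s3manage-style helper for the FlashBlade REST API.
# Paths and field names are assumptions; verify against the API reference.

def api_url(mgmt_ip: str, resource: str, name: str, version: str = "2.0") -> str:
    """Build a versioned FlashBlade REST URL for a named resource."""
    return f"https://{mgmt_ip}/api/{version}/{resource}?names={name}"

def login(mgmt_ip: str, api_token: str):
    """Exchange the login (API) token for a session auth header."""
    import requests  # third-party; kept local so the pure helper has no deps

    s = requests.Session()
    s.verify = False  # arrays often use self-signed certs; pin them in production
    r = s.post(f"https://{mgmt_ip}/api/login", headers={"api-token": api_token})
    r.raise_for_status()
    s.headers["x-auth-token"] = r.headers["x-auth-token"]  # session token
    return s

def create_bucket(session, mgmt_ip: str, account: str, bucket: str) -> None:
    """Create an object-store bucket under an existing account."""
    r = session.post(api_url(mgmt_ip, "buckets", bucket),
                     json={"account": {"name": account}})
    r.raise_for_status()
```

The same pattern extends to object-store accounts, users, and access keys, which is what the chart's setup Job iterates over.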
Currently, the chart creates NodePort services to expose the Confluent Control Center (port :30921) and Kibana (port :30561) interfaces.
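For reference, such a NodePort service for Kibana might look like the following; the name is hypothetical, and the selector label shown is the one ECK conventionally applies to its pods, so verify it against your deployment:

```yaml
# Illustrative NodePort service for Kibana; selector label per ECK convention.
apiVersion: v1
kind: Service
metadata:
  name: kibana-nodeport        # hypothetical name
spec:
  type: NodePort
  selector:
    common.k8s.elastic.co/type: kibana
  ports:
    - port: 5601               # Kibana's default port
      targetPort: 5601
      nodePort: 30561          # matches the port noted above
```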
Prerequisites
- Kubernetes cluster installed
- Portworx installed and configured correctly.
- Confluent for Kubernetes installed.
- Elastic Cloud on Kubernetes (ECK) installed.
- Elastic license enabled or trial license installed.
Inputs
The required chart inputs in values.yaml are the FlashBlade IPs (management and data) and login token.
To obtain the token, log in to the FlashBlade CLI and either create a new API token or expose an existing one:
pureadmin create --api-token
pureadmin list --api-token --expose
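A minimal values.yaml under these inputs might look like the following; the key names are hypothetical, so check the chart's values.yaml for the actual structure:

```yaml
# Hypothetical key names; consult the chart's values.yaml for the real ones.
flashblade:
  managementIP: 10.0.0.1     # REST/management VIP
  dataIP: 10.0.0.2           # S3/NFS data VIP
  apiToken: "T-xxxxxxxx"     # token from the pureadmin step above
```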
By default, all PersistentVolumes use the FlashBlade (or the cluster's default StorageClass), but this can optionally be changed to any other StorageClass.
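For example, a claim selects a class by setting storageClassName explicitly. The names below are illustrative; pure-file is the class the Pure CSI driver conventionally provides for FlashBlade NFS:

```yaml
# Illustrative PVC; swap storageClassName for any class available in the cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-hot-data            # hypothetical name
spec:
  storageClassName: pure-file  # FlashBlade NFS via the Pure CSI driver
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
```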