joshuarobinson / logpipeline_confluent_elastic

Easily deploy a log analytics pipeline with Confluent and Elastic

Log Analytics Pipeline: Pure + Confluent + Elastic

A Helm chart that installs a log analytics cluster on Kubernetes and starts a simple load generator based on flog. Read more about the motivation for this Helm chart in this blog post.
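Installation follows the usual Helm workflow; a minimal sketch, assuming the chart lives at the repository root and using an arbitrary release name:

git clone https://github.com/joshuarobinson/logpipeline_confluent_elastic.git
cd logpipeline_confluent_elastic
# Fill in values.yaml with your FlashBlade details (see Inputs below), then:
helm install logpipeline . -f values.yaml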

The log analytics pipeline leverages 1) Confluent Tiered Storage and 2) Elastic Searchable Snapshots, both built on the FlashBlade S3 object store.

  • flog, a fake log generator that produces Apache weblog-like output
  • Confluent Operator for Kubernetes, configured to use Confluent Tiered Storage on S3 to simplify operations at scale
  • Filebeat to pull data from Kafka into Elasticsearch
  • Elasticsearch with a hot tier on FlashBlade NFS and a frozen tier backed by an S3 snapshot repository
  • Prometheus and Grafana dashboards to monitor the FlashBlade and Elasticsearch

As part of the setup process, this Helm chart creates the necessary S3 accounts, users, keys, and buckets on the target FlashBlade using a separate program called s3manage. This program is a lightweight Python wrapper that makes the FlashBlade REST API easier to use from a Kubernetes job.

Currently, the chart creates NodePort services to expose the Confluent Control Center (port 30921) and Kibana (port 30561) interfaces.
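Once the pods are running, both UIs should be reachable on those ports at any cluster node's address; the node IP below is a placeholder:

http://<node-ip>:30921   # Confluent Control Center
http://<node-ip>:30561   # Kibana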

Further Reading:

Prerequisites

Inputs

The required chart inputs in values.yaml are the FlashBlade IPs (management and data) and a login token.

To obtain the TOKEN, log in via the FlashBlade CLI and either create or list the API token:

pureadmin [create|list] --api-token --expose

By default, all PersistentVolumes use the FlashBlade (or the cluster's default StorageClass), but this can optionally be changed to any other StorageClass.
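As a rough sketch, the inputs might look like the following; the key names here are hypothetical and should be checked against the chart's actual values.yaml:

# Hypothetical values.yaml sketch; key names are illustrative, not authoritative.
flashblade:
  managementIP: 10.62.64.20   # FlashBlade management VIP
  dataIP: 10.62.66.20         # FlashBlade data VIP
  apiToken: <api-token>       # token from 'pureadmin list --api-token --expose'
storageClass: pure-file       # optional: swap in any other StorageClass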
