
Ansible Role 📶 🌅 Kafka


Ansible role that installs and configures Kafka, a distributed and fault tolerant stream-processing platform.

Supported Platforms:
* Debian
* Red Hat (CentOS/Fedora)
* Ubuntu

Requirements

Requires the unzip and gtar utilities to be installed on the target host. See the Ansible unarchive module notes for details.
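
If these utilities may be missing, they can be installed before the role runs. A minimal sketch using a pre_task (package names are assumptions; adjust for your distribution's package manager):

- hosts: all
  pre_tasks:
    - name: Ensure archive extraction utilities are present
      package:
        name:
          - unzip
          - tar
        state: present
  roles:
    - role: 0x0I.kafka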

Role Variables

Variables are available and organized according to the following software & machine provisioning stages:

  • install
  • config
  • launch
  • uninstall

Install

Kafka can be installed using compressed archives (.tar, .zip) downloaded and extracted from various sources.

The following variables can be customized to control various aspects of this installation process, ranging from software version and source location of binaries to the installation directory where they are stored:

kafka_user: <service-user-name> (default: kafka)

  • dedicated service user and group used by Kafka for privilege separation (see here for details)

install_type: <string> (default: archive; currently only archive is supported)

  • archive: compatible with both tar and zip formats. Installation binaries can be obtained from local or remote compressed archives, either from the official download/releases site or ones generated from development/custom sources.

kafka_home: </path/to/dir> (default: /opt/kafka)

  • path on the target host where the Kafka binaries should be extracted and the configuration rendered.

archive_url: <path-or-url-to-archive> (default: see defaults/main.yml)

  • address of a compressed tar or zip archive containing Kafka binaries. This method technically supports installation of any available version of Kafka. Links to official versions can be found here.
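
Taken together, a minimal sketch of an install customization might look like the following (the version and paths are illustrative, not recommendations):

- hosts: all
  roles:
    - role: 0x0I.kafka
      vars:
        kafka_user: kafka
        install_type: archive
        kafka_home: /opt/kafka
        archive_url: https://archive.apache.org/dist/kafka/1.0.0/kafka_2.12-1.0.0.tgz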

Config

Configuration of Kafka is expressed within three files:

  • server.properties for configuring operational behavior of a Kafka broker
  • jvm.options for configuring Kafka JVM settings
  • log4j.properties for configuring Kafka logging

These files are located in a config directory, based on the location set for kafka_home.

For additional details and to get an idea of how each config should look, reference Kafka's official configuration documentation.

The following variables can be customized to manage the location and content of these configuration files:

managed_configs: <list of configs to manage> (default: see defaults/main.yml)

  • list of configuration files to manage with this Ansible role (see the example following this list)

    Allowed values are any combination of:

    • server_properties
    • jvm_options
    • log4j_properties
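
For example, to manage only the broker and JVM configuration files (a minimal sketch):

managed_configs:
  - server_properties
  - jvm_options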

server_properties: <hash-of-kafka-properties> (default: {})

  • Any configuration setting/value key-pair supported by Kafka broker configs should be expressible within each hash entry and properly rendered within the associated properties file. Note: Each <key> along with its <value> specification should be written as expected to be rendered within the associated properties config (e.g. zookeeper.connect: zk1.cluster.net:2121 or advertised.listeners: PLAINTEXT://kafka1.cluster.net:9092).
Example
server_properties:
  broker.id: 10
  advertised.host.name: example-broker

A list of configurable settings can be found here.

jvm_options: <hash-of-environment-variables> (default: {})

  • Kafka uses a set of environment variables to manage various aspects of its Java environment - see here for more details.

This role exposes management of these environment variables via a jvm.options configuration file located in the aforementioned configuration directory under kafka_home. The file consists of a line-delimited list of environment variable settings used to modify the behavior of Kafka's JVM.

While you should rarely need to change Java Virtual Machine (JVM) options, there are situations (e.g. insufficient heap size allocation) in which adjustments may be necessary. Each environment variable to be rendered in the file can be expressed as an entry in the jvm_options hash, with a list of flags/options as its value.

Note: The hash keys representing the environment variables responsible for JVM management are case-insensitive and can be written in whichever case the operator prefers.

Example
jvm_options:
  kafka_heap_opts:
    - "-Xmx1g -Xms1g"
  KAFKA_JVM_PERFORMANCE_OPTS:
    - -server
    - -XX:+UseConcMarkSweepGC
    - -Djava.awt.headless=true
  KAFKA_jmx_OPTS:
    - -Dcom.sun.management.jmxremote=true
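
Given the example above, the rendered jvm.options file might look like the following (a sketch, assuming the role normalizes keys to upper case and joins each variable's options into a single assignment):

KAFKA_HEAP_OPTS="-Xmx1g -Xms1g"
KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseConcMarkSweepGC -Djava.awt.headless=true"
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true"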

Additional reference can be found here.

log4j_properties: <list-of-dicts> (default: [])

  • Kafka makes use of the Apache log4j logging system for organizing and managing each of its main and sub-component logging facilities. As such, individual settings can be applied on a global or per-component basis by defining configuration settings associated with various aspects of the logging process.

By default, log4j loads a log4j.properties file from Kafka's main {{ kafka_home }}/config directory; the file consists of line-delimited key-value pairs, each representing a desired configuration setting.

Each line to be rendered in the file can be expressed as an entry within the log4j_properties list of dicts, with each dict consisting of an optional comment field and a list of associated key-value pairs encapsulated under a settings key:

log4j_properties:
  - comment: Set root logger list
    settings:
      - log4j.rootLogger: INFO,stdout,kafkaAppender
  - comment: Define stdout logger appender
    settings:
      - log4j.appender.stdout: org.apache.log4j.ConsoleAppender
      - log4j.appender.stdout.layout: org.apache.log4j.PatternLayout
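
Given this example, the rendered log4j.properties might look like the following (a sketch, assuming each comment field renders as a # line above its settings):

# Set root logger list
log4j.rootLogger=INFO,stdout,kafkaAppender
# Define stdout logger appender
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout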

See here for an example configuration file and list of supported settings.

Launch

Running a Kafka broker is accomplished using the systemd service management tool for archive installations. Launched as a background process or daemon, subject to the configuration and execution capabilities provided by the underlying management framework, Kafka can be set to adhere to the system administrative policies appropriate for your environment and organization.

The following variables can be customized to manage the service's systemd service unit definition and execution profile/policy:

custom_unit_properties: <hash-of-systemd-service-settings> (default: {})

  • hash of settings used to customize the [Service] unit configuration and execution environment of the Kafka systemd service.
custom_unit_properties:
  LimitNOFILE: infinity

Reference the systemd.service man page for a configuration overview and reference.
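
As a fuller sketch, raising the broker's open-file limit and enabling automatic restarts might look like this (Restart is a standard systemd [Service] directive; whether it suits your restart policy is environment-specific):

- hosts: broker
  roles:
    - role: 0x0I.kafka
      vars:
        custom_unit_properties:
          LimitNOFILE: infinity
          Restart: on-failure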

Uninstall

Support for uninstalling and removing provisioned artifacts allows users/operators to return a target host to its state prior to application of this role. This can be useful for recycling nodes and providing more graceful/managed transitions between tooling upgrades.

The following variable(s) can be customized to manage this uninstall process:

perform_uninstall: <true | false> (default: false)

  • whether to uninstall and remove all artifacts and remnants of this kafka installation on a target host (see: handlers/main.yml for details)
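
A minimal sketch of triggering an uninstall on the next run:

- hosts: broker
  roles:
    - role: 0x0I.kafka
      vars:
        perform_uninstall: true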

Dependencies

  • 0x0I.systemd
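
If installing from Ansible Galaxy, the dependency can be pulled in alongside this role via a requirements file (a minimal sketch):

# requirements.yml
- src: 0x0I.systemd
- src: 0x0I.kafka

ansible-galaxy install -r requirements.yml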

Example Playbook

default example:

- hosts: all
  roles:
    - role: 0x0I.kafka

install a specific version of the Kafka binaries with pre-defined defaults:

- hosts: legacy-kafka-broker
  roles:
    - role: 0x0I.kafka
      vars:
        managed_configs: []
        install_type: archive
        archive_url: https://archive.apache.org/dist/kafka/1.0.0/kafka_2.12-1.0.0.tgz

adjust broker identification details:

- hosts: my-broker
  roles:
    - role: 0x0I.kafka
      vars:
        managed_configs: ['server_properties']
        server_properties:
          broker.id: 12
          advertised.host.name: my-broker.cluster.domain

launch Kafka brokers connecting to existing remote Zookeeper cluster and customize connection parameters:

- hosts: broker
  roles:
    - role: 0x0I.kafka
      vars:
        managed_configs: ['server_properties']
        server_properties:
          zookeeper.connect: 111.22.33.4:2181
          zookeeper.connection.timeout.ms: 30000
          zookeeper.max.in.flight.requests: 30

update Kafka commit log directory and parameters:

- hosts: broker
  roles:
    - role: 0x0I.kafka
      vars:
        managed_configs: ['server_properties']
        log_dirs: /mnt/data/kafka # can be provided in place of server property below
        server_properties:
          log.dirs: /mnt/data/kafka
          log.flush.interval.ms: 3000
          log.retention.hours: 168
          zookeeper.connect: zk1.cluster.domain:2181

adjust JVM settings for JMX metric collection and broker auditing:

- hosts: broker
  roles:
    - role: 0x0I.kafka
      vars:
        managed_configs: ['jvm_options']
        jvm_options:
          kafka_jmx_opts:
            - -Dcom.sun.management.jmxremote=true
            - -Dcom.sun.management.jmxremote.port=9999
            - -Dcom.sun.management.jmxremote.authenticate=false
            - -Dcom.sun.management.jmxremote.ssl=false
            - -Djava.net.preferIPv4Stack=true

increase log4j logging levels for troubleshooting/debugging:

- hosts: broker
  roles:
    - role: 0x0I.kafka
      vars:
        managed_configs: ['log4j_properties']
        log4j_properties:
          - comment: Set root logger level and log appenders
            settings:
              - log4j.rootLogger: DEBUG,stdout,kafkaAppender
          - comment: Define stdout logger appender
            settings:
              - log4j.appender.stdout: org.apache.log4j.ConsoleAppender
              - log4j.appender.stdout.layout: org.apache.log4j.PatternLayout
          - comment: Define kafka logger appender
            settings:
              - log4j.appender.kafkaAppender: org.apache.log4j.DailyRollingFileAppender
              - log4j.appender.kafkaAppender.DatePattern: "'.'yyyy-MM-dd-HH"
              - log4j.appender.kafkaAppender.File: "${kafka.logs.dir}/server.log"
              - log4j.appender.kafkaAppender.layout: org.apache.log4j.PatternLayout

License

MIT

Author Information

This role was created in 2020 by O1.IO.
