dibbhatt / kafka-spark-consumer

High Performance Kafka Connector for Spark Streaming. Supports Multi Topic Fetch, Kafka Security. Reliable offset management in Zookeeper. No Data-loss. No dependency on HDFS and WAL. In-built PID rate controller. Supports Message Handler. Offset Lag checker.

Static zookeeper hosts configuration

muuki88 opened this issue

The current zookeeper.hosts configuration doesn't allow a path in the URL, which is needed to connect to a zookeeper running under a different path.

My zookeeper instance is running under myhost:2181/kafka

Map(
  "zookeeper.hosts" -> "myhost",
  "zookeeper.port" -> "2181"
)

will connect to the wrong zookeeper. The easiest way around this would be a zookeeper.connections setting.

Map(
  "zookeeper.connections" -> "myhost:2181/kafka"
)

which we could prefer over the host/port combination. WDYT?

Is /kafka the ZK path where broker details are stored in your Kafka setup?

There are two ZK settings in this consumer.

  1. For Connecting to Kafka

    props.put("zookeeper.hosts", "x.x.x.x");
    props.put("zookeeper.port", "2181");
    props.put("zookeeper.broker.path", "/brokers");

  2. For Storing Consumed Offset

    props.put("zookeeper.consumer.connection", "x.x.x.x:2181");
    props.put("zookeeper.consumer.path", "/consumer-path");

If you use the host/port configuration for both, and set "zookeeper.broker.path" to your Kafka broker path in ZK (default is /brokers) and "zookeeper.consumer.path" to the path where you want your consumer offsets to be stored, this should suffice.

Am I missing anything here?
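For the myhost:2181/kafka case from the original question, the mapping onto these existing settings might look like the sketch below. The class `ZkConfigSketch` and its method `fromConnectString` are hypothetical names for illustration and are not part of this library. The sketch assumes a single-host connection string, falls back to ZooKeeper's default client port 2181 when none is given, and assumes that Kafka registers its brokers under `<path>/brokers` when it is configured with a path suffix such as /kafka.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of kafka-spark-consumer.
final class ZkConfigSketch {

    // Splits a single-host connect string such as "myhost:2181/kafka" into
    // the host, port, and broker-path settings discussed in this thread.
    static Properties fromConnectString(String connect) {
        int slash = connect.indexOf('/');
        String hostPort = slash < 0 ? connect : connect.substring(0, slash);
        String chroot = slash < 0 ? "" : connect.substring(slash);
        if (chroot.endsWith("/")) {
            // Drop a trailing slash so "myhost:2181/" maps to "/brokers".
            chroot = chroot.substring(0, chroot.length() - 1);
        }

        int colon = hostPort.lastIndexOf(':');
        String host = colon < 0 ? hostPort : hostPort.substring(0, colon);
        // 2181 is ZooKeeper's default client port.
        String port = colon < 0 ? "2181" : hostPort.substring(colon + 1);

        Properties props = new Properties();
        props.put("zookeeper.hosts", host);
        props.put("zookeeper.port", port);
        // Assumption: brokers are registered under <chroot>/brokers.
        props.put("zookeeper.broker.path", chroot + "/brokers");
        return props;
    }

    public static void main(String[] args) {
        Properties props = fromConnectString("myhost:2181/kafka");
        // Offset storage settings, as in the second group above.
        props.put("zookeeper.consumer.connection", "myhost:2181");
        props.put("zookeeper.consumer.path", "/consumer-path");
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With this mapping, "myhost:2181/kafka" yields zookeeper.hosts = myhost, zookeeper.port = 2181, and zookeeper.broker.path = /kafka/brokers, so no separate connection-string setting is needed.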

Dibyendu

I'm sorry. I confused the setup in our company, assuming that we actually had two different zookeeper instances. The settings provided by your library are sufficient.