Hungsiro506 / spark-kafka-connector

Reliable spark-kafka connector. Kafka 1.x and Spark 2.x


Spark - Kafka Connector


  • Caches the producer in each executor and shares it across all tasks in that JVM
  • Shutdown hook closes the producer when the Spark executor shuts down
  • Generic type for the Kafka payload
  • Asynchronous sending of messages to Kafka from Spark Streaming (receiver and non-receiver)
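
The first two features can be illustrated with a small, generic cache. This is only a sketch of the idea, not the connector's actual code: `ResourceCache`, `getOrCreate`, and `closeAll` are hypothetical names, and the cached resource is any `AutoCloseable` (a `KafkaProducer` would qualify, since it is closeable).

```scala
import scala.collection.mutable

// Hypothetical helper for illustration; not part of the connector's API.
// Keeps one resource per distinct config inside a JVM, so every task that
// runs in the same executor reuses the same instance.
final class ResourceCache[C, R <: AutoCloseable](create: C => R) {
  private val cache = mutable.HashMap.empty[C, R]

  // Returns the cached resource for this config, creating it on first use.
  def getOrCreate(config: C): R = synchronized {
    cache.getOrElseUpdate(config, create(config))
  }

  // Closes every cached resource and empties the cache.
  def closeAll(): Unit = synchronized {
    cache.values.foreach(_.close())
    cache.clear()
  }

  // Close cached resources when the JVM (for example, a Spark executor) exits.
  sys.addShutdownHook(closeAll())
}
```

With a producer, the factory passed to the cache would be something like `cfg => new KafkaProducer[String, String](cfg.asJava)`, so that tasks in the same executor share one producer per configuration instead of opening a new one for every partition.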


If you want to save an RDD to Kafka

import com.hungsiro.spark_kafka.core.sink._
import com.hungsiro.spark_kafka.core.source._
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.StringSerializer
import org.apache.spark.rdd.RDD

val topic = "my-topic"
val producerConfig: Map[String,Object] = loadConfig().producerConfig

val rdd: RDD[String] = ...
  s => new ProducerRecord[String, String](topic, s)
  //  s => new ProducerRecord[Array[Byte],Array[Byte]](config.outputTopic,s.key.toString.getBytes(),s.value.toString.getBytes())

If you want to save a DStream to Kafka

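import com.hungsiro.spark_kafka.core.sink._
import com.hungsiro.spark_kafka.core.source._
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.spark.streaming.dstream.DStream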
val topic = "my-topic"
val producerConfig: Map[String,Object] = loadConfig().producerConfig

val dStream: DStream[String] = ...
  s => new ProducerRecord[String, String](topic, s)
  //  s => new ProducerRecord[Array[Byte],Array[Byte]](config.outputTopic,s.key.toString.getBytes(),s.value.toString.getBytes())
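
In both snippets above, the function `s => new ProducerRecord[...](topic, s)` turns each element into a Kafka record. As a rough sketch of how such a function might be applied to the elements of one partition with asynchronous sends, the helper below uses only the standard Kafka producer API. `AsyncSend` and `sendAll` are hypothetical names, and this is not necessarily how the connector does it internally.

```scala
import java.util.concurrent.atomic.AtomicReference
import org.apache.kafka.clients.producer.{Callback, Producer, ProducerRecord, RecordMetadata}

// Hypothetical helper for illustration; not part of the connector's API.
object AsyncSend {
  // Sends every item without waiting for each acknowledgement, then flushes
  // and rethrows the first failure reported by the producer, if any.
  def sendAll[T, K, V](producer: Producer[K, V], items: Iterator[T])(
      transform: T => ProducerRecord[K, V]): Unit = {
    val firstError = new AtomicReference[Exception](null)
    val callback = new Callback {
      override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
        if (exception != null) firstError.compareAndSet(null, exception)
    }
    items.foreach(item => producer.send(transform(item), callback))
    // flush() blocks until all previously sent records have completed.
    producer.flush()
    val error = firstError.get()
    if (error != null) throw error
  }
}
```

Inside a Spark job, a helper like this would typically be called from `rdd.foreachPartition`, with the producer obtained once per executor rather than once per record.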

Example:

Start ZooKeeper server:

./bin/ config/

Start Kafka server:

./bin/ config/

Create input topic:

./bin/ --create --zookeeper localhost:2182 --replication-factor 1 --partitions 1 --topic input

Create output topic:

./bin/ --create --zookeeper localhost:2182 --replication-factor 1 --partitions 1 --topic output

Start Kafka producer:

./bin/ --broker-list localhost:9092 --topic input

Start Kafka consumer:

./bin/ --zookeeper localhost:2182 --topic output

Run the example application, publish a few words to the input topic using the Kafka console producer, and check the processing result on the output topic using the Kafka console consumer.

./bin/ --zookeeper localhost:2182 --list

./bin/ --create --zookeeper localhost:2182 --replication-factor 1 --partitions 1 --topic radiusConLog

