isopropylcyanide / SpringBatch-KafkaDB

A small demo that combines Spring Batch's job processing with Apache Kafka's stream processing. A batch job reads a simple CSV file and writes its records to a Kafka topic (or, on the batch-db-upload branch, to an H2 database) for further processing.

SpringBatch-KafkaDB Demo


What is it?

A small demo that combines Spring Batch's job processing with Apache Kafka's stream processing. A batch job reads a simple CSV file and hands its records to a Kafka producer for further processing. A Kafka consumer can then verify the result by consuming the messages from the corresponding topic.

image


Spring Batch pipeline

The pipeline below is followed throughout the codebase. The implementation is straightforward once the responsibilities of each relevant class are kept modular.

image
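To make the read, process, write flow concrete, here is a minimal plain-Java sketch of a chunk-oriented step. It deliberately does not use Spring Batch: the class name `ChunkPipelineSketch`, the `run` method, and the chunk size are all hypothetical and only illustrate how the three responsibilities can stay separate.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java illustration of a read -> process -> write step.
// This is NOT the Spring Batch API; every name here is hypothetical.
class ChunkPipelineSketch {

    // Reads items one at a time, transforms each one, and hands the results
    // to the writer in chunks of at most chunkSize items.
    // Returns the total number of items passed to the writer.
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        int written = 0;
        List<O> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            O out = processor.apply(reader.next());
            if (out != null) {              // a null result filters the item out
                chunk.add(out);
            }
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk));
                written += chunk.size();
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {             // flush the final partial chunk
            writer.accept(new ArrayList<>(chunk));
            written += chunk.size();
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a,1", "b,2", "c,3");
        List<List<String>> chunks = new ArrayList<>();
        Function<String, String> upper = s -> s.toUpperCase();
        Consumer<List<String>> collect = c -> chunks.add(c);
        int n = run(lines.iterator(), upper, collect, 2);
        System.out.println(n + " items written in " + chunks.size() + " chunks");
    }
}
```

In the real project the reader, processor, and writer roles would be filled by the corresponding Spring Batch components; the sketch only shows why swapping the writer (Kafka or database) leaves the rest of the step untouched.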


Prerequisites

  • Spring Boot + Batch + JPA
  • Apache Kafka
  • Apache Zookeeper

Expectation

Batch systems offer several advantages over interactive systems:

  • Repeated jobs run quickly and without user interaction.
  • No special hardware or system support is needed to input data.
  • They suit large organizations best, but small organizations can benefit as well.

The expectation is to convert the following flat file into something meaningful when run as a batch process.

image

Such as a Kafka stream like this:

image

Or a datastore like this:

image
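As a rough illustration of the conversion described above, the sketch below maps one flat-file line to a typed row and then to a text payload that could be sent to a topic. The column layout (`id,name,amount`), the `Row` class, and the JSON-like payload are assumptions made for this example; they are not taken from the CSV file or the classes in this repository.

```java
import java.util.Locale;

// Hypothetical example: the column layout (id,name,amount) is an assumption,
// not the layout of the CSV file used in this repository.
class CsvLineMappingSketch {

    // Simple immutable holder for one parsed line.
    static final class Row {
        final long id;
        final String name;
        final double amount;

        Row(long id, String name, double amount) {
            this.id = id;
            this.name = name;
            this.amount = amount;
        }
    }

    // Parses one comma-separated line. Quoted fields and escaped commas
    // are not handled; fields are trimmed before conversion.
    static Row parse(String line) {
        String[] parts = line.split(",", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        return new Row(Long.parseLong(parts[0].trim()),
                       parts[1].trim(),
                       Double.parseDouble(parts[2].trim()));
    }

    // Renders the row as a text payload. The name is not JSON-escaped,
    // which is acceptable only for this illustration.
    static String toMessage(Row row) {
        return String.format(Locale.ROOT,
                "{\"id\":%d,\"name\":\"%s\",\"amount\":%.2f}",
                row.id, row.name, row.amount);
    }

    public static void main(String[] args) {
        System.out.println(toMessage(parse("7, Alice ,12.5")));
    }
}
```

The same `Row` could just as well be handed to a database writer instead of being rendered as a message, which is the difference between the two target outputs shown here.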


Setting up Apache Kafka

  # Start Zookeeper instance 
  $ zookeeper-server-start.bat ..\..\config\zookeeper.properties
  
  # Start Kafka server
  $ kafka-server-start.bat ..\..\config\server.properties
  
  # Create a topic
  $ kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic CSV_TOPIC_K
  

Make sure the following is appended to config\server.properties:

port = 9092
advertised.host.name = localhost 
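With the broker on localhost:9092 and the topic created above, a client needs matching connection settings. The sketch below builds them with `java.util.Properties` only. The keys are standard Kafka producer configuration names, but the choice of string serializers and the class name `ProducerSettingsSketch` are assumptions; the project may configure its producer differently (for example through Spring Boot properties).

```java
import java.util.Properties;

// Hypothetical helper: shows client settings that would match the broker
// setup above. How this repository actually configures its producer is
// not shown in this README, so treat this as an assumption.
class ProducerSettingsSketch {

    // Topic name taken from the kafka-topics command above.
    static final String TOPIC = "CSV_TOPIC_K";

    // Builds producer settings for a broker listening on localhost:9092.
    static Properties producerProperties() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Assumption: keys and values are sent as plain strings.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProperties();
        System.out.println("topic=" + TOPIC
                + " bootstrap.servers=" + props.getProperty("bootstrap.servers"));
    }
}
```

If the broker port in server.properties is changed, the `bootstrap.servers` value has to change with it; otherwise the client will not be able to connect.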

What branches are there?

Branch           Description
master           Base branch that reads records from the CSV file and publishes them to a topic through a Kafka producer
batch-db-upload  Similar to master, except that it writes the CSV records to an H2 database instead of Kafka


Languages

Language:Java 100.0%