ktobend

Consumes Kafka data and stores it in Databend.

Config

Make sure you have a config.properties file in the resources directory (src/main/resources) with the following properties:

# config.properties
kafka.bootstrap.servers=localhost:9092
kafka.consumer.group.id=1
kafka.json.topic=test_kafka
kafka.file.topic=orders
output.directory=/tmp

databend.dsn=jdbc:databend://tn3ftqihs--medium-p8at.gw.aws-us-east-2.default.databend.com:443?ssl=true
databend.user=cloudapp
databend.password=databend
databend.tmpTable=tbcc
databend.batch.size=1
databend.targetTable=tb_t
databend.interval=5
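
For reference, a minimal sketch of how these properties might be loaded from the classpath; the ConfigExample class below is illustrative only and not part of the project's code:

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Properties;

    // Illustrative only: loads config.properties from the classpath (src/main/resources).
    public class ConfigExample {
        public static void main(String[] args) throws IOException {
            Properties props = new Properties();
            try (InputStream in = ConfigExample.class
                    .getClassLoader()
                    .getResourceAsStream("config.properties")) {
                if (in == null) {
                    throw new IOException("config.properties not found on the classpath");
                }
                props.load(in);
            }
            System.out.println("bootstrap servers: " + props.getProperty("kafka.bootstrap.servers"));
            System.out.println("target table:      " + props.getProperty("databend.targetTable"));
        }
    }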

Usage

How to build

Requires JDK 1.8 or higher and Maven 3.6.3 or higher. Before building, make sure a Databend server is running and your settings are written to src/main/resources/config.properties. Then build the project with:

mvn clean package

How to run

java -jar target/ktobend-1.0-SNAPSHOT-jar-with-dependencies.jar

Then you should see output like:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
2 ConsumerJsonWorker and ConsumerStageFileWorker started!
.......
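
Internally the workers are ordinary Kafka consumers. A simplified sketch of what a JSON-consuming worker loop might look like, assuming the standard kafka-clients library (the class name and the write-to-Databend step are illustrative, not the project's actual code):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    // Illustrative only: polls the JSON topic using the settings from config.properties.
    public class JsonWorkerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // kafka.bootstrap.servers
            props.put("group.id", "1");                         // kafka.consumer.group.id
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("test_kafka")); // kafka.json.topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        // ktobend would write the JSON value to a file, upload it to a
                        // Databend stage, and copy it into the table; here we just print it.
                        System.out.println("received: " + record.value());
                    }
                }
            }
        }
    }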

How to test

  1. Create two tables in Databend: a temporary table and the target table.

NOTE: make sure the id, batch, and t fields are present in both tables

    CREATE TABLE tb_t (
        id    Int64,
        batch String,
        u64   UInt64,
        f64   Float64,
        s     String,
        s2    String,
        a16   Array(Int16),
        a8    Array(UInt8),
        d     Date,
        t     DateTime);

    CREATE TABLE tbcc (
        id    Int64,
        batch String,
        u64   UInt64,
        f64   Float64,
        s     String,
        s2    String,
        a16   Array(Int16),
        a8    Array(UInt8),
        d     Date,
        t     DateTime);
  2. Run ZooKeeper:
zookeeper-server-start /opt/homebrew/etc/kafka/zookeeper.properties
  3. Run Kafka:
kafka-server-start /opt/homebrew/etc/kafka/server.properties
  4. Create the JSON topic:
kafka-topics --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test_kafka
  5. Create the file-info topic:
kafka-topics --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic orders
  6. List topics:
kafka-topics --list --bootstrap-server localhost:9092
  7. Produce JSON data to the JSON topic:

kafka-console-producer --bootstrap-server localhost:9092 --topic test_kafka

A single JSON record looks like this:

{"tableName":"tbcc","batch":"2024-03-14-1", "value":{"id":10, "batch":"2024-03-14-1","u64": 30,"f64": 21,"s": "hao","s2": "hello","a16":[1],"a8":[2],"d": "2011-03-06","t": "2016-04-04 12:30:00"}}

The value can also be an array:

{"tableName":"tbcc","batch":"2024-03-14-1", "value":[{"id":10, "batch":"2024-03-14-1","u64": 30,"f64": 22,"s": "hao","s2": "hello","a16":[1],"a8":[2],"d": "2011-03-06","t": "2016-04-04 14:30:00"},{"id":10, "batch":"2024-03-14-1","u64": 30,"f64": 21,"s": "hao","s2": "hello","a16":[1],"a8":[2],"d": "2011-03-06","t": "2016-04-04 12:30:00"}]}
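
If you would rather send the record from code instead of the console producer, here is a rough Java sketch using the standard kafka-clients producer; the ProduceJsonExample class is illustrative and simply sends the single-record example above:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Illustrative only: sends the single-record JSON example to the test_kafka topic.
    public class ProduceJsonExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            String json = "{\"tableName\":\"tbcc\",\"batch\":\"2024-03-14-1\","
                    + "\"value\":{\"id\":10,\"batch\":\"2024-03-14-1\",\"u64\":30,\"f64\":21,"
                    + "\"s\":\"hao\",\"s2\":\"hello\",\"a16\":[1],\"a8\":[2],"
                    + "\"d\":\"2011-03-06\",\"t\":\"2016-04-04 12:30:00\"}}";

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("test_kafka", json));
                producer.flush();
            }
        }
    }
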
  8. Writing a JSON record generates a file that is uploaded to a stage.


  9. The JSON data will then be copied into the target table tbcc (a quick way to verify this is shown below).

(result screenshot)
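
To confirm the rows actually landed, you can query Databend over JDBC using the DSN, user, and password from config.properties. A rough sketch, assuming the databend-jdbc driver is on the classpath; the DSN below is a placeholder to replace with your own value:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Illustrative only: counts the rows in the table named by the JSON records.
    // Replace the DSN, user, and password with the values from your config.properties.
    public class CheckRowsExample {
        public static void main(String[] args) throws Exception {
            String dsn = "jdbc:databend://localhost:8000";   // placeholder DSN
            try (Connection conn = DriverManager.getConnection(dsn, "cloudapp", "databend");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM tbcc")) {
                if (rs.next()) {
                    System.out.println("rows in tbcc: " + rs.getLong(1));
                }
            }
        }
    }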


License: Apache License 2.0

