guanlisheng / presto-event-stream

Stream events from presto to a kafka topic

presto-event-stream

A Presto plugin that streams Presto query events into a Kafka topic.

inspired by

Install

Run mvn clean package to build the plugin, then copy the resulting plugin file PrestoEventStream-1.0.zip into the plugin folder of the Presto coordinator.

Configuration

Create a new properties file named event-listener.properties inside Presto's etc/ directory:

event-listener.name=event-stream
bootstrap.servers=broker:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
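As a rough illustration of how these settings might be consumed, the sketch below parses the file contents with java.util.Properties and separates event-listener.name from the remaining keys, which look like standard Kafka producer settings. The class and method names (EventListenerConfigSketch, parse, kafkaProperties) are hypothetical and are not taken from the plugin's source. The assumption that every key other than event-listener.name is passed to the producer is also ours.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper, not part of the plugin: shows how the file above could be
// split into the listener name and the settings intended for a Kafka producer.
final class EventListenerConfigSketch {
    static final String LISTENER_NAME_KEY = "event-listener.name";

    // Parses the text of event-listener.properties into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns a copy of the settings without event-listener.name.
    // Assumption: all remaining keys are meant for the Kafka producer.
    static Properties kafkaProperties(Properties all) {
        Properties kafka = new Properties();
        for (String key : all.stringPropertyNames()) {
            if (!key.equals(LISTENER_NAME_KEY)) {
                kafka.setProperty(key, all.getProperty(key));
            }
        }
        return kafka;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "event-listener.name=event-stream",
                "bootstrap.servers=broker:9092",
                "key.serializer=org.apache.kafka.common.serialization.StringSerializer",
                "value.serializer=org.apache.kafka.common.serialization.StringSerializer");
        Properties all = parse(text);
        System.out.println("listener: " + all.getProperty(LISTENER_NAME_KEY));
        // Sorted only to make the printed order stable.
        System.out.println("producer keys: " + new TreeSet<>(kafkaProperties(all).stringPropertyNames()));
    }
}
```

Keeping the listener name separate from the producer settings means additional Kafka options (for example acks or compression.type) could be added to the file without any change to the parsing code.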

We recommend adding the following line to your etc/catalog/hive.properties:

hive.verbose-runtime-stats-enabled=true

An Avro formatter serializes the messages generated from QueryCreatedEvent and QueryCompletedEvent. The Avro-formatted messages are written as strings using the StringSerializer and emitted to the Kafka topic presto.event.

Post-event analysis with Presto

Hudi's DeltaStreamer can sink the Kafka topic with options such as:

--schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider
--source-class org.apache.hudi.utilities.sources.JsonKafkaSource
--hoodie-conf bootstrap.servers=broker:9092
--hoodie-conf hoodie.deltastreamer.schemaprovider.source.schema.file=QueryCompletedEvent.avsc

Overall Arch

Art of Schema
