ksmin23 / streaming-data-pipeline-from-kafka-to-s3-using-aws-kinesis-firehose

Streaming data pipeline to continuously load data from an Amazon MSK or MSK Serverless cluster to Amazon S3 using Amazon Kinesis Data Firehose.

Managed Data Delivery from Apache Kafka to Amazon S3 using Kinesis Data Firehose

This repository contains a set of example projects that continuously load data from an Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster into Amazon Simple Storage Service (Amazon S3). We use Amazon Kinesis Data Firehose, an extract, transform, and load (ETL) service, to read data from a Kafka topic, transform the records, and write them to an Amazon S3 destination.
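As a rough illustration of the flow described above, the sketch below builds the request for a Firehose delivery stream that reads from an MSK topic and writes to S3. It uses boto3 rather than the deployment code in this repository, so treat it as an assumption-laden example: the function names, the stream name, the ARNs, the S3 prefix, and the buffering and compression values are all hypothetical placeholders, and a single IAM role is reused for both source and destination only to keep the example short.

```python
def build_msk_to_s3_request(stream_name, msk_cluster_arn, topic_name,
                            bucket_arn, role_arn, private_connectivity=True):
    """Build keyword arguments for Firehose CreateDeliveryStream with an MSK source.

    All argument values are supplied by the caller; nothing here is taken
    from the example projects in this repository.
    """
    return {
        "DeliveryStreamName": stream_name,
        # Tells Firehose to pull records from a Kafka topic on MSK.
        "DeliveryStreamType": "MSKAsSource",
        "MSKSourceConfiguration": {
            "MSKClusterARN": msk_cluster_arn,
            "TopicName": topic_name,
            "AuthenticationConfiguration": {
                "RoleARN": role_arn,
                "Connectivity": "PRIVATE" if private_connectivity else "PUBLIC",
            },
        },
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            # Hypothetical prefix and buffering values, chosen for illustration.
            "Prefix": "kafka-data/",
            "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 300},
            "CompressionFormat": "GZIP",
        },
    }


def create_stream(request):
    """Send the request to Firehose (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    firehose = boto3.client("firehose")
    return firehose.create_delivery_stream(**request)


# Placeholder identifiers: replace with real ARNs before calling create_stream().
example_request = build_msk_to_s3_request(
    stream_name="example-msk-to-s3",
    msk_cluster_arn="arn:aws:kafka:us-east-1:111122223333:cluster/example/abcd",
    topic_name="example-topic",
    bucket_arn="arn:aws:s3:::example-bucket",
    role_arn="arn:aws:iam::111122223333:role/example-firehose-role",
)
```

Keeping the request builder separate from the API call makes the configuration easy to inspect or unit test without touching AWS; `create_stream(example_request)` would perform the actual call once real ARNs and permissions are in place.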

Example Architecture
msk-firehose-s3-stack: msk-firehose-s3-arch (architecture diagram)
msk-serverless-firehose-s3-stack: msk-serverless-firehose-s3-arch (architecture diagram)

Enjoy!

Languages

Python 98.3%, Batchfile 1.7%