dashbitco / broadway_kafka

A Broadway connector for Kafka

exactly once delivery (EOD) in stream processing

matreyes opened this issue · comments

Hi!

For stream processing, Kafka has a transaction API that exposes a "transaction coordinator", which atomically commits the consumed offsets on the source topic together with the messages sent to the target topic, allowing exactly-once semantics. An explanation here.
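To make the atomicity concrete, here is a toy in-memory sketch of the consume-transform-produce cycle. `ToyBroker`, `commit_transaction`, and `process_batch` are hypothetical names invented for illustration; this is not a Kafka client and it models only the "outputs and offset become visible together, or not at all" property.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBroker:
    """Hypothetical in-memory stand-in for a broker; not a real Kafka client."""
    target_log: list = field(default_factory=list)  # messages visible on the target topic
    committed_offset: int = 0                       # next source offset to consume

    def commit_transaction(self, outputs, next_offset):
        # Both effects are applied together, which is the property the
        # transaction coordinator provides in real Kafka.
        self.target_log.extend(outputs)
        self.committed_offset = next_offset


def process_batch(broker, source, batch_size, transform, fail=False):
    """Consume from the committed offset, transform, then commit atomically."""
    start = broker.committed_offset
    batch = source[start:start + batch_size]
    outputs = [transform(m) for m in batch]  # buffered, not yet visible
    if fail:
        # Simulated crash before commit: the transaction is aborted, so
        # neither the outputs nor the new offset become visible.
        return False
    broker.commit_transaction(outputs, start + len(batch))
    return True


source = ["a", "b", "c", "d"]
broker = ToyBroker()
process_batch(broker, source, 2, str.upper)             # commits A, B
process_batch(broker, source, 2, str.upper, fail=True)  # aborted, nothing changes
process_batch(broker, source, 2, str.upper)             # retry commits C, D
print(broker.target_log, broker.committed_offset)
```

Because the aborted batch leaves the committed offset untouched, the retry re-reads the same input and the target topic ends up with each output exactly once.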

Neither Brod nor KafkaEx exposes the transactional API, but their underlying protocol libraries (kafka_protocol and kayrock) do.

I don't know whether you would like Broadway to implement this end-to-end integration. If we are doing something transactional, maybe we should use just one process per partition, each with its own transactional_id for committing the transaction (we wouldn't need back-pressure), and that's it! But it would be nice to have a single Broadway API to handle all types of streaming use cases.

What do you think? How would you handle exactly-once delivery?

Hi @matreyes! For us to support it, it would need to be part of Broadway itself or else a separate Broadway package. I think usage within Broadway is indeed relatively straightforward, given that we already support partitioning and acking at any time. My only recommendation would be to make the Broadway.Message itself carry all of the transaction items, as a way to simplify the acking.
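As a rough, language-agnostic illustration of that recommendation, the sketch below models a message that carries its own transaction items, so the ack step needs no outside state. The `Message` fields and the `ack` function are hypothetical and are not Broadway's actual struct or callback.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message shape; field names are illustrative, not Broadway's."""
    data: str
    topic: str
    partition: int
    offset: int
    transactional_id: str


def ack(messages):
    """Group acked messages by transaction and compute the offsets to commit.

    Returns {transactional_id: {(topic, partition): next_offset}}.
    """
    commits = defaultdict(dict)
    for m in messages:
        key = (m.topic, m.partition)
        per_txn = commits[m.transactional_id]
        # Kafka commits the offset of the next message to read, hence + 1.
        per_txn[key] = max(per_txn.get(key, 0), m.offset + 1)
    return dict(commits)


batch = [
    Message("x", "orders", 0, 41, "txn-orders-0"),
    Message("y", "orders", 0, 42, "txn-orders-0"),
    Message("z", "orders", 1, 7, "txn-orders-1"),
]
print(ack(batch))
```

With one transactional_id per partition, as suggested earlier in the thread, each group maps to exactly one partition and one offset to commit.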

Closing this as we don't plan to tackle it for now. Thanks!

I will try this in the near future using kafka_protocol or kayrock directly (neither brod nor kafka_ex supports it). If I'm lucky, I will reach out to you with a proposal.
Best! And thanks for everything ❤️!