Document consumption modes
bartelink opened this issue · comments
Ruben Bartelink commented
When consuming via Kafka, the high-level consumption modes include:
Representation | Description | Implemented In | Notes |
---|---|---|---|
Serial Messages | Messages are individual events tagged with a key. No parallelism is or can be employed. | Some current usage follows this pattern, but it should be deprecated | |
Batched | Consume and process the batches for a given partition together. Checkpoint when a batch has completed | Jet.ConfluentKafka.FSharp | |
Stream Event Spans | Messages are spans of events from a specified Stream Name (also the Key), with a specified index | Propulsion.Kafka | Implemented in StreamedConsumer and proProjector + proConsumer template. Allows production in parallel without max.in.flight=1. Consumer deduplicates |
Stream Summary Events | Messages contain a rendition of a summary as at a particular version. Version is monotonic, so the consumer can, and should, use only the most recent one | TODO: build a dotnet-templates example and provide default wiring in StreamedConsumer. Consumer can deduplicate by taking highest version (probably tracking last-seen per stream to deduplicate). Ref http://verraes.net/2019/05/patterns-for-decoupling-distsys-summary-event/ | |
Ordered Summaries | Messages represent state at a point in time. The topic's ordering within a key defines the 'version'. Consumer needs to group by key, dropping all but the newest. There's no way to de-duplicate double sends. | Stream Summary Events are preferred, but if the producer can guarantee the ordering when producing, it's a perfectly reasonable representation. | |
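
To make the consumer-side handling in the last three rows concrete, here is a minimal Python sketch. It is not Propulsion's implementation: the `Span` and `Summary` shapes, the class and function names, and the assumption that stream indices start at 0 are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Span:
    """Hypothetical Stream Event Span: events[0] sits at position `index`."""
    stream: str
    index: int
    events: Tuple[str, ...]


@dataclass(frozen=True)
class Summary:
    """Hypothetical Stream Summary Event: state of `stream` as at `version`."""
    stream: str
    version: int
    state: str


class SpanDeduplicator:
    """Releases each event once, in order, buffering spans that arrive early.

    Assumption: every stream starts at index 0.
    """

    def __init__(self) -> None:
        self._next: Dict[str, int] = {}            # stream -> next expected index
        self._pending: Dict[str, List[Span]] = {}  # stream -> spans not yet released

    def accept(self, span: Span) -> List[str]:
        pending = self._pending.setdefault(span.stream, [])
        pending.append(span)
        pending.sort(key=lambda s: s.index)
        expected = self._next.get(span.stream, 0)
        ready: List[str] = []
        # Release spans while they overlap or touch the expected index.
        while pending and pending[0].index <= expected:
            head = pending.pop(0)
            fresh = head.events[expected - head.index:]  # skip events already seen
            ready.extend(fresh)
            expected += len(fresh)
        self._next[span.stream] = expected
        return ready


class LatestSummaryFilter:
    """Accepts a summary only if its version exceeds the last one seen for the stream."""

    def __init__(self) -> None:
        self._seen: Dict[str, int] = {}

    def accept(self, summary: Summary) -> bool:
        if summary.version <= self._seen.get(summary.stream, -1):
            return False  # stale or duplicate
        self._seen[summary.stream] = summary.version
        return True


def newest_per_key(batch: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Ordered Summaries: within a batch in topic order, keep the last value per key."""
    latest: Dict[str, str] = {}
    for key, value in batch:
        latest[key] = value  # later messages overwrite earlier ones
    return latest
```

Note the difference between the last two: `LatestSummaryFilter` can reject a double send because the version travels with the message, whereas `newest_per_key` has only topic order to go on, so a repeated send is indistinguishable from a genuine newer state.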
Surely there's a glossary for this sort of thing somewhere? Please comment!
Ruben Bartelink commented
Broader information in #200