criteo / kafka-sharp

A C# Kafka driver

Behaviour of ConsumerGroup like Kafka Consumer with "auto.offset.reset" option?

ArtemAstashkin opened this issue

I have run into the following issue:

I have a topic with retention.ms = 1680000 (28 minutes). One service produces messages to this topic, but the interval between messages can be longer than 28 minutes. If another service tries to consume the topic after the message at its current offset has been deleted, it gets stuck.

Even if we subscribe to the group with DefaultOffsetToReadFrom = Offset.Earliest or Offset.Latest, the result is the same: the current offset is already set and greater than 0, because ConsumerGroup.Join (https://github.com/criteo/kafka-sharp/blob/master/kafka-sharp/kafka-sharp/Routing/ConsumerGroup.cs#L147) only checks that the offset is greater than 0.

Example:
Service "A" starts and produces 5 messages: min offset is 0, max offset is 5.
Service "B" reads the topic: min offset == max == 5. Service "B" exits.
Service "A" produces 2 messages: current offset == min offset == 5, max == 7.
After 30 minutes the previous 2 messages are deleted: current offset == 5, max == 7.
Service "A" produces 3 messages: current == 5, max == 10, min == 7 because the previous 2 were deleted.
Service "B" starts consuming but gets stuck, because the message with offset 5 has already been deleted.
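To make the final state of the example concrete, here is a minimal, language-neutral sketch in Python. It is not kafka-sharp code: the function names are hypothetical, and the "greater than 0" check is modelled only from the description above. It shows that the stored offset 5 passes such a check even though it falls outside the range the broker still holds (min 7, max 10).

```python
def passes_positive_check(committed: int) -> bool:
    """Hypothetical stand-in for a check that only tests 'offset > 0'."""
    return committed > 0


def is_in_range(committed: int, earliest: int, latest: int) -> bool:
    """True if the committed offset still points inside the retained log.

    'latest' is treated as the next offset to be written, so a consumer
    that is fully caught up (committed == latest) is still in range.
    """
    return earliest <= committed <= latest


# Final state from the example: current == 5, min == 7, max == 10.
committed, earliest, latest = 5, 7, 10

print(passes_positive_check(committed))           # True: the offset looks valid
print(is_in_range(committed, earliest, latest))   # False: offset 5 was deleted
```

A range check of this kind needs the partition's earliest and latest offsets, which the consumer would have to request from the broker before deciding where to start.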

Is it possible to validate the stored offset and, when it is no longer available, fall back to the earliest or the latest offset depending on whether Offset.Earliest or Offset.Latest is configured, like the auto.offset.reset option in the Kafka Consumer config (https://kafka.apache.org/documentation/)?
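The requested behaviour could look roughly like the following Python sketch. It only illustrates the decision logic of an auto.offset.reset-style policy. `ResetPolicy` and `resolve_start_offset` are hypothetical names, not part of kafka-sharp or of the JVM client, and the earliest and latest offsets are assumed to have been fetched from the broker already.

```python
from enum import Enum
from typing import Optional


class ResetPolicy(Enum):
    """Hypothetical mirror of the auto.offset.reset values."""
    EARLIEST = "earliest"
    LATEST = "latest"
    NONE = "none"


def resolve_start_offset(
    committed: Optional[int],
    earliest: int,
    latest: int,
    policy: ResetPolicy,
) -> int:
    """Pick the offset to start consuming from.

    The committed offset is used only if it exists and is still inside
    the retained range [earliest, latest]. Otherwise the policy decides.
    """
    if committed is not None and earliest <= committed <= latest:
        return committed
    if policy is ResetPolicy.EARLIEST:
        return earliest
    if policy is ResetPolicy.LATEST:
        return latest
    # Policy NONE: surface the problem instead of silently resetting.
    raise ValueError(
        f"offset {committed} is outside [{earliest}, {latest}] and no reset policy is set"
    )


# State from the example above: committed == 5, min == 7, max == 10.
print(resolve_start_offset(5, 7, 10, ResetPolicy.EARLIEST))  # 7
print(resolve_start_offset(5, 7, 10, ResetPolicy.LATEST))    # 10
print(resolve_start_offset(8, 7, 10, ResetPolicy.EARLIEST))  # 8, still valid
```

With this logic, service "B" from the example would resume at offset 7 or 10 instead of waiting on the deleted offset 5.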

Thanks in advance.

Mmm, you're right, it's better to use the same behaviour as the JVM client in this case.

This should be corrected in the latest update.