salebab / phpkafka

PHP extension for Apache Kafka

Problems about offset

siyuanmami opened this issue · comments

@EVODelavega
Hi EVODelavega,
Can I ask a few questions about offsets that I am unsure about when using the phpkafka client:
https://github.com/EVODelavega/phpkafka

  1. Does the phpkafka client store the offset of each topic partition? I tried the function getPartitionsForTopic in Kafka.class.php, but it does not work.
  2. To consume messages by invoking consume($topic, $offset = self::OFFSET_BEGIN, $count = self::OFFSET_END), we must pass the offset ourselves, right?

To consume messages, I store the offset of each topic partition in a file. I wrote a PHP script that reads the offset from the file and invokes consume with that offset as a parameter. Afterwards, it updates the offset in the file by adding the number of messages consumed in this run to the previous offset. I also added a crontab entry on the server that runs the script every minute.

To be more concrete, if the offset in the file is 5 and 20 messages are consumed this time, then I update the offset in that file to 25. I update the offset every time messages are consumed because I do not want to consume the same message twice.
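
The bookkeeping described above could be sketched roughly as follows. This is only an illustration: the helper names readOffset, writeOffset, and advanceOffset are hypothetical and not part of phpkafka, the one-file-per-partition layout is an assumption, and the actual consume call is left out. The write goes through a temporary file and a rename so that a crash mid-write does not leave a half-written offset behind.

```php
<?php
// Hypothetical helpers for file-based offset bookkeeping; not part of phpkafka.

/** Read the stored offset for one topic partition, or $default if none is stored. */
function readOffset(string $file, int $default = 0): int
{
    if (!is_file($file)) {
        return $default;
    }
    $raw = trim((string) file_get_contents($file));
    // Accept only a plain non-negative integer; anything else falls back to $default.
    return ctype_digit($raw) ? (int) $raw : $default;
}

/** Write the offset by writing a temp file and renaming it over the target. */
function writeOffset(string $file, int $offset): void
{
    $tmp = $file . '.tmp';
    if (file_put_contents($tmp, (string) $offset, LOCK_EX) === false) {
        throw new RuntimeException("cannot write $tmp");
    }
    if (!rename($tmp, $file)) {
        throw new RuntimeException("cannot replace $file");
    }
}

/** Advance the stored offset by the number of messages actually consumed. */
function advanceOffset(string $file, int $consumed): int
{
    $next = readOffset($file) + $consumed;
    writeOffset($file, $next);
    return $next;
}
```

With a stored offset of 5, calling advanceOffset($file, 20) stores and returns 25, which matches the example above. Because cron starts the script every minute, a run that takes longer than a minute could overlap with the next one, so some form of locking around the whole read-consume-write cycle would probably be needed as well.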

I am not sure whether my consumer design is a good one. What worries me is that the Kafka broker deletes old log segments according to its retention policy. I did not modify the broker's default configuration, so it will delete any log segment whose modification time is more than 7 days old.

If the broker deletes old logs, I am not sure whether the offset stored in the file will become incorrect.
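
For context, Kafka does not renumber the remaining messages when it deletes old segments, so a stored offset only becomes a problem if it falls behind the earliest offset the broker still retains. A defensive check could look roughly like the sketch below. The function resolveStartOffset is hypothetical, and it assumes the earliest and latest available offsets for the partition have already been obtained somehow; how to query them through phpkafka is not shown here.

```php
<?php
// Hypothetical helper; $earliest and $latest are assumed to come from the broker.

/** Clamp a stored offset into the range the broker can still serve. */
function resolveStartOffset(int $stored, int $earliest, int $latest): int
{
    if ($stored < $earliest) {
        // Messages before $earliest were deleted by retention; skip ahead.
        return $earliest;
    }
    if ($stored > $latest) {
        // The stored value points past the end of the log; fall back to the end.
        return $latest;
    }
    return $stored;
}
```

In the first case the messages between the stored offset and the earliest retained offset are lost to the consumer, so it is worth logging when that branch is taken.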

Have you come across the same problem? Could you give me some suggestions? Is there a better way to record the offset that is not affected by the deletion of old logs?

Thanks!