pipelinedb / pipeline_kafka

PipelineDB extension for Kafka support

Consumer does not automatically reconnect to Kafka

analytik opened this issue

With a 1-node cluster, after Kafka is shut down and brought back up, pipeline_kafka consumers don't reconnect automatically and don't consume any new messages:

2019-04-10 13:38:47.600 UTC,,,39,,5cadd114.27,9,,2019-04-10 11:18:44 UTC,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2019-04-10 13:38:48.856 UTC,,,39,,5cadd114.27,10,,2019-04-10 11:18:44 UTC,,0,LOG,00000,"checkpoint complete: wrote 12 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.217 s, sync=0.007 s, total=1.255 s; sync files=12, longest=0.001 s, average=0.000 s; distance=15 kB, estimate=212 kB",,,,,,,,,""
2019-04-10 13:39:21.585 UTC,,,110,,5cadd422.6e,2,,2019-04-10 11:31:46 UTC,13/0,0,LOG,00000,"[pipeline_kafka] foobar_transactions_stream <- foobar.public.transactions (PID 110): librdkafka error: [thrd:kafka00:9092/bootstrap]: kafka00:9092/0: Disconnected (after 7654878ms in state UP)",,,,,,,,,""
2019-04-10 13:39:21.659 UTC,,,111,,5cadd422.6f,2,,2019-04-10 11:31:46 UTC,11/0,0,LOG,00000,"[pipeline_kafka] foobar_transactions_stream <- foobar.public.transactions (PID 111): librdkafka error: [thrd:kafka00:9092/bootstrap]: kafka00:9092/0: Disconnected (after 7654852ms in state UP)",,,,,,,,,""
2019-04-10 13:39:21.769 UTC,,,109,,5cadd422.6d,2,,2019-04-10 11:31:46 UTC,9/0,0,LOG,00000,"[pipeline_kafka] foobar_transactions_stream <- foobar.public.transactions (PID 109): librdkafka error: [thrd:kafka00:9092/bootstrap]: kafka00:9092/0: Disconnected (after 7654907ms in state UP)",,,,,,,,,""
2019-04-10 13:39:21.901 UTC,,,112,,5cadd422.70,3,,2019-04-10 11:31:46 UTC,14/0,0,LOG,00000,"[pipeline_kafka] foobar_transactions_stream <- foobar.public.transactions (PID 112): librdkafka error: [thrd:kafka00:9092/bootstrap]: kafka00:9092/0: Disconnected (after 7654893ms in state UP)",,,,,,,,,""
2019-04-10 13:41:33.403 UTC,,,110,,5cadd422.6e,3,,2019-04-10 11:31:46 UTC,13/0,0,LOG,00000,"[pipeline_kafka] foobar_transactions_stream <- foobar.public.transactions (PID 110): librdkafka error: [thrd:kafka00:9092/bootstrap]: kafka00:9092/0: Connect to ipv4#10.100.124.90:9092 failed: Connection timed out (after 131788ms in state CONNECT)",,,,,,,,,""
2019-04-10 13:41:33.433 UTC,,,111,,5cadd422.6f,3,,2019-04-10 11:31:46 UTC,11/0,0,LOG,00000,"[pipeline_kafka] foobar_transactions_stream <- foobar.public.transactions (PID 111): librdkafka error: [thrd:kafka00:9092/bootstrap]: kafka00:9092/0: Connect to ipv4#10.100.124.90:9092 failed: Connection timed out (after 131788ms in state CONNECT)",,,,,,,,,""
2019-04-10 13:41:33.597 UTC,,,109,,5cadd422.6d,3,,2019-04-10 11:31:46 UTC,9/0,0,LOG,00000,"[pipeline_kafka] foobar_transactions_stream <- foobar.public.transactions (PID 109): librdkafka error: [thrd:kafka00:9092/bootstrap]: kafka00:9092/0: Connect to ipv4#10.100.124.90:9092 failed: Connection timed out (after 131788ms in state CONNECT)",,,,,,,,,""
2019-04-10 13:41:33.727 UTC,,,112,,5cadd422.70,4,,2019-04-10 11:31:46 UTC,14/0,0,LOG,00000,"[pipeline_kafka] foobar_transactions_stream <- foobar.public.transactions (PID 112): librdkafka error: [thrd:kafka00:9092/bootstrap]: kafka00:9092/0: Connect to ipv4#10.100.124.90:9092 failed: Connection timed out (after 131789ms in state CONNECT)",,,,,,,,,""

What's worse, the last processed offset seems to have been corrupted; partition 4 now shows an offset of -1:

postgres=# SELECT * FROM pipeline_kafka.offsets WHERE consumer_id=3;
 consumer_id | partition | offset
-------------+-----------+--------
           3 |         2 |  15385
           3 |         3 |  15470
           3 |         4 |     -1
           3 |         0 |  15251
           3 |         1 |  15418
(5 rows)
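
For anyone who wants to check their own offsets for the same symptom, here is a minimal sketch in Python. It hard-codes the rows from the query output above and flags partitions whose stored offset is negative. The function name `suspicious_partitions` is hypothetical, and treating any negative offset as "no usable position stored" is an assumption, not documented pipeline_kafka behavior.

```python
# Rows copied from the pipeline_kafka.offsets output above:
# (consumer_id, partition, offset)
ROWS = [
    (3, 2, 15385),
    (3, 3, 15470),
    (3, 4, -1),
    (3, 0, 15251),
    (3, 1, 15418),
]


def suspicious_partitions(rows):
    """Return the sorted partitions whose stored offset is negative.

    Assumption: a negative offset means no usable position is stored
    for that partition. This is a guess based on the output above.
    """
    return sorted(partition for _, partition, offset in rows if offset < 0)


print(suspicious_partitions(ROWS))  # prints [4]
```

In practice the rows would come from the `SELECT` shown above rather than from a hard-coded list.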

Expected behavior: consumers reconnect automatically, without user intervention, and start processing new messages.
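
To illustrate the kind of behavior expected, here is a rough sketch of a reconnect loop with exponential backoff, written in plain Python. It is not how pipeline_kafka or librdkafka is implemented; `reconnect_with_backoff` and the `connect` callable are hypothetical names used only for illustration, and the delay values are arbitrary.

```python
import time


def reconnect_with_backoff(connect, max_attempts=5, base_delay=1.0,
                           max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    `connect` is a hypothetical callable that raises ConnectionError
    while the broker is unreachable and returns a handle once it is up.
    The wait doubles after every failure, capped at max_delay seconds.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting. A consumer built this way would resume on its own once the broker is reachable again, instead of staying disconnected as in the logs above.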