Undesired partition migration happening because of stale metadata
emasab opened this issue
Description
A partition can be migrated using stale metadata while it is validating its next_fetch_start and retrying that validation.
The stale metadata can carry an invalid leader epoch, which makes the partition migrate back to the previous leader. This is corrected later, but it is sub-optimal.
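To illustrate the kind of guard that would avoid this, here is a minimal sketch in C. It assumes that "stale" means the metadata carries a lower leader epoch than the one the partition already knows. The type `partition_leader_t` and the function `partition_leader_update` are hypothetical names made up for this example and are not librdkafka internals or its actual fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of what a partition knows about its
 * leader. Illustrative only, not a librdkafka structure. */
typedef struct {
        int32_t leader_id;    /* broker id of the currently known leader */
        int32_t leader_epoch; /* leader epoch seen with that leader */
} partition_leader_t;

/* Apply a metadata update only if its leader epoch is not older than
 * the one already known (assumption: older epoch == stale metadata).
 * Returns true if the partition migrated to a different leader. */
static bool partition_leader_update(partition_leader_t *cur,
                                    int32_t new_leader_id,
                                    int32_t new_leader_epoch) {
        bool migrated;

        if (new_leader_epoch < cur->leader_epoch)
                return false; /* stale metadata: keep the current leader */

        migrated          = cur->leader_id != new_leader_id;
        cur->leader_id    = new_leader_id;
        cur->leader_epoch = new_leader_epoch;
        return migrated;
}
```

With such a check, an update naming the previous leader with an older epoch is ignored instead of triggering a migration back to it.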
How to reproduce
Execute test 0146/do_test_stale_metadata_doesnt_migrate_partition in #4680.
Checklist
Please provide the following information:
- librdkafka version (2.1.0)
- Apache Kafka version: <REPLACE with e.g., 0.10.2.3>
- librdkafka client configuration: <REPLACE with e.g., message.timeout.ms=123, auto.reset.offset=earliest, ..>
- Operating system: <REPLACE with e.g., Centos 5 (x64)>
- Provide logs (with debug=.. as necessary) from librdkafka
- Provide broker log excerpts
- Critical issue