Doctor Kafka generated invalid reassignment
daywithoutrain opened this issue
We recently adopted DoctorKafka in our production system. However, shortly after we started using it, a broker host ran into some issues, which resulted in many under-replicated partitions. DoctorKafka tried to resolve the under-replicated partitions, but it somehow generated a reassignment that included the following entries (copied from DoctorKafka's log):
{
  "topic": "mydomain.MyTopic1",
  "partition": 40,
  "replicas": [
    65633,
    65633,
    65633
  ]
},
{
  "topic": "mydomain.MyTopic2",
  "partition": 2,
  "replicas": [
    65633,
    65633,
    65633
  ]
},
...
This impacted our production system and delayed the downstream process. We have more than 50 broker hosts, and only 1 host was having trouble. I assume that the above assignment was generated due to a bug; is that correct? Have you seen this issue before? I'd appreciate it if you could look into it ASAP. Thanks!
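For reference, here is a minimal sketch of a sanity check that would flag entries like the ones above. Kafka expects each replica of a partition to live on a distinct broker, so a replica list that repeats a broker ID is invalid. The function name `find_invalid_entries` and the `example.ValidTopic` entry are hypothetical and only for illustration; this is not DoctorKafka code.

```python
import json
from collections import Counter


def find_invalid_entries(partitions):
    """Return (topic, partition, duplicated_broker_ids) for every entry
    whose replica list repeats a broker ID."""
    invalid = []
    for entry in partitions:
        counts = Counter(entry["replicas"])
        duplicates = sorted(b for b, n in counts.items() if n > 1)
        if duplicates:
            invalid.append((entry["topic"], entry["partition"], duplicates))
    return invalid


# The first two entries are copied from the log excerpt above.
# The third is a hypothetical valid entry, included only for contrast.
plan = json.loads("""
[
  {"topic": "mydomain.MyTopic1", "partition": 40, "replicas": [65633, 65633, 65633]},
  {"topic": "mydomain.MyTopic2", "partition": 2, "replicas": [65633, 65633, 65633]},
  {"topic": "example.ValidTopic", "partition": 0, "replicas": [1, 2, 3]}
]
""")

for topic, partition, dupes in find_invalid_entries(plan):
    print(f"{topic}-{partition}: duplicate broker IDs {dupes}")
```

Running a check like this on a generated plan before it is applied would have caught both entries, since each lists broker 65633 three times.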
What is the replication factor for your topics? DoctorKafka has a known issue with handling topics that have replicationFactor = 1. If you set the replication factor > 1, DoctorKafka should handle the topic assignment fine.
There might be a bug somewhere. Can you share the assignment plan generated by DoctorKafka from $your_doctorkafka_site/servlet/actions, along with the related DoctorKafka logs?