pinterest / DoctorK

DoctorK is a service for Kafka cluster auto healing and workload balancing

Doctor Kafka generated invalid reassignment

daywithoutrain opened this issue

We recently adopted DoctorKafka in our production system. Shortly after we started using it, a broker host ran into problems, which resulted in many under-replicated partitions. DoctorKafka tried to resolve the under-replicated partitions, but it somehow generated a reassignment that included the following entries, each listing the same broker ID three times (copied from DoctorKafka's log):

{
    "topic": "mydomain.MyTopic1",
    "partition": 40,
    "replicas": [
        65633,
        65633,
        65633
    ]
},
{
    "topic": "mydomain.MyTopic2",
    "partition": 2,
    "replicas": [
        65633,
        65633,
        65633
    ]
},

...
This impacted our production system and delayed downstream processing. We have more than 50 broker hosts, and only one host was having trouble. I assume the above assignment was generated because of a bug; is that correct? Have you seen this issue before? I'd appreciate it if you could look into it ASAP. Thanks!
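For anyone who wants to catch this kind of plan before it is applied, here is a minimal sketch of a sanity check that flags entries whose replica list repeats a broker ID. It is not part of DoctorKafka. It assumes the plan is wrapped in the usual Kafka reassignment JSON layout (`version` plus a `partitions` array), which the log excerpt above does not show. The function name `find_invalid_assignments` and the `mydomain.MyTopic3` entry with brokers 1, 2, 3 are made up for illustration.

```python
import json


def find_invalid_assignments(plan):
    """Return (topic, partition, replicas) for every entry that repeats a broker ID."""
    invalid = []
    for entry in plan.get("partitions", []):
        replicas = entry["replicas"]
        # A valid replica list names each broker at most once.
        if len(set(replicas)) != len(replicas):
            invalid.append((entry["topic"], entry["partition"], replicas))
    return invalid


# The first two entries are the ones from the log above; the third is a
# hypothetical well-formed entry added only for contrast.
plan = json.loads("""
{
  "version": 1,
  "partitions": [
    {"topic": "mydomain.MyTopic1", "partition": 40, "replicas": [65633, 65633, 65633]},
    {"topic": "mydomain.MyTopic2", "partition": 2, "replicas": [65633, 65633, 65633]},
    {"topic": "mydomain.MyTopic3", "partition": 0, "replicas": [1, 2, 3]}
  ]
}
""")

for topic, partition, replicas in find_invalid_assignments(plan):
    print(f"invalid assignment: {topic}-{partition} -> {replicas}")
```

Run against the entries above, this reports both MyTopic1 partition 40 and MyTopic2 partition 2 and leaves the hypothetical third entry alone.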

What is the replication factor for your topics? DoctorKafka has a known issue w.r.t. handling topics of replicationFactor = 1. If you set replication factor > 1, DoctorKafka should handle the topic assignment fine.

There might be a bug somewhere. Can you share the assignment plan generated by DoctorKafka from $your_doctorkafka_site/servlet/actions, along with the related DoctorKafka logs?