apache / pulsar

Apache Pulsar - distributed pub-sub messaging system

Home Page: https://pulsar.apache.org/

[Bug] Broker healthcheck runs into a loop after decommissioning a cluster of bookies

wallacepeng opened this issue · comments

Search before asking

  • I searched in the issues and found nothing similar.

Read release policy

  • I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

Version

Pulsar 2.10.5

Minimal reproduce step

  1. Set up two BookKeeper clusters using Helm charts, named bookkeeper and bookkeeper1.
  2. Make the bookkeeper cluster read-only.
  3. Decommission the bookkeeper cluster down to zero replicas (we are on Kubernetes, so we scale down one node, let autorecovery re-replicate the ledgers, and repeat).
  4. Restart the brokers.
  5. The brokers run into a loop on the health check.
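For anyone trying to reproduce step 5, the sketch below probes a broker's health check from outside with a timeout, so a check that never completes shows up as a failure instead of hanging the caller. It assumes the broker admin REST endpoint /admin/v2/brokers/health and that a healthy broker answers with the body ok; the function name and the opener parameter are made up for this example.

```python
import urllib.error
import urllib.request


def check_broker_health(base_url, timeout_s=10.0, opener=urllib.request.urlopen):
    """Return (healthy, detail) for one broker, using a socket timeout of timeout_s."""
    # Assumption: the broker exposes its health check at this admin REST path.
    url = base_url.rstrip("/") + "/admin/v2/brokers/health"
    try:
        with opener(url, timeout=timeout_s) as resp:
            body = resp.read().decode("utf-8", errors="replace").strip()
            # Assumption: a healthy broker replies 200 with the body "ok".
            return (resp.status == 200 and body == "ok"), body
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False, f"error: {exc}"
```

Calling check_broker_health("http://localhost:8080") against each broker after the restart (the URL is a placeholder) should make it easy to tell a broker that answers from one whose health check is stuck.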

What did you expect to see?

The broker health check should continue to work after the old bookies are decommissioned and the brokers are restarted.

What did you see instead?

The broker health check ran into a loop.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Since 2.10 is no longer supported, could you please check whether this also occurs in newer versions?
For details see

@hpvd We are downgrading the storage, so we provisioned a new bookkeeper cluster. We will upgrade the cluster a bit later. Is there any way to clean up the healthcheck topic? It looks like its ledger has cached the old bookies.
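To make the suspicion concrete: if a ledger's ensemble still lists bookies that no longer exist, anything that needs that ledger cannot reach them. The sketch below is only an illustration of that check on plain data. The function name, the input shapes (ledger id mapped to a list of bookie addresses, plus a set of live bookies), and the example addresses are all hypothetical; how the data is collected is left open.

```python
def ledgers_referencing_missing_bookies(ledger_ensembles, live_bookies):
    """Map each ledger id to the bookies in its ensemble that are no longer live.

    ledger_ensembles: dict of ledger id -> list of bookie addresses (hypothetical shape)
    live_bookies: set of bookie addresses currently registered in the cluster
    """
    stale = {}
    for ledger_id, ensemble in ledger_ensembles.items():
        missing = [bookie for bookie in ensemble if bookie not in live_bookies]
        if missing:
            stale[ledger_id] = missing
    return stale


# Example with made-up addresses: ledger 7 still points at a decommissioned bookie.
example = ledgers_referencing_missing_bookies(
    {7: ["bookkeeper-0:3181", "bookkeeper1-0:3181"], 8: ["bookkeeper1-0:3181"]},
    {"bookkeeper1-0:3181", "bookkeeper1-1:3181"},
)
```

In this made-up example only ledger 7 is reported, because one of its bookies belongs to the cluster that was scaled down to zero.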

I finally fixed it. I had to set up another broker cluster, clean up the namespace, managed ledgers, and schemas, and then restore the old broker cluster. That fixed the issue.