[Bug] Broker health check goes into a loop after decommissioning a cluster of bookies
wallacepeng opened this issue · comments
Search before asking
- I searched in the issues and found nothing similar.
Read release policy
- I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.
Version
Pulsar 2.10.5
Minimal reproduce step
- set up two bookkeeper clusters, bookkeeper and bookkeeper1, using helm charts
- make bookkeeper read-only
- decommission bookkeeper down to zero replicas (we are on Kubernetes, so we scale down a node and autorecovery re-replicates its ledgers)
- restart brokers.
- the broker health check goes into a loop
What did you expect to see?
broker health check should continue to work
What did you see instead?
the broker health check goes into a loop
Anything else?
No response
Are you willing to submit a PR?
- I'm willing to submit a PR!
Since 2.10 is no longer supported, can you please check whether this also occurs in newer versions?
For details see
- supported versions: https://pulsar.apache.org/contribute/release-policy/
- latest pulsar helm chart: https://github.com/apache/pulsar-helm-chart/releases
@hpvd we are downgrading the storage, so we provisioned another bookkeeper cluster. We will upgrade the cluster a bit later. Is there any way to clean up the healthcheck topic? It looks like its ledgers have cached the old bookies.
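One way to check the suspicion that the healthcheck topic's ledgers still point at the old bookies is to compare each ledger's ensemble with the bookies that are currently registered. Below is a minimal Python sketch, assuming the ensemble lists and the live bookie list have already been collected by hand (for example from `bookkeeper shell ledgermetadata` and `bookkeeper shell listbookies`). The function name and the sample ledger ids and addresses are hypothetical and are not taken from the cluster in this issue.

```python
from typing import Dict, Iterable, List, Set


def ledgers_with_missing_bookies(
    ledger_ensembles: Dict[int, Iterable[str]],
    live_bookies: Iterable[str],
) -> Dict[int, List[str]]:
    """Map each ledger id to the ensemble members that are not in live_bookies."""
    live: Set[str] = set(live_bookies)
    stale: Dict[int, List[str]] = {}
    for ledger_id, ensemble in ledger_ensembles.items():
        # Bookies listed in the ledger metadata but no longer registered.
        missing = sorted(set(ensemble) - live)
        if missing:
            stale[ledger_id] = missing
    return stale


# Hypothetical sample data for illustration only.
ensembles = {
    101: ["bookkeeper-0:3181", "bookkeeper-1:3181"],
    102: ["bookkeeper1-0:3181", "bookkeeper1-1:3181"],
}
live = ["bookkeeper1-0:3181", "bookkeeper1-1:3181"]

print(ledgers_with_missing_bookies(ensembles, live))
```

Any ledger reported here would still reference a decommissioned bookie, which would be consistent with the health check being unable to make progress on that topic. This only narrows down the cause; it does not clean anything up.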
I finally fixed it. I had to set up another broker cluster, clean up the namespace, managed ledgers, and schemas, and then restore the old broker cluster. That fixed the issue.