[BUG] Race condition in reading scale-up annotation
ishan16696 opened this issue · comments
Describe the bug:
When a cluster is marked for scale-up, etcd-druid adds the annotation ScaledToMultiNodeAnnotationKey to the etcd StatefulSet to indicate a scale-up scenario. After the scale-up succeeds, etcd-druid may take some time to remove this annotation. Meanwhile, the 0th pod (etcd-0) gets restarted, and it has been observed that it sometimes reads the ScaledToMultiNodeAnnotationKey annotation before etcd-druid removes it.
This is a race condition between druid removing the scale-up annotation and the pod restarting just after scale-up.
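To make the ordering problem concrete, here is a minimal, self-contained Go sketch of the two interleavings. It is only an illustration, not the actual etcd-druid or backup-restore code: the `statefulSet` type, the functions `detectsScaleUp` and `removeScaleUpAnnotation`, and the annotation key string are all hypothetical stand-ins, and the sketch assumes the restarting pod decides purely on whether the annotation is present.

```go
package main

import "fmt"

// Hypothetical placeholder: the real value of ScaledToMultiNodeAnnotationKey
// is defined in etcd-druid and is not reproduced here.
const scaledToMultiNodeAnnotationKey = "example.invalid/scaled-to-multi-node"

// statefulSet is a stand-in that only models the annotations of the etcd StatefulSet.
type statefulSet struct {
	annotations map[string]string
}

// detectsScaleUp mimics the check assumed to run when a pod (re)starts:
// the scenario is treated as a scale-up if the annotation is present.
func detectsScaleUp(sts statefulSet) bool {
	_, ok := sts.annotations[scaledToMultiNodeAnnotationKey]
	return ok
}

// removeScaleUpAnnotation mimics druid's cleanup after a successful scale-up.
func removeScaleUpAnnotation(sts statefulSet) {
	delete(sts.annotations, scaledToMultiNodeAnnotationKey)
}

func main() {
	// Scale-up has already succeeded, but the annotation is still present.
	sts := statefulSet{annotations: map[string]string{scaledToMultiNodeAnnotationKey: ""}}

	// Interleaving 1: the pod restarts before druid's cleanup (false positive).
	beforeCleanup := detectsScaleUp(sts)

	// Interleaving 2: the pod restarts after druid's cleanup (expected result).
	removeScaleUpAnnotation(sts)
	afterCleanup := detectsScaleUp(sts)

	fmt.Printf("restart before cleanup: scale-up detected = %v\n", beforeCleanup)
	fmt.Printf("restart after cleanup:  scale-up detected = %v\n", afterCleanup)
}
```

In this model the annotation alone cannot distinguish "scale-up in progress" from "scale-up finished, cleanup pending", which is the window in which the false positive occurs.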
Expected behavior:
It shouldn’t detect a false positive for the scale-up scenario.
How To Reproduce (as minimally and precisely as possible):
Logs:
Anything else we need to know?:
This is not a big issue, though: it only leads to a single-member restoration without any downtime, which is why the druid e2e tests still pass.