[BUG] Race condition in reading scale-up annotation

ishan16696 opened this issue a year ago · comments

Describe the bug:
When a cluster is marked for scale-up, etcd-druid adds the ScaledToMultiNodeAnnotationKey annotation to the etcd StatefulSet to indicate a scale-up scenario. After the scale-up succeeds, etcd-druid may take some time to remove this annotation. Meanwhile, the 0th pod (etcd-0) gets restarted, and it has been observed that it sometimes reads the ScaledToMultiNodeAnnotationKey annotation before etcd-druid removes it.
This is a race condition between etcd-druid removing the scale-up annotation and the pod restarting just after the scale-up.

Expected behavior:
A restarted pod should not falsely detect a scale-up scenario.
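
One possible guard, sketched here only to illustrate the expected behavior: do not trust the annotation alone, and also check whether the restarting pod is already a member of the etcd cluster. This is a minimal sketch under assumptions, not how etcd-druid or its sidecar implements the check. The function name isScaleUp, the member-list parameter, and the annotation key value are hypothetical.

```go
package main

import "fmt"

// Hypothetical stand-in for the annotation key. The real value behind
// ScaledToMultiNodeAnnotationKey is defined in etcd-druid and is not shown here.
const scaledToMultiNodeAnnotationKey = "example.com/scaled-to-multi-node"

// isScaleUp reports whether a starting pod should treat itself as a new member
// joining during scale-up. The annotation alone is not trusted: if the pod is
// already listed as a cluster member, the annotation is treated as stale.
func isScaleUp(stsAnnotations map[string]string, podName string, clusterMembers []string) bool {
	if _, ok := stsAnnotations[scaledToMultiNodeAnnotationKey]; !ok {
		return false // no annotation, so no scale-up in progress
	}
	for _, member := range clusterMembers {
		if member == podName {
			return false // already a member: stale annotation, not a scale-up
		}
	}
	return true
}

func main() {
	// The annotation is still present because druid has not removed it yet.
	annotations := map[string]string{scaledToMultiNodeAnnotationKey: ""}
	members := []string{"etcd-0", "etcd-1"}

	fmt.Println(isScaleUp(annotations, "etcd-0", members)) // restarted existing member
	fmt.Println(isScaleUp(annotations, "etcd-2", members)) // genuinely new member
}
```

With a check of this kind, a restarted etcd-0 that still sees the annotation would not be treated as a scale-up, while a pod that is not yet in the member list would.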

How To Reproduce (as minimally and precisely as possible):

Logs:

Anything else we need to know?:
This is not a big issue, as it only leads to a single-member restoration without any downtime, which is also why the druid e2e tests pass without any issue.