Follow-up work for #12413 and related issues: ensure pod finalizers get removed
imliuda opened this issue
A feature was recently introduced to solve the 'pod deleted' issue, which is caused by the kube-controller-manager garbage collection (GC) controller. However, there is still a risk that the finalizer cannot be removed (or is not removed in time) in some cases, for example:
- workflows are deleted manually while the controller is not running
- the workflow-controller restarts due to a failed health check
- API rate limiting
Here are some suggestions to avoid that situation:
- use a workflow finalizer
- use a cron job to periodically remove the finalizer from finished pods
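A minimal sketch of what one pass of the periodic cleanup could look like. It is plain Go with no Kubernetes client: `podInfo`, `sweep`, and the finalizer name `example.com/pod-status` are hypothetical names used only for illustration, and a real implementation would list pods and patch them through the Kubernetes API instead of mutating an in-memory slice.

```go
package main

import "fmt"

// podInfo is a hypothetical, trimmed-down view of a pod. A real implementation
// would read these fields from the Kubernetes Pod object.
type podInfo struct {
	Name       string
	Phase      string // Kubernetes pod phase, e.g. "Running", "Succeeded", "Failed"
	Finalizers []string
}

// isFinished reports whether the pod has reached a terminal phase.
func isFinished(p podInfo) bool {
	return p.Phase == "Succeeded" || p.Phase == "Failed"
}

// withoutFinalizer returns a copy of finalizers with every occurrence of name
// dropped, plus a flag saying whether anything was dropped.
func withoutFinalizer(finalizers []string, name string) ([]string, bool) {
	kept := make([]string, 0, len(finalizers))
	removed := false
	for _, f := range finalizers {
		if f == name {
			removed = true
			continue
		}
		kept = append(kept, f)
	}
	return kept, removed
}

// sweep performs one cleanup pass: for every finished pod that still carries
// the finalizer, it removes the finalizer and records the pod name.
func sweep(pods []podInfo, finalizer string) []string {
	var cleaned []string
	for i := range pods {
		if !isFinished(pods[i]) {
			continue // never touch pods that are still running
		}
		kept, removed := withoutFinalizer(pods[i].Finalizers, finalizer)
		if removed {
			pods[i].Finalizers = kept
			cleaned = append(cleaned, pods[i].Name)
		}
	}
	return cleaned
}

func main() {
	const finalizer = "example.com/pod-status" // hypothetical finalizer name
	pods := []podInfo{
		{Name: "step-a", Phase: "Succeeded", Finalizers: []string{finalizer}},
		{Name: "step-b", Phase: "Running", Finalizers: []string{finalizer}},
		{Name: "step-c", Phase: "Failed", Finalizers: []string{"other", finalizer}},
	}
	fmt.Println(sweep(pods, finalizer))
}
```

The pass is idempotent, so running it on a timer is safe even if the controller restarts in the middle of a sweep. It does not check whether the controller has already recorded the pod's status, which a real version would probably need to do before removing the finalizer.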
However, neither approach handles API rate limiting, so an API priority mechanism may need to be introduced. There may also be other cases and solutions that I have not covered.
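One way to picture a priority mechanism on the client side is a two-level queue in which finalizer removals are always sent before other requests, so they are the last thing to be starved when the request budget is tight. The sketch below is only an illustration: `nextRequest` and the request names are hypothetical, and this is not the server-side API Priority and Fairness feature of Kubernetes.

```go
package main

import "fmt"

// nextRequest picks the next request to send without blocking.
// Anything waiting in high is always taken before anything in low.
// It returns false when both queues are empty.
func nextRequest(high, low <-chan string) (string, bool) {
	// First look only at the high-priority queue.
	select {
	case r := <-high:
		return r, true
	default:
	}
	// Fall back to the low-priority queue.
	select {
	case r := <-low:
		return r, true
	default:
		return "", false
	}
}

func main() {
	high := make(chan string, 4) // e.g. finalizer removals
	low := make(chan string, 4)  // e.g. routine list or update calls

	low <- "list-pods"
	low <- "update-workflow"
	high <- "remove-finalizer step-a"

	// Drain both queues in priority order.
	for {
		r, ok := nextRequest(high, low)
		if !ok {
			break
		}
		fmt.Println(r)
	}
}
```

In a real controller the dispatcher would sit behind the existing rate limiter, so the limiter still caps total throughput and the queue only decides which request is sent next.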
There may also be options that do not use a finalizer at all. Since the problem is caused by the GC controller, we could change it to sort pods by finish time. Alternatively, we could have the wait container sleep for a while instead of exiting immediately; once the workflow-controller has captured the exit status of the main container, we kill the wait container.