[BUG] PUB webhooks unexpectedly return an error when the PUB is not found
Spground opened this issue
What happened:
The PUB (PodUnavailableBudget) webhook may unexpectedly interrupt Pod gc issued by KCM (kube-controller-manager), which can lead to a Pod leak if KCM gc does not retry, or retries only many hours later.
What you expected to happen:
PUB webhooks should never interrupt Pod gc.
How to reproduce it (as minimally and precisely as possible):
Delete a workload, say a StatefulSet or CloneSet; its Pods will then be deleted by KCM gc later. Sometimes a Pod that should be deleted is left leaking for a long time.
Anything else we need to know?:
The root cause is that we return an error inside RetryOnConflict when the PUB CR has already been deleted. The related code is here,
The solution is simple: check the error type, and ignore the error if it is a NotFound error.
Environment:
- Kruise version:
- Kubernetes version (use kubectl version):
- Install details (e.g. helm install args):
- Others: