`cloud-provider-config` Secret is not updated on Shoot deletion -> deadlock on Shoot deletion
ialidzhikov opened this issue
How to categorize this issue?
/area control-plane
/kind bug
/platform azure
What happened:
The `cloud-provider-config` Secret holds the Azure credentials for cloud-controller-manager (CCM). Currently this Secret is created/updated only on ControlPlane reconciliation.
This leads to the following deadlock when a hibernated Shoot is deleted:

1. A Shoot with invalid credentials gets deleted.
2. As the Shoot is hibernated, the deletion fails to destroy the ControlPlane with reason:

   `task "Waiting until shoot control plane has been destroyed" failed: Failed to delete ControlPlane shoot--foo--test/test: Error deleting ControlPlane: error while waiting for managed resource containing shoot chart for controlplane 'shoot--foo--test/test' to be deleted: error while waiting for all resources to be deleted: retry failed with context deadline exceeded, last error: resource shoot--foo--test/extension-controlplane-shoot still exists: Could not clean all old resources: 2 errors occurred: [deletion of old resource "v1/Service/kube-system/allow-tcp-egress" is still pending, deletion of old resource "v1/Service/kube-system/allow-udp-egress" is still pending]`

   CCM is in `CrashLoopBackOff` due to the invalid credentials, hence it cannot delete the `allow-tcp-egress` and `allow-udp-egress` Services.
3. The Shoot owner updates the credentials with valid ones.
4. The deletion continues to fail with the error from step 2. The `cloud-provider-config` Secret never gets updated.
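
To make the deadlock easier to reason about, here is a toy model of the steps above. It is not Gardener or extension code: `Shoot`, `sync_secret`, `delete_control_plane`, and the `refresh_secret_on_delete` flag are hypothetical names introduced only for illustration. It assumes that CCM can remove the two Services only when the `cloud-provider-config` Secret holds valid credentials.

```python
from dataclasses import dataclass


@dataclass
class Shoot:
    """Toy stand-in for a hibernated Shoot that is being deleted."""
    owner_credentials_valid: bool    # what the Shoot owner currently supplies
    secret_credentials_valid: bool   # what the cloud-provider-config Secret holds
    services_pending: int = 2        # allow-tcp-egress and allow-udp-egress


def sync_secret(shoot: Shoot) -> None:
    """Copy the owner's current credentials into the Secret used by CCM."""
    shoot.secret_credentials_valid = shoot.owner_credentials_valid


def delete_control_plane(shoot: Shoot, refresh_secret_on_delete: bool) -> bool:
    """Run one deletion attempt and return True if it succeeds."""
    if refresh_secret_on_delete:
        sync_secret(shoot)
    # With invalid credentials in the Secret, CCM crash-loops and the
    # Services stay pending, so the ControlPlane deletion times out.
    if shoot.secret_credentials_valid:
        shoot.services_pending = 0
    return shoot.services_pending == 0


def run_scenario(refresh_secret_on_delete: bool) -> list:
    """Replay steps 1-4 and return the result of both deletion attempts."""
    shoot = Shoot(owner_credentials_valid=False, secret_credentials_valid=False)
    results = [delete_control_plane(shoot, refresh_secret_on_delete)]  # steps 1-2
    shoot.owner_credentials_valid = True                               # step 3
    results.append(delete_control_plane(shoot, refresh_secret_on_delete))  # step 4
    return results
```

In this model, `run_scenario(False)` fails on both attempts, which mirrors the behavior reported here, while `run_scenario(True)` succeeds on the second attempt once the owner has fixed the credentials. Whether refreshing the Secret in the delete flow is the right fix for the actual extension is left open.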
What you expected to happen:
Deletion of a hibernated Shoot succeeds once the credentials are updated with valid ones.
How to reproduce it (as minimally and precisely as possible):
See above.
Anything else we need to know?:
N/A
Environment:
- Gardener version (if relevant): v1.32.0
- Extension version:
- Kubernetes version (use `kubectl version`):
- Cloud provider or hardware configuration:
- Others: