SAP / sap-btp-service-operator

SAP BTP service operator enables developers to connect Kubernetes clusters to SAP BTP accounts and to consume SAP BTP services within the clusters by using Kubernetes native tools.


Service instances and bindings leak after owner is deleted

georgethebeatle opened this issue · comments

What we are trying to do

We are building an operator that reconciles our own CRD into a service instance with a binding. In our controllers we set ownership from our object (the owner) to both the service instance and the service binding objects. This way we should be able to delete the owner and get rid of everything with a single operation.

Expected behaviour

When we delete the owner, we expect both the service instance and its binding to be deleted as well.

Actual behaviour

After deleting the owner:

  • Both the service instance and its binding are still around
  • The service binding is still created and ready
  • The service binding has neither a deletion timestamp nor a finalizer, but it has an owner reference indicating that it is owned by the service instance
  • The service instance is in DeletionFailed state, because it still has one binding
  • The service instance has its deletion timestamp set and a services.cloud.sap.com/sap-btp-finalizer finalizer

It looks like the controller reference from the service instance to its binding keeps the binding from ever becoming unreferenced, so its deletion is never attempted. The only workaround would be to delete both objects manually from a finalizer in our controller instead of simply setting ownership, which adds unneeded complexity to other controllers using the BTP operator.
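The observations above can be checked with plain kubectl queries. The following is a minimal sketch, not part of the original report: the helper names show_owners and show_deletion_state are hypothetical, and the KUBECTL variable is an assumption added so the command can be swapped out.

```shell
# KUBECTL is an assumption of this sketch: it lets the command be replaced
# (for example by a wrapper); it defaults to plain kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# show_owners NAMESPACE RESOURCE NAME  (hypothetical helper)
# Prints one "Kind/name" line per owner reference on the object.
show_owners() {
  "$KUBECTL" get --namespace "$1" "$2" "$3" \
    -o jsonpath='{range .metadata.ownerReferences[*]}{.kind}{"/"}{.name}{"\n"}{end}'
}

# show_deletion_state NAMESPACE RESOURCE NAME  (hypothetical helper)
# Prints the deletion timestamp (empty if unset) followed by the finalizers.
show_deletion_state() {
  "$KUBECTL" get --namespace "$1" "$2" "$3" \
    -o jsonpath='{.metadata.deletionTimestamp}{" "}{.metadata.finalizers}{"\n"}'
}
```

With the resource names from the reproduction script, this could be called as show_owners test servicebindings.services.cloud.sap.com my-service-binding and show_deletion_state test serviceinstances.services.cloud.sap.com my-service-instance.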

Additional Details

  • We are using BTP operator v0.5.3
  • Here is a script that reproduces this behaviour:
#!/bin/bash

set -euo pipefail

kubectl apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: test
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: test
  name: owner
EOF

OWNER_UID="$(kubectl get -n test configmaps owner -o=jsonpath='{.metadata.uid}')"

kubectl apply -f - <<EOF
apiVersion: services.cloud.sap.com/v1
kind: ServiceInstance
metadata:
  namespace: test
  name: my-service-instance
  ownerReferences:
  - apiVersion: v1
    kind: ConfigMap
    name: owner
    uid: $OWNER_UID
spec:
  serviceOfferingName: xsuaa
  servicePlanName: application
---
apiVersion: services.cloud.sap.com/v1
kind: ServiceBinding
metadata:
  namespace: test
  name: my-service-binding
  ownerReferences:
  - apiVersion: v1
    kind: ConfigMap
    name: owner
    uid: $OWNER_UID
spec:
  secretName: my-secret
  serviceInstanceName: my-service-instance
EOF

kubectl wait --namespace test serviceinstances.services.cloud.sap.com/my-service-instance --for=condition=ready
kubectl wait --namespace test servicebindings.services.cloud.sap.com/my-service-binding --for=condition=ready

kubectl delete --namespace test configmaps owner

Service Manager blocks the deletion of an instance if it has bindings; this is not related to the operator. Since the instance is not deleted, the binding does not get a deletion timestamp.
The order of cleanup should be to delete the bindings first; only once all bindings are deleted can the instance be deleted.
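The cleanup order described here can be sketched as a small shell function. This is an illustration under assumptions, not something the operator ships: the function name delete_instance_with_bindings and the KUBECTL variable are hypothetical, and bindings are matched to the instance through spec.serviceInstanceName as used in the manifests in this thread.

```shell
# KUBECTL is an assumption of this sketch; it defaults to plain kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# delete_instance_with_bindings NAMESPACE INSTANCE  (hypothetical helper)
# Deletes every ServiceBinding that points at INSTANCE, then the instance.
delete_instance_with_bindings() {
  local ns="$1" instance="$2" bindings binding

  # Names of all bindings whose spec.serviceInstanceName equals INSTANCE,
  # returned by kubectl as a space-separated list.
  bindings="$("$KUBECTL" get --namespace "$ns" servicebindings.services.cloud.sap.com \
    -o jsonpath="{.items[?(@.spec.serviceInstanceName==\"$instance\")].metadata.name}")"

  # kubectl delete waits by default until the object is gone (finalizers
  # included), so each binding is removed before the next step runs.
  for binding in $bindings; do
    "$KUBECTL" delete --namespace "$ns" servicebindings.services.cloud.sap.com "$binding"
  done

  # With no bindings left, the instance deletion is no longer blocked.
  "$KUBECTL" delete --namespace "$ns" serviceinstances.services.cloud.sap.com "$instance"
}
```

For the resources from the reproduction script, the call would be delete_instance_with_bindings test my-service-instance.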

Hey @kerenlahav,

This is our observation as well.

However, in the Kubernetes world, custom resources are supposed to work in a declarative rather than imperative manner. That means kubectl users should be able to delete resources in any order, and the system should eventually reconcile itself into the desired state.

To give an example, imagine you have a namespace and a pod in it. It would be weird not to be able to delete the namespace before deleting the pod.

Our prototype enforces the required deletion order with a finalizer, but this is inconvenient because every BTP operator user would have to do the same. Instead, we believe that this logic should be implemented in the operator itself.

You can use foreground deletion (by default, deletion runs in the background); see the k8s documentation.
If you add the foreground flag, the bindings will be marked for deletion as well, before the instance itself is deleted.

you can use foreground deletion

We tried that, it does not work.

Here is the dependencies graph:

owner --owns--> service instance --owns--> service binding

If you delete the owner with foreground deletion, k8s tries to delete the service instance, and that fails because a service binding exists.

We tried making the owner own the binding as well (via an owner reference):


owner --owns--> service instance --owns--> service binding
     \__owns_____________________________/

But then deleting the owner does not delete the binding, because another owner of the binding (the service instance) still exists. Keep in mind that the ownership from the instance to the binding is set by the BTP controller and is out of our control.

I tested it and it works.
Your script has two issues: first, you set an owner on the binding, which should not be done because the operator already does it; second, you are not using --cascade=foreground.
Try running this script:

#!/bin/bash

set -euo pipefail

kubectl apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: test
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: test
  name: owner
EOF

OWNER_UID="$(kubectl get -n test configmaps owner -o=jsonpath='{.metadata.uid}')"

kubectl apply -f - <<EOF
apiVersion: services.cloud.sap.com/v1
kind: ServiceInstance
metadata:
  namespace: test
  name: my-service-instance-1
  ownerReferences:
  - apiVersion: v1
    kind: ConfigMap
    name: owner
    uid: $OWNER_UID
spec:
  serviceOfferingName: service-manager
  servicePlanName: subaccount-audit
---
apiVersion: services.cloud.sap.com/v1
kind: ServiceBinding
metadata:
  namespace: test
  name: my-service-binding-1
spec:
  secretName: my-secret
  serviceInstanceName: my-service-instance-1
EOF

kubectl wait --namespace test serviceinstances.services.cloud.sap.com/my-service-instance-1 --for=condition=ready
kubectl wait --namespace test servicebindings.services.cloud.sap.com/my-service-binding-1 --for=condition=ready

kubectl delete --namespace test configmaps owner --cascade=foreground

We tested as well, and it worked with your version of the script. I am certain that we tried this before, but we could have missed something. This solution is not ideal for us, but I guess it works. Closing for now.