upmc-enterprises / elasticsearch-operator

manages elasticsearch clusters

operator-sysctl daemonset not cleaned up when operator removed

hawksight opened this issue

When removing all resources related to the example deployment on GKE (and I suspect on other backends), the operator does not clean up the elasticsearch-operator-sysctl DaemonSet or its pods.
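
(For anyone reproducing: the removal amounted to deleting the example cluster resource and the operator itself, roughly as below; the resource names are illustrative rather than exact.)

# k delete elasticsearchcluster example-es-cluster    # illustrative CR name, not exact
# k delete deployment elasticsearch-operator          # assumes the operator ran as a Deployment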

For example, in a small two-node cluster, this was my state after deletion:

# k get ns
NAME          STATUS    AGE
default       Active    3h
kube-public   Active    3h
kube-system   Active    3h

# k get pods
NAME                                  READY     STATUS        RESTARTS   AGE
elasticsearch-operator-sysctl-lgb8r   1/1       Running       0          2h
elasticsearch-operator-sysctl-lsvtc   1/1       Running       0          2h

# k describe pod elasticsearch-operator-sysctl-lgb8r
Name:           elasticsearch-operator-sysctl-lgb8r
Namespace:      default
Node:           gke-vpc-du-es-vpc-du-es-np-2-87dc503d-1k16/10.10.0.3
Start Time:     Thu, 22 Nov 2018 13:40:49 +0000
Labels:         controller-revision-hash=4034564464
                k8s-app=elasticsearch-operator
                pod-template-generation=1
Annotations:    <none>
Status:         Running
IP:             10.60.2.5
Controlled By:  DaemonSet/elasticsearch-operator-sysctl
Containers:
  sysctl-conf:
    Container ID:  docker://261e2940d1ca81267aa24d4c3a9c1f1c31b1b542620bad561f681cae1436001c
    Image:         busybox:1.26.2
    Image ID:      docker-pullable://busybox@sha256:be3c11fdba7cfe299214e46edc642e09514dbb9bbefcd0d3836c05a1e0cd0642
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      sysctl -w vm.max_map_count=262166 && while true; do sleep 86400; done
    State:          Running
      Started:      Thu, 22 Nov 2018 13:40:53 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     10m
      memory:  50Mi
    Requests:
      cpu:        10m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z2sbn (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  default-token-z2sbn:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-z2sbn
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:          <none>

# k get daemonset
NAME                            DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
elasticsearch-operator-sysctl   2         2         2         2            2           <none>          2h

# k get customresourcedefinition
NAME                                         AGE
elasticsearchclusters.enterprises.upmc.com   2h

Also note that the CustomResourceDefinition elasticsearchclusters.enterprises.upmc.com is left over as well. I would have assumed that when the operator is removed, it would attempt to clean these up?
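
In the meantime, the leftovers can be removed by hand; a minimal cleanup sketch using the resource names shown above:

# k delete daemonset elasticsearch-operator-sysctl -n default
# k delete customresourcedefinition elasticsearchclusters.enterprises.upmc.com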

I'm wondering whether this is related to the fact that 0.2.0 still deploys the sysctl DaemonSet to the default namespace rather than the operator's namespace.

I think this issue is potentially the cause of #165.

Having read #198, #220, and #231, I can see that the namespace for the sysctl DaemonSet is configurable, so I will try moving it and see whether I get the same result.
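
A quick way to check where the DaemonSet actually lands after changing that:

# k get daemonset --all-namespaces | grep sysctl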

I have noticed this in the pod YAML, though (blockOwnerDeletion):

  ownerReferences:
  - apiVersion: extensions/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: DaemonSet
    name: elasticsearch-operator-sysctl
    uid: 38e83bed-ee5c-11e8-b3f2-42010a800067

Would that prevent these pods from ever being cleaned up when the operator is removed?
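
My current understanding (happy to be corrected): blockOwnerDeletion only blocks deletion of the owner while a foreground cascading delete is in progress; it should not stop the pods from being garbage-collected once the owning DaemonSet itself is deleted. So the lingering pods look like a symptom of the DaemonSet never being deleted, rather than of this field. In other words, I'd expect this to clean them up:

# k delete daemonset elasticsearch-operator-sysctl    # default cascading delete; pods should be GC'd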