[BUG] When a node is lost, its pods can't recover
SlavikCA opened this issue · comments
To Reproduce
Steps to reproduce the behavior:
- I'm running Harvester 1.3.0 on two management/worker nodes: T7920 and T7820i (Xeon 16 cores, 64 GB+ RAM each), plus a witness node.
- Cordon one node (T7820i in my case).
- Shut down the node (T7820i, via shutdown now in the CLI).
Expected behavior
I expect that the pods which were running on T7820i will now be rescheduled onto the other node, T7920.
What actually happens
The pods are not rescheduled onto the other node.
For example, I had a GitLab deployment running on T7820i.
The pod uses storage: a few PVCs on Longhorn, and every PV has 2 replicas.
When the node was shut down, here is what I see:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
gitlab gitlab-web-7d48cc6d59-l8vbv 1/1 Terminating 0 155m 10.52.4.56 t7820i <none> <none>
It is stuck in the Terminating state.
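When the owning node is unreachable, the kubelet can never confirm termination, so such pods typically stay in Terminating until removed by hand. A common last-resort sketch (only safe once you're sure the node is really down) is to force-delete the stuck pod so its volume attachments can eventually be released; the pod name is the one from the output above:

```shell
# Remove the pod object from the API server without waiting for the
# (unreachable) kubelet to confirm shutdown. Use with care: if the node
# is actually alive, two writers may end up on the same RWO volume.
kubectl delete pod gitlab-web-7d48cc6d59-l8vbv -n gitlab --grace-period=0 --force
```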
At the same time, on another node I see this:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
gitlab gitlab-web-7d48cc6d59-g42tc 0/1 ContainerCreating 0 8m3s <none> t7920 <none> <none>
k describe pod -n gitlab gitlab-web-7d48cc6d59-g42tc
Name: gitlab-web-7d48cc6d59-g42tc
Namespace: gitlab
Priority: 0
Service Account: default
Node: t7920/10.0.4.144
Start Time: Sat, 18 May 2024 01:15:25 +0000
Labels: app=gitlab-web
pod-template-hash=7d48cc6d59
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/gitlab-web-7d48cc6d59
Containers:
gitlab-web:
Container ID:
Image: gitlab/gitlab-ce:16.11.2-ce.0
Image ID:
Ports: 80/TCP, 22/TCP, 9999/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
cpu: 3
memory: 12Gi
Requests:
cpu: 500m
memory: 6Gi
Environment:
GITLAB_OMNIBUS_CONFIG: <set to the key 'gitlab.rb' of config map 'gitlab-config'> Optional: false
Mounts:
/etc/gitlab from gitlab-config (rw)
/mnt/dependency_proxy from gitlab-containers (rw)
/var/log/gitlab from gitlab-logs (rw)
/var/opt/gitlab from gitlab-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9m42l (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
gitlab-config:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: gitlab-config-pvc
ReadOnly: false
gitlab-logs:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: gitlab-logs-pvc
ReadOnly: false
gitlab-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: gitlab-data-pvc
ReadOnly: false
gitlab-containers:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: gitlab-containers-pvc
ReadOnly: false
kube-api-access-9m42l:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m49s default-scheduler Successfully assigned gitlab/gitlab-web-7d48cc6d59-g42tc to t7920
Warning FailedAttachVolume 8m49s attachdetach-controller Multi-Attach error for volume "pvc-b1a2f5c1-8057-4ac7-8e54-3e2e3eeb42ca" Volume is already used by pod(s) gitlab-web-7d48cc6d59-l8vbv
Warning FailedAttachVolume 8m49s attachdetach-controller Multi-Attach error for volume "pvc-9bd25569-8138-45df-a942-f9751a534762" Volume is already used by pod(s) gitlab-web-7d48cc6d59-l8vbv
Warning FailedAttachVolume 8m49s attachdetach-controller Multi-Attach error for volume "pvc-cb3abd40-4228-4206-9d14-5d0b4546bcc3" Volume is already used by pod(s) gitlab-web-7d48cc6d59-l8vbv
Warning FailedAttachVolume 8m49s attachdetach-controller Multi-Attach error for volume "pvc-0881334c-3edf-4302-b4c1-7bb615dcbc38" Volume is already used by pod(s) gitlab-web-7d48cc6d59-l8vbv
Warning FailedMount 6m46s kubelet Unable to attach or mount volumes: unmounted volumes=[gitlab-config gitlab-logs gitlab-data gitlab-containers], unattached volumes=[gitlab-data gitlab-config gitlab-logs gitlab-containers], failed to process volumes=[]: timed out waiting for the condition
Warning FailedMount 4m28s kubelet Unable to attach or mount volumes: unmounted volumes=[gitlab-data gitlab-containers gitlab-config gitlab-logs], unattached volumes=[gitlab-logs gitlab-containers gitlab-data gitlab-config], failed to process volumes=[]: timed out waiting for the condition
Warning FailedMount 2m13s kubelet Unable to attach or mount volumes: unmounted volumes=[gitlab-logs gitlab-data gitlab-containers gitlab-config], unattached volumes=[gitlab-data gitlab-config gitlab-logs gitlab-containers], failed to process volumes=[]: timed out waiting for the condition
So I have 2 nodes and replicated storage, but when a node is lost (shut down), the app is down. What can I do to make the app (GitLab) resilient in case of node failure?
In the steps above, I cordoned one node. But the promise of Kubernetes is that even in case of an unexpected node failure the app would continue to work. Why is that not the case? Is the problem in the storage layer (Longhorn)?
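The Multi-Attach errors above suggest the attach/detach controller still considers the volumes attached to the offline node. One way to check (a diagnostic sketch, assuming kubectl access to the Harvester cluster) is to list the VolumeAttachment objects and see which node each volume is bound to:

```shell
# VolumeAttachment objects record which node a CSI volume is attached to.
# Any attachment still pointing at the offline node (t7820i) blocks the
# new pod on t7920 from attaching the same ReadWriteOnce volume.
kubectl get volumeattachment -o wide

# Longhorn also tracks attachment state in its own CRDs:
kubectl -n longhorn-system get volumes.longhorn.io
```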
Theoretically, draining the node should help.
However, draining doesn't move pods off the cordoned node:
kubectl drain t7820i --ignore-daemonsets
node/t7820i already cordoned
error: unable to drain node "t7820i" due to error:
[cannot delete Pods with local storage (use --delete-emptydir-data to override):
cattle-fleet-system/fleet-controller-6fc8c65685-qzh4j,
cattle-logging-system/harvester-default-event-tailer-0,
cattle-logging-system/rancher-logging-kube-audit-fluentd-0,
cattle-logging-system/rancher-logging-root-fluentd-0,
cattle-monitoring-system/alertmanager-rancher-monitoring-alertmanager-0,
cattle-monitoring-system/prometheus-rancher-monitoring-prometheus-0,
cattle-monitoring-system/rancher-monitoring-grafana-d6f466988-2c927,
cattle-monitoring-system/rancher-monitoring-prometheus-adapter-55dc9ccd5d-dvgxs,
cattle-system/system-upgrade-controller-78cfb99bb7-n79jg,
harvester-system/virt-api-77cbf85485-gvscr,
harvester-system/virt-api-77cbf85485-sllm5,
harvester-system/virt-controller-659ccbfbcd-7jqpj,
harvester-system/virt-controller-659ccbfbcd-bxmqr,
harvester-system/virt-operator-6b8b9b7578-6nlbf,
kube-system/rke2-metrics-server-5c9768ff67-nr9bz,
longhorn-system/longhorn-ui-7f8cdfcc48-bzhqj,
longhorn-system/longhorn-ui-7f8cdfcc48-zvkz9,
traefik/traefik-9d55867c7-gmlqf,
cannot delete Pods declare no controller (use --force to override): dsm/dsm], continuing command...
Interestingly, gitlab is not in the list of "problematic" pods - it doesn't have local storage. But it is still not migrated / restarted.
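As the drain error message itself suggests, the emptyDir and no-controller objections can be overridden. A sketch of a more forceful drain (the extra flags discard emptyDir data and evict unmanaged pods, so use them deliberately; note a drain can only complete its evictions while the node is still reachable):

```shell
# --delete-emptydir-data allows evicting pods that use emptyDir volumes
# (their emptyDir contents are lost); --force evicts pods with no controller.
kubectl drain t7820i --ignore-daemonsets --delete-emptydir-data --force
```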
Hi @SlavikCA,
Did the gitlab pod run on the Harvester cluster or a downstream cluster?
Also, what are the PVC access modes? It looks like the pod is stuck because the volume cannot attach.
Could you also generate a support bundle for investigation?
Everything is running on the Harvester cluster. I don't have any downstream cluster.
Every PVC is ReadWriteOnce, defined similarly to this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitlab-data-pvc
  namespace: gitlab
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ssd-2r
  resources:
    requests:
      storage: 50Gi
ssd-2r is defined as 2 replicas on SSD disks.
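For reference, a Longhorn StorageClass like ssd-2r is typically defined along these lines (a sketch only; the diskSelector tag and exact parameter values here are assumptions, not taken from the cluster):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd-2r
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
  diskSelector: "ssd"   # assumes the SSD disks are tagged "ssd" in Longhorn
```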
You're correct that the issue is that the volume cannot attach. But why can't the volume attach?
It looks like the volume cannot attach because it is still considered attached to the pods which were running on the node that is now offline. Those pods are shown as Terminating, but they keep the storage attached to them.
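If the stale attachment is indeed the blocker, one risky, last-resort sketch is to delete the VolumeAttachment object pinning the volume to the offline node (after force-deleting the Terminating pod that holds it), so the attach/detach controller can re-attach on the healthy node. The attachment name below is hypothetical; read the real one from the earlier listing:

```shell
# Remove the stale VolumeAttachment for the dead node. The name here is a
# placeholder - look up the actual one with: kubectl get volumeattachment
kubectl delete volumeattachment csi-<attachment-id>
```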
Hi @SlavikCA,
That's an experimental feature, so maybe something is going wrong there.
Could you generate a support bundle for investigation?
@Vicente-Cheng
Here is the support bundle:
https://s3.fursov.family/shares/supportbundle_06b15f17-7231-42ca-9fbe-58f9e910f3b6_2024-05-19T03-04-58Z.zip
Can you please clarify: which specific feature is experimental?