[BUG] When a node is lost, its pods can't recover
SlavikCA opened this issue · comments
To Reproduce
Steps to reproduce the behavior:
- I'm running Harvester 1.3.0 on two management/worker nodes: T7920 and T7820i (Xeon 16 cores, 64 GB+ RAM each), plus a witness node.
- Cordon one node (T7820i in my case).
- Shut down the node (T7820i, via shutdown now in the CLI).
Expected behavior
I expect that the pods which were running on T7820i will now be rescheduled onto the other node, T7920.
What actually happens
The pods are not rescheduled onto the other node.
For example, I had a GitLab deployment running on T7820i.
The pod uses storage: a few PVCs on Longhorn, and every PV has 2 replicas.
When the node was shut down, here is what I see:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
gitlab gitlab-web-7d48cc6d59-l8vbv 1/1 Terminating 0 155m 10.52.4.56 t7820i <none> <none>
It is stuck in the Terminating state.
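When the owning node is unreachable, the kubelet can never confirm termination, so such pods typically stay in Terminating until removed by hand. A common last-resort sketch (only safe once you're sure the node is really down) is to force-delete the stuck pod so its volume attachments can eventually be released; the pod name is the one from the output above:

```shell
# Remove the pod object from the API server without waiting for the
# (unreachable) kubelet to confirm shutdown. Use with care: if the node
# is actually alive, two writers may end up on the same RWO volume.
kubectl delete pod gitlab-web-7d48cc6d59-l8vbv -n gitlab --grace-period=0 --force
```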
At the same time, on another node I see this:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
gitlab gitlab-web-7d48cc6d59-g42tc 0/1 ContainerCreating 0 8m3s <none> t7920 <none> <none>
k describe pod -n gitlab gitlab-web-7d48cc6d59-g42tc
Name: gitlab-web-7d48cc6d59-g42tc
Namespace: gitlab
Priority: 0
Service Account: default
Node: t7920/10.0.4.144
Start Time: Sat, 18 May 2024 01:15:25 +0000
Labels: app=gitlab-web
pod-template-hash=7d48cc6d59
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/gitlab-web-7d48cc6d59
Containers:
gitlab-web:
Container ID:
Image: gitlab/gitlab-ce:16.11.2-ce.0
Image ID:
Ports: 80/TCP, 22/TCP, 9999/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
cpu: 3
memory: 12Gi
Requests:
cpu: 500m
memory: 6Gi
Environment:
GITLAB_OMNIBUS_CONFIG: <set to the key 'gitlab.rb' of config map 'gitlab-config'> Optional: false
Mounts:
/etc/gitlab from gitlab-config (rw)
/mnt/dependency_proxy from gitlab-containers (rw)
/var/log/gitlab from gitlab-logs (rw)
/var/opt/gitlab from gitlab-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9m42l (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
gitlab-config:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: gitlab-config-pvc
ReadOnly: false
gitlab-logs:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: gitlab-logs-pvc
ReadOnly: false
gitlab-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: gitlab-data-pvc
ReadOnly: false
gitlab-containers:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: gitlab-containers-pvc
ReadOnly: false
kube-api-access-9m42l:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m49s default-scheduler Successfully assigned gitlab/gitlab-web-7d48cc6d59-g42tc to t7920
Warning FailedAttachVolume 8m49s attachdetach-controller Multi-Attach error for volume "pvc-b1a2f5c1-8057-4ac7-8e54-3e2e3eeb42ca" Volume is already used by pod(s) gitlab-web-7d48cc6d59-l8vbv
Warning FailedAttachVolume 8m49s attachdetach-controller Multi-Attach error for volume "pvc-9bd25569-8138-45df-a942-f9751a534762" Volume is already used by pod(s) gitlab-web-7d48cc6d59-l8vbv
Warning FailedAttachVolume 8m49s attachdetach-controller Multi-Attach error for volume "pvc-cb3abd40-4228-4206-9d14-5d0b4546bcc3" Volume is already used by pod(s) gitlab-web-7d48cc6d59-l8vbv
Warning FailedAttachVolume 8m49s attachdetach-controller Multi-Attach error for volume "pvc-0881334c-3edf-4302-b4c1-7bb615dcbc38" Volume is already used by pod(s) gitlab-web-7d48cc6d59-l8vbv
Warning FailedMount 6m46s kubelet Unable to attach or mount volumes: unmounted volumes=[gitlab-config gitlab-logs gitlab-data gitlab-containers], unattached volumes=[gitlab-data gitlab-config gitlab-logs gitlab-containers], failed to process volumes=[]: timed out waiting for the condition
Warning FailedMount 4m28s kubelet Unable to attach or mount volumes: unmounted volumes=[gitlab-data gitlab-containers gitlab-config gitlab-logs], unattached volumes=[gitlab-logs gitlab-containers gitlab-data gitlab-config], failed to process volumes=[]: timed out waiting for the condition
Warning FailedMount 2m13s kubelet Unable to attach or mount volumes: unmounted volumes=[gitlab-logs gitlab-data gitlab-containers gitlab-config], unattached volumes=[gitlab-data gitlab-config gitlab-logs gitlab-containers], failed to process volumes=[]: timed out waiting for the condition
So I have 2 nodes and replicated storage, but when a node is lost (shut down), the app is down. What can I do to make the app (GitLab) resilient in case of node failure?
In the steps above, I cordoned one node. But the promise of Kubernetes is that even in case of an unexpected node failure the app would continue to work. Why is that not the case? Is the problem in the storage layer (Longhorn)?
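The Multi-Attach errors above suggest the attach/detach controller still considers the volumes attached to the offline node. One way to check (a diagnostic sketch, assuming kubectl access to the Harvester cluster) is to list the VolumeAttachment objects and see which node each volume is bound to:

```shell
# VolumeAttachment objects record which node a CSI volume is attached to.
# Any attachment still pointing at the offline node (t7820i) blocks the
# new pod on t7920 from attaching the same ReadWriteOnce volume.
kubectl get volumeattachment -o wide

# Longhorn also tracks attachment state in its own CRDs:
kubectl -n longhorn-system get volumes.longhorn.io
```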
Theoretically, draining the node should help.
However, draining doesn't move pods off the cordoned node:
kubectl drain t7820i --ignore-daemonsets
node/t7820i already cordoned
error: unable to drain node "t7820i" due to error:
[cannot delete Pods with local storage (use --delete-emptydir-data to override):
cattle-fleet-system/fleet-controller-6fc8c65685-qzh4j,
cattle-logging-system/harvester-default-event-tailer-0,
cattle-logging-system/rancher-logging-kube-audit-fluentd-0,
cattle-logging-system/rancher-logging-root-fluentd-0,
cattle-monitoring-system/alertmanager-rancher-monitoring-alertmanager-0,
cattle-monitoring-system/prometheus-rancher-monitoring-prometheus-0,
cattle-monitoring-system/rancher-monitoring-grafana-d6f466988-2c927,
cattle-monitoring-system/rancher-monitoring-prometheus-adapter-55dc9ccd5d-dvgxs,
cattle-system/system-upgrade-controller-78cfb99bb7-n79jg,
harvester-system/virt-api-77cbf85485-gvscr,
harvester-system/virt-api-77cbf85485-sllm5,
harvester-system/virt-controller-659ccbfbcd-7jqpj,
harvester-system/virt-controller-659ccbfbcd-bxmqr,
harvester-system/virt-operator-6b8b9b7578-6nlbf,
kube-system/rke2-metrics-server-5c9768ff67-nr9bz,
longhorn-system/longhorn-ui-7f8cdfcc48-bzhqj,
longhorn-system/longhorn-ui-7f8cdfcc48-zvkz9,
traefik/traefik-9d55867c7-gmlqf,
cannot delete Pods declare no controller (use --force to override): dsm/dsm], continuing command...
Interestingly, gitlab is not in the list of "problematic" pods - it doesn't have local storage. But it is still not migrated / restarted.
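As the drain error message itself suggests, the emptyDir and no-controller objections can be overridden. A sketch of a more forceful drain (the extra flags discard emptyDir data and evict unmanaged pods, so use them deliberately; note a drain can only complete its evictions while the node is still reachable):

```shell
# --delete-emptydir-data allows evicting pods that use emptyDir volumes
# (their emptyDir contents are lost); --force evicts pods with no controller.
kubectl drain t7820i --ignore-daemonsets --delete-emptydir-data --force
```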
Hi @SlavikCA,
Did the gitlab pod run on the Harvester cluster or a downstream cluster?
Also, what are the PVC access modes? It looks like the pod is stuck because the volume cannot attach.
Could you also generate a support bundle for investigation?
Everything is running on the Harvester cluster. I don't have any downstream cluster.
Every PVC is ReadWriteOnce, defined similarly to this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitlab-data-pvc
  namespace: gitlab
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ssd-2r
  resources:
    requests:
      storage: 50Gi
ssd-2r is defined as 2 replicas on SSD disks.
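For reference, a Longhorn StorageClass like ssd-2r is typically defined along these lines (a sketch only; the diskSelector tag and exact parameter values here are assumptions, not taken from the cluster):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd-2r
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
  diskSelector: "ssd"   # assumes the SSD disks are tagged "ssd" in Longhorn
```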
You're correct that the issue is that the volume cannot attach. But why can't the volume attach?
It looks like the volume cannot attach because it is still considered attached to the pods which were running on the node that is now offline. Those pods are shown as Terminating, but they keep the storage attached to them.
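If the stale attachment is indeed the blocker, one risky, last-resort sketch is to delete the VolumeAttachment object pinning the volume to the offline node (after force-deleting the Terminating pod that holds it), so the attach/detach controller can re-attach on the healthy node. The attachment name below is hypothetical; read the real one from the earlier listing:

```shell
# Remove the stale VolumeAttachment for the dead node. The name here is a
# placeholder - look up the actual one with: kubectl get volumeattachment
kubectl delete volumeattachment csi-<attachment-id>
```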
Hi @SlavikCA,
That's an experimental feature, so maybe something is going wrong there.
Could you generate a support bundle for investigation?
@Vicente-Cheng
Here is the support bundle:
https://s3.fursov.family/shares/supportbundle_06b15f17-7231-42ca-9fbe-58f9e910f3b6_2024-05-19T03-04-58Z.zip
Can you please clarify: which specific feature is experimental?