kubernetes / kubernetes

Production-Grade Container Scheduling and Management

Home Page: https://kubernetes.io

Readiness probe stops too early at eviction

hshiina opened this issue

What happened?

When kubelet evicts a pod, the Ready condition does not become NotReady during pod termination even if a readinessProbe is configured.

What did you expect to happen?

The readiness probe should keep working during pod termination so that the pod becomes NotReady as early as possible.

How can we reproduce it (as minimally and precisely as possible)?

Use this readiness.yaml:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: readiness-script-configmap
data:
  readiness-script.sh: |
    #!/bin/sh
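    # On SIGTERM: remove the readiness file so the exec probe starts failing, then keep the process alive through the grace period.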
    handler() {
      rm /tmp/ready
      sleep 20
    }
    touch /tmp/ready
    trap handler SIGTERM
    while true; do
      sleep 1
    done
---
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: readiness-container
  name: readiness-test
spec:
  containers:
  - command:
    - sh
    - /script/readiness-script.sh
    image: busybox
    name: readiness
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/ready
      initialDelaySeconds: 3
      periodSeconds: 3
    volumeMounts:
    - name: readiness-script
      mountPath: /script
  - command:
    - sh
    - -c
    - for i in `seq 100`; do dd if=/dev/random of=file${i} bs=1048576 count=1 2>/dev/null; sleep .1; done; while true; do sleep 5; done
    name: disk-consumer
    image: busybox
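    # Writes 100 x 1MiB files to fill the 100Mi ephemeral-storage limit below, so kubelet evicts the pod.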
    resources:
      limits:
        ephemeral-storage: "100Mi"
      requests:
        ephemeral-storage: "100Mi"
  volumes:
  - name: readiness-script
    configMap:
      name: readiness-script-configmap

Create the resources and wait for an eviction to happen:

$ kubectl create -f readiness.yaml; kubectl get pods readiness-test -w
configmap/readiness-script-configmap created
pod/readiness-test created
NAME             READY   STATUS              RESTARTS   AGE
readiness-test   0/2     ContainerCreating   0          0s
readiness-test   1/2     Running             0          3s
readiness-test   2/2     Running             0          7s
readiness-test   0/2     Error               0          46s
readiness-test   0/2     Error               0          47s
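
To watch the Ready condition itself rather than just the READY column, a jsonpath query like the one below can be run repeatedly (for example under watch) while the eviction plays out. The exact command is only an illustrative check, not part of the original report:

$ kubectl get pod readiness-test -o jsonpath='{.status.phase}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}'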

For comparison, when this pod is deleted instead of evicted, the readiness probe does work during termination: the READY count drops from 2/2 to 1/2 once the SIGTERM handler removes /tmp/ready, well before the containers exit.

$ kubectl create -f readiness.yaml; (sleep 15; kubectl delete pod readiness-test) & kubectl get pods -w
configmap/readiness-script-configmap created
pod/readiness-test created
[1] 137999
NAME             READY   STATUS              RESTARTS   AGE
readiness-test   0/2     ContainerCreating   0          0s
readiness-test   1/2     Running             0          2s
readiness-test   2/2     Running             0          6s
pod "readiness-test" deleted
readiness-test   2/2     Terminating         0          15s
readiness-test   1/2     Terminating         0          21s
readiness-test   0/2     Terminating         0          45s
readiness-test   0/2     Terminating         0          45s
readiness-test   0/2     Terminating         0          45s
readiness-test   0/2     Terminating         0          45s

Anything else we need to know?

I suspect this issue is caused as follows:

At eviction, the pod phase is set to PodFailed internally by podStatusFn, and this status is recorded before the containers in the pod are stopped:

if podStatusFn != nil {
    podStatusFn(&apiPodStatus)
}
kl.statusManager.SetPodStatus(pod, apiPodStatus)

err := m.killPodFunc(pod, true, &gracePeriodOverride, func(status *v1.PodStatus) {
    status.Phase = v1.PodFailed
    status.Reason = Reason
    status.Message = evictMsg
})

Because the pod phase recorded internally is PodFailed, the probe worker exits without probing the containers during termination:

status, ok := w.probeManager.statusManager.GetPodStatus(w.pod.UID)
if !ok {
    // Either the pod has not been created yet, or it was already deleted.
    klog.V(3).InfoS("No status for pod", "pod", klog.KObj(w.pod))
    return true
}
// Worker should terminate if pod is terminated.
if status.Phase == v1.PodFailed || status.Phase == v1.PodSucceeded {
    klog.V(3).InfoS("Pod is terminated, exiting probe worker",
        "pod", klog.KObj(w.pod), "phase", status.Phase)
    return false
}

Kubernetes version

$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2

Cloud provider

None. I reproduced on my laptop.

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

kind

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/sig node