MachineHealthCheck unable to remediate unreachable node with volumes attached
What steps did you take and what happened?
- Create a CAPA cluster with at least one machine/node
- Apply a MachineHealthCheck that attempts to remediate machines when their nodes stop reporting status:

```yaml
spec:
  maxUnhealthy: 2
  unhealthyConditions:
  - status: Unknown
    timeout: 8m0s
    type: Ready
```
- Run a pod on the cluster that mounts a persistent volume
- Stop the underlying EC2 instance in AWS
- Observe that the `DrainingSucceeded` status condition on the machine reports `status: "True"` once the `skipWaitForDelete` timeout during the drain is exceeded (`cluster-api/internal/controllers/machine/machine_controller.go`, lines 672 to 675 at `a2b7dd1`; see the sketch after this list)
- The machine is then stuck in a deleting state forever because the volume is not detached
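
To make the failure mode concrete, here is a paraphrased sketch (not the verbatim CAPI source) of the drain-helper configuration at the lines referenced above. It assumes the `drain.Helper` from `k8s.io/kubectl/pkg/drain`, which the machine controller uses; `nodeIsUnreachable` is a hypothetical helper for illustration:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/kubectl/pkg/drain"
)

// nodeIsUnreachable is a hypothetical helper: treat a node whose Ready
// condition is Unknown (the kubelet stopped reporting) as unreachable.
func nodeIsUnreachable(node *corev1.Node) bool {
	for _, c := range node.Status.Conditions {
		if c.Type == corev1.NodeReady {
			return c.Status == corev1.ConditionUnknown
		}
	}
	return false
}

// configureDrainer mirrors the behavior described above: for unreachable
// nodes the controller stops waiting for pod deletion after a timeout, so
// the drain is reported successful even though the pods (and therefore
// their volume attachments) still exist.
func configureDrainer(drainer *drain.Helper, node *corev1.Node) {
	if nodeIsUnreachable(node) {
		drainer.SkipWaitForDeleteTimeoutSeconds = 60 * 5 // 5 minutes
	}
}
```

This is why `DrainingSucceeded` flips to `"True"` while the volume remains attached: the drain gives up waiting rather than actually removing the pods.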
What did you expect to happen?
When a MachineHealthCheck remediates a machine whose underlying EC2 instance is stopped, I expect it to successfully drain the node and replace the machine.
Cluster API version
1.7.1
Kubernetes version
v1.27.13+e709aa5
Anything else you would like to add?
I believe we can address this by setting `GracePeriodSeconds: 1` on the drain helper, like OpenShift's machinehealthcheck controller does, because for unreachable nodes, deleting pods with a specified grace period allows volume detachment to succeed:
- OpenShift: https://github.com/openshift/machine-api-operator/blob/dcf1387cb69f8257345b2062cff79a6aefb1f5d9/pkg/controller/machine/drain_controller.go#L164-L171
- CAPI: `cluster-api/internal/controllers/machine/machine_controller.go`, lines 672 to 675 at `a2b7dd1`
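
A minimal sketch of the proposed change, under the same assumptions (and reusing the hypothetical `nodeIsUnreachable` helper) as the sketch above; only the `GracePeriodSeconds` line is new relative to the current logic:

```go
// Sketch of the proposed fix, mirroring the OpenShift drain controller
// linked above. GracePeriodSeconds is a real field on drain.Helper from
// k8s.io/kubectl/pkg/drain; the surrounding function is illustrative.
func configureDrainerWithFix(drainer *drain.Helper, node *corev1.Node) {
	if nodeIsUnreachable(node) {
		// Existing behavior: stop waiting for pod deletion after 5 minutes.
		drainer.SkipWaitForDeleteTimeoutSeconds = 60 * 5
		// Proposed addition: delete pods with a 1-second grace period.
		// Graceful termination can never complete on an unreachable kubelet,
		// but per the reasoning above, a non-force delete with a short grace
		// period lets the pods be removed so volume detachment (and the
		// machine deletion) can proceed.
		drainer.GracePeriodSeconds = 1
	}
}
```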
Label(s) to be applied
/kind bug
/area machine
/triage accepted
/assign