Allow setting .spec.startingDeadlineSeconds in the helm cronjob
SerialVelocity opened this issue
Is your feature request related to a problem? Please describe.
If the cluster is experiencing problems or the job fails to start enough times, you end up with:
Warning FailedNeedsStart 3m47s (x25349 over 3d2h) cronjob-controller Cannot determine if job needs to be started: too many missed start time (> 100). Set or decrease .spec.startingDeadlineSeconds or check clock skew
Turns out my backups haven't been running for 73 days... and since there weren't failing jobs anymore, no alert.
kubernetes/kubernetes#42649 is related.
Describe the solution you'd like
Allow setting .spec.startingDeadlineSeconds from the helm cronjob.
Should we set spec.startingDeadlineSeconds by default, too? I can't deduce from the Kubernetes documentation what not setting spec.startingDeadlineSeconds implies.
If you don't set it and the job misses more than 100 scheduled start times, the job is permanently disabled until it is deleted and recreated.
It's hard to choose a sane default: if someone wants to back up their data every 5 minutes, startingDeadlineSeconds should be under 5 minutes. If they back up every hour, you can set it to 30 minutes, and so on (you still need to keep the number of missed starts under the limit of 100).
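To make the trade-off concrete, here is a rough sketch of the arithmetic. It assumes the behaviour described in the Kubernetes CronJob documentation: when startingDeadlineSeconds is set, the controller counts missed start times only within the last startingDeadlineSeconds, and refuses to start the job if that count exceeds 100. The function names and the fixed-interval model are illustrative assumptions, not part of Kubernetes or the helm chart.

```python
# Threshold taken from the controller warning: "too many missed start time (> 100)".
MAX_MISSED_STARTS = 100


def max_missed_starts(interval_seconds: int, starting_deadline_seconds: int) -> int:
    """Conservative upper bound on scheduled start times inside the deadline window.

    Assumes a fixed-interval schedule (e.g. "*/5 * * * *" -> 300 seconds).
    A window of length D can contain at most floor(D / I) + 1 start times.
    """
    if interval_seconds <= 0 or starting_deadline_seconds < 0:
        raise ValueError("interval must be positive and deadline non-negative")
    return starting_deadline_seconds // interval_seconds + 1


def deadline_is_safe(interval_seconds: int, starting_deadline_seconds: int) -> bool:
    """True if the window can never hold more than 100 missed start times."""
    return max_missed_starts(interval_seconds, starting_deadline_seconds) <= MAX_MISSED_STARTS


if __name__ == "__main__":
    # (schedule interval, candidate startingDeadlineSeconds), both in seconds
    candidates = [
        (300, 240),    # backup every 5 minutes, deadline of 4 minutes
        (3600, 1800),  # backup every hour, deadline of 30 minutes
        (60, 7200),    # job every minute, deadline of 2 hours
    ]
    for interval, deadline in candidates:
        print(interval, deadline, deadline_is_safe(interval, deadline))
```

The first two pairs match the examples above and stay well under the limit. The last one shows how a deadline that is large relative to the schedule interval can still run into the 100-missed-starts check.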
Thank you for the explanation. I came to the same conclusion while adding the setting to my own deployments of Benji. There is no default that fits all use cases. I'm going to close this issue now.