elemental-lf / benji

Benji Backup: A block based deduplicating backup software for Ceph RBD images, iSCSI targets, image files and block devices

Home Page: https://benji-backup.me

Allow setting .spec.startingDeadlineSeconds in the helm cronjob

SerialVelocity opened this issue

Is your feature request related to a problem? Please describe.
If the cluster is experiencing problems or the job fails to start enough times, you end up with:

  Warning  FailedNeedsStart  3m47s (x25349 over 3d2h)  cronjob-controller  Cannot determine if job needs to be started: too many missed start time (> 100). Set or decrease .spec.startingDeadlineSeconds or check clock skew

It turns out my backups hadn't run for 73 days, and since no jobs were failing anymore, no alert fired.

kubernetes/kubernetes#42649 is related.

Describe the solution you'd like
Allow setting .spec.startingDeadlineSeconds from the helm cronjob.

Should we set spec.startingDeadlineSeconds by default, too? I can't deduce from the Kubernetes documentation what not setting spec.startingDeadlineSeconds implies.

If you don't set it and the job misses its start time more than 100 times, the job is permanently disabled until it is deleted and recreated.

It's hard to choose a sane default. If someone wants to back up their data every 5 minutes, startingDeadlineSeconds should be under 5 minutes. If they back up every hour, it could be set to 30 minutes, and so on. (You still need to keep the number of missed starts under 100, though.)
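
To make the trade-off concrete, here is a rough sketch of the arithmetic in Python. It is not Benji or Kubernetes code. It assumes a fixed-interval schedule and relies on the documented behaviour that, when startingDeadlineSeconds is set, the controller counts missed start times only within that many seconds before now. The function names and the constant are made up for illustration, and the threshold is taken from the "> 100" in the warning above.

```python
# Hypothetical helper names; only the threshold comes from the warning message.
MAX_MISSED_STARTS = 100  # controller complains when the count is > 100


def missed_starts_in_window(interval_seconds: int, window_seconds: int) -> int:
    """Approximate number of scheduled start times inside a look-back window.

    Assumes the job is scheduled at a fixed interval (e.g. every 300 s).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return window_seconds // interval_seconds


def deadline_is_safe(interval_seconds: int, starting_deadline_seconds: int) -> bool:
    """True if a fully missed deadline window stays at or below the threshold."""
    missed = missed_starts_in_window(interval_seconds, starting_deadline_seconds)
    return missed <= MAX_MISSED_STARTS
```

With these assumptions, an hourly backup with a 30 minute deadline (`deadline_is_safe(3600, 1800)`) is fine, while a 5 minute backup with a one day deadline (`deadline_is_safe(300, 86400)`) is not, because 288 start times fit into that window.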

Thank you for the explanation. I came to the same conclusion while adding the setting to my own deployments of Benji. There is no default that fits all use cases. I'm going to close this issue now.