Delayed Jobs with suspend
mindrunner opened this issue · comments
Description
For testing purpose, I have one of my services running on my local computer which suspends over night.
There is only one cron job which is supposed to run early afternoon:
HARVEST_INTERVAL='0 13 * * *'
Jobs are currently executed around 10pm-11pm. Cannot find a pattern actually. I suspect suspend is somehow the issue, but idk why... Time in container is always correct.
Application is dockerized and lives in a docker-compose
setup if that matters. TZ
env is set correctly and generally everything is working fine (same app does not show that behavior on a 24/7 running server.
Any ideas?
Screenshots
No response
Additional information
No response
your computer suspends on a schedule? does it automatically un-suspend? my guess is the node-cron
function that gets the next time is being confused by your computer's suspension schedule.
here's what the getNextDate
function does:
/**
* Get next date matching the specified cron time.
*
* Algorithm:
* - Start with a start date and a parsed crontime.
* - Loop until 5 seconds have passed, or we found the next date.
* - Within the loop:
* - If it took longer than 5 seconds to select a date, throw an exception.
* - Find the next month to run at.
* - Find the next day of the month to run at.
* - Find the next day of the week to run at.
* - Find the next hour to run at.
* - Find the next minute to run at.
* - Find the next second to run at.
* - Check that the chosen time does not equal the current execution.
* - Return the selected date object.
*/
if you can log out the results of getNextDate
every few minutes before and after your computer sleeps that might help narrow down the problem
I suspend my computer when I go to sleep and i wake the computer up when I start working next morning. I will have a look at your suggestion and get back here if I am able to better reproduce it.
I created a minimal-reproducable example with both node-cron
and croner
(comment asking for that by @Hexagon was deleted apparently). Bot libraries show exactly the same behaviour.
import { CronJob } from 'cron'
import { Cron } from 'croner'
const trigger = (job: string): void => {
logger.info(`${job} triggered at ${dayjs().toISOString()}`)
}
const cronJobTest = (): void => {
trigger('cron ')
}
const cronTest = (): void => {
trigger('croner')
}
export const start = (): void => {
cronJob = new CronJob(config.cron.testInterval, cronJobTest, null, true)
cron = new Cron(config.cron.testInterval, cronTest)
}
config.cron.testInterval
comes from an environment:
TEST_INTERVAL='0 13 * * *'
Dockerized into an image based on node:18-alpine
and run with compose:
version: '3.5'
services:
app:
build:
context: /home/le/src/cron-test
dockerfile: ./docker/Dockerfile
restart: always
environment:
TZ: Europe/Berlin
env_file:
- .env
command: [ "run.sh" ]
Wait couple of days with computer going to sleep over night more or less and then checking the log:
❯ docker logs -t cron-test-app-1
2023-06-21T07:34:50.008929319Z info: Starting cron-test...
2023-06-21T11:00:00.095505537Z info: cron triggered at 2023-06-21T11:00:00.095Z
2023-06-21T11:00:00.096350129Z info: croner triggered at 2023-06-21T11:00:00.095Z
2023-06-22T19:56:37.466788580Z info: cron triggered at 2023-06-22T19:56:37.466Z
2023-06-22T19:56:37.467895614Z info: croner triggered at 2023-06-22T19:56:37.467Z
2023-06-23T21:01:22.030012103Z info: cron triggered at 2023-06-23T21:01:22.028Z
2023-06-23T21:01:22.032067341Z info: croner triggered at 2023-06-23T21:01:22.031Z
Note: 11am UTC is 1pm Europe/Berlin. So the first trigger is on point, 2nd and 3rd are 8
and 10
hours late (which might have been the time the computer was asleep, but I am not sure about that.
Thanks for great research @mindrunner 💯
I think the problem is that setTimeout (used internally by node-cron and other similar libraries) is paused when the computer (and/or container) is suspended, so if next run is 10 000 000 ms away and the computer is suspended 8 000 000 ms, the function will be run 8 000 000 ms too late. Need to test this with and without container to see if it's a docker issue or general issue.
Docker or not, the solution would be to chop the waiting into smaller chunks, and if the chunks are small enough, there will be no problem when the computer is suspended.
( The node-cron-folks do not like it when I comment their issues (citing an email i got) "advertising competing libraries". Thats why my question got deleted. To whom it may concern - sorry, will try to avoid that. )