Slim Serverless Computing Job Queue
WORK IN PROGRESS
See also Quick_Queue in github.com://andrasq/quicklib/
Structure
Job
{ id: string,
type: string,
dt: Date,
lock: string,
data: string|binary|null }
id
globally unique job id, assigned when the job is added. Encodes the ingestion time and the daemon that received the job.
type
job type, identifies the procedure that will run the job
dt
timestamp, used for scheduling
lock
owner, used for scheduling
data
job payload, a newline terminated byte string
Date and Lock
The job timestamp dt and owner lock encode the job disposition: deferred, ready,
running, abandoned, completed.
| dt | lock | state |
|----|------|-------|
| > now | '' | deferred: job not yet eligible to be run |
| < now | '' | ready: job waiting to be run |
| > now | sysid | running: job is being run by daemon sysid |
| < now | sysid | abandoned: daemon sysid has stalled or crashed |
| > 3000 | '__done' | done: job completed at dt - 1000 * YEARS, waiting to be purged |
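The table above can be read as a small decision function. The following is only a sketch: jobState is a hypothetical helper, not part of the queue, and it assumes that a dt exactly equal to now counts as already reached and that any '__done' lock means done.

```javascript
// Sketch only: classify a job from its dt and lock, following the state table.
// jobState() and DONE_LOCK are hypothetical names used for illustration.
const DONE_LOCK = '__done';

function jobState(job, now = new Date()) {
    // Completed jobs carry the '__done' lock (their dt is pushed far into the future).
    if (job.lock === DONE_LOCK) return 'done';
    // Assumption: dt equal to now is treated the same as dt < now.
    const inFuture = job.dt.getTime() > now.getTime();
    // No owner: the job is either waiting for its time or ready to run.
    if (job.lock === '') return inFuture ? 'deferred' : 'ready';
    // Owned by a daemon: a live lock means running, an expired one means abandoned.
    return inFuture ? 'running' : 'abandoned';
}
```

For example, a job with an empty lock and a dt one hour ago would be reported as ready.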
Journal
Jobs are added by type and payload, get tagged with a system-wide unique id, and are
persisted to a fast local journal as bar (|) separated, newline terminated
id|type|payload\n strings. The journal is consumed and saved to the job store
asynchronously. Each job is guaranteed to be saved at least once: in case of error a job
may get re-saved. The store may, but is not required to, de-dup by id.
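As an illustration of the journal line format, here is a minimal sketch of an encoder and decoder. The function names are hypothetical, and the sketch assumes that ids and types never contain a bar and that the payload contains no embedded newline; the payload itself may contain bars, so only the first two bars are treated as separators.

```javascript
// Sketch only: encode and decode one journal line of the form id|type|payload\n.
// encodeJournalLine() and decodeJournalLine() are hypothetical helpers.
function encodeJournalLine(id, type, payload) {
    return id + '|' + type + '|' + payload + '\n';
}

function decodeJournalLine(line) {
    // Drop the terminating newline, if present.
    const text = line.endsWith('\n') ? line.slice(0, -1) : line;
    // Only the first two bars are separators; later bars belong to the payload.
    const bar1 = text.indexOf('|');
    const bar2 = text.indexOf('|', bar1 + 1);
    if (bar1 < 0 || bar2 < 0) throw new Error('malformed journal line');
    return {
        id: text.slice(0, bar1),
        type: text.slice(bar1 + 1, bar2),
        data: text.slice(bar2 + 1),
    };
}
```

Since a job may be re-saved after an error, a consumer of these lines could de-dup on the decoded id.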
Store
The store holds jobs to be run. Running jobs are claimed (locked) by a daemon. Locks are kept refreshed until the jobs complete. An expired lock is a sign that the daemon stalled or crashed. The store also provides information about the job types waiting to be run, to assist scheduling. Jobs are added to and removed from the store as objects, no longer as journal strings.
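The claim and refresh cycle could look roughly like the sketch below, which uses a plain array as a stand-in for the store. Everything here is an assumption made for illustration: the names claimJob, renewLock, finishJob and LOCK_MS, the one-minute lock lifetime, and the choice to let another daemon reclaim a job whose lock has expired.

```javascript
// Sketch only: an in-memory stand-in for claiming, refreshing and completing jobs.
// All names and the lock lifetime are hypothetical, not taken from the queue's code.
const LOCK_MS = 60 * 1000;   // assumed lock lifetime
const DONE = '__done';

// Claim the first job whose dt has passed: either ready (no lock) or
// abandoned (lock expired). Claiming sets the owner and pushes dt forward.
function claimJob(jobs, sysid, now = new Date()) {
    const job = jobs.find((j) => j.lock !== DONE && j.dt.getTime() <= now.getTime());
    if (!job) return null;
    job.lock = sysid;
    job.dt = new Date(now.getTime() + LOCK_MS);
    return job;
}

// Refresh a lock the daemon still owns; returns false if ownership was lost.
function renewLock(job, sysid, now = new Date()) {
    if (job.lock !== sysid) return false;
    job.dt = new Date(now.getTime() + LOCK_MS);
    return true;
}

// Mark a job done: lock becomes '__done' and dt is the completion time plus 1000 years.
function finishJob(job, now = new Date()) {
    const doneAt = new Date(now.getTime());
    doneAt.setUTCFullYear(doneAt.getUTCFullYear() + 1000);
    job.lock = DONE;
    job.dt = doneAt;
}
```

A real store would need the claim to be atomic (for example a conditional update), which this single-process sketch does not attempt.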
Runner
The runner instantiates the procedures to process the job types, and feeds the procedure the job payloads. Jobs that crash are retried, otherwise the exit status is just logged. Each job is guaranteed to be run at least once (possibly more than once if the job or the runner crashes). Results are not propagated because jobs run asynchronously and there is no agent to deliver results to. Jobs can submit other jobs, though, and can thus deliver results themselves.
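The retry behavior described above might be sketched as follows. This is a simplified, synchronous illustration: runJob, the handlers map, the maxAttempts limit and the log parameter are all hypothetical, and the source does not say how many times a crashed job is retried.

```javascript
// Sketch only: look up the procedure for a job type, run it, and retry if it throws.
// handlers maps job type to a function taking the payload; all names are hypothetical.
function runJob(handlers, job, maxAttempts = 3, log = console.log) {
    const handler = handlers[job.type];
    if (!handler) throw new Error('no procedure for job type ' + job.type);
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            // The exit status is only logged, not returned to any caller.
            const status = handler(job.data);
            log('job ' + job.id + ' exited with status ' + status);
            return attempt;   // number of attempts it took
        } catch (err) {
            log('job ' + job.id + ' crashed on attempt ' + attempt + ': ' + err.message);
        }
    }
    return 0;   // gave up after maxAttempts crashes
}
```

Because a job can run more than once, the procedures are best written to be idempotent.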
API
The app has just a few http endpoints:
GET /start - start or resume the queue
GET /stop - suspend the queue
GET /quit - close and exit the queue
POST /add?jobtype=&client= - add jobs to the queue. Each newline-terminated line in the post body will be a payload passed to a job of type jobtype.
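To make the /add call concrete, here is a sketch that builds such a request from a list of payloads. The helper name buildAddRequest, the base URL http://localhost:4000, the job type resize and the client name web are all made up for the example; only the path and the jobtype and client query parameters come from the endpoint list above.

```javascript
// Sketch only: build a POST /add request with one newline-terminated payload per line.
// buildAddRequest() is a hypothetical helper; the base URL is supplied by the caller.
function buildAddRequest(baseUrl, jobtype, client, payloads) {
    const url = new URL('/add', baseUrl);
    url.searchParams.set('jobtype', jobtype);
    url.searchParams.set('client', client);
    // Each payload becomes one newline-terminated line of the post body.
    const body = payloads.map((p) => p + '\n').join('');
    return { method: 'POST', url: url.toString(), body: body };
}

// Possible usage on Node 18+ (global fetch), with a hypothetical host and port:
//   const req = buildAddRequest('http://localhost:4000', 'resize', 'web', ['a.png', 'b.png']);
//   await fetch(req.url, { method: req.method, body: req.body });
```

Payloads must not contain newlines themselves, since each line of the body is treated as a separate job.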
Configuration
Edit ./config/defaults.js to contain the shared defaults, with production
settings merged in from ./config/${NODE_ENV}.js. A local development environment can
change select configs with ./config/local.js. The default NODE_ENV is development.
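The layering can be pictured as a sequence of merges. The sketch below is an assumption about how it might work, not the app's actual loader: it merges nested objects recursively, applies local.js last, and uses made-up config keys (port, store) purely for illustration.

```javascript
// Sketch only: layer config objects so later layers override earlier ones.
// mergeConfig() and loadConfig() are hypothetical; deep merging is an assumption.
function isPlainObject(value) {
    return value !== null && typeof value === 'object' && !Array.isArray(value);
}

function mergeConfig(base, override) {
    const merged = { ...base };
    for (const [key, value] of Object.entries(override || {})) {
        // Nested objects are merged key by key; everything else is replaced.
        merged[key] = (isPlainObject(value) && isPlainObject(merged[key]))
            ? mergeConfig(merged[key], value)
            : value;
    }
    return merged;
}

// In the app the three layers would come from ./config/defaults.js,
// ./config/${NODE_ENV}.js (NODE_ENV defaulting to 'development') and ./config/local.js.
function loadConfig(defaults, envConfig, localConfig) {
    return mergeConfig(mergeConfig(defaults, envConfig), localConfig);
}
```

With this ordering, a key set in local.js wins over the same key in the environment file, which in turn wins over defaults.js.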