TaskCo should have graceful shutdown
brandoncarl opened this issue · comments
Many PaaS providers (Heroku being a notable example) kill processes periodically. We need to respond to these terminations by re-queueing the jobs that were running.
TaskCo should not listen for these events itself, but rather offer a shutdown method at each stage:
Company shutdown > factory shutdown > dispatcher shutdown > team shutdown > worker shutdown
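A rough sketch of how such a cascade could look, written in Python purely for illustration. The `Stage` class, its `shutdown` method, and the `log` argument are hypothetical names, not TaskCo's actual API; the only thing taken from this issue is the ordering of the stages.

```python
class Stage:
    """One level of the hierarchy. Class and method names are hypothetical."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.stopped = False

    def shutdown(self, log):
        # Record that this stage has begun shutting down, then cascade
        # the call to every child before marking this stage as stopped.
        log.append(self.name)
        for child in self.children:
            child.shutdown(log)
        self.stopped = True


worker = Stage("worker")
team = Stage("team", [worker])
dispatcher = Stage("dispatcher", [team])
factory = Stage("factory", [dispatcher])
company = Stage("company", [factory])

order = []
company.shutdown(order)
print(" > ".join(order))  # company > factory > dispatcher > team > worker
```

Under this design the host application, not the library, decides when to call `company.shutdown(...)`, for example from its own termination handler.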
Work on this is currently in progress.
There have been two primary difficulties:
- BLPOP with no timeout makes it impossible to gracefully close down Redis connections.
- Shutting down code that is already running is difficult, but it is necessary when a termination event is received.
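One common way around the first difficulty is to give the blocking pop a finite timeout and check a stop flag between calls. The sketch below illustrates that idea in Python and is not a description of what TaskCo does. It assumes a client object with a `blpop(keys, timeout)` method that returns a `(key, value)` pair, or `None` when the timeout elapses. `consume`, `FakeClient`, and the `"tasks"` queue name are hypothetical, and `FakeClient` is an in-memory stand-in so the example runs without a Redis server.

```python
import threading


def consume(client, queue, stop, handle, timeout_s=1):
    """Pop tasks until `stop` is set; return how many tasks were handled."""
    handled = 0
    while not stop.is_set():
        # A finite timeout makes the blocking pop return periodically,
        # so the stop flag is re-checked at least every `timeout_s` seconds.
        item = client.blpop([queue], timeout=timeout_s)
        if item is None:  # timed out with nothing queued
            continue
        _key, payload = item
        handle(payload)
        handled += 1
    return handled


class FakeClient:
    """In-memory stand-in for a Redis client, used only for this demo."""

    def __init__(self, items):
        self.items = list(items)

    def blpop(self, keys, timeout=0):
        if self.items:
            return (keys[0], self.items.pop(0))
        return None  # pretend the timeout elapsed


stop = threading.Event()
seen = []


def handle(payload):
    seen.append(payload)
    if payload == "b":
        stop.set()  # simulate a shutdown request arriving mid-stream


count = consume(FakeClient(["a", "b", "c"]), "tasks", stop, handle)
print(count, seen)  # 2 ['a', 'b']
```

Once the loop exits, the connection is idle and can be closed cleanly; task `"c"` is never popped and stays on the queue for another process.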
Nearly done with the following implementation:
- Affected Factories are told to commence shutdown process.
- Dispatcher halts retrieving next tasks.
- Teams are told to commence shutdown process.
- Teams log active tasks into `purgatory`, along with timestamps.
- Dispatcher shuts down and broadcasts termination.
- Pooled connections are shut down.
- New or sibling processes parse through purgatory to find tasks that need action.
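The purgatory steps above could be sketched roughly as follows, in Python for illustration only. Everything here is an assumption: purgatory is modelled as a plain dict mapping task id to timestamp, the queue as a list, and "action needed" is taken to mean "logged at least `grace_s` seconds ago". `log_active_tasks` and `recover` are hypothetical helper names, not TaskCo functions.

```python
import time


def log_active_tasks(purgatory, active_task_ids, now=None):
    """Record each in-flight task id with the time shutdown began."""
    now = time.time() if now is None else now
    for task_id in active_task_ids:
        purgatory[task_id] = now


def recover(purgatory, queue, grace_s, now=None):
    """Move entries at least `grace_s` seconds old back onto the queue."""
    now = time.time() if now is None else now
    stale = [tid for tid, ts in purgatory.items() if now - ts >= grace_s]
    for tid in stale:
        queue.append(tid)
        del purgatory[tid]
    return stale


purgatory = {}
queue = []

# A team logs its active tasks while shutting down.
log_active_tasks(purgatory, ["task-1", "task-2"], now=100.0)
log_active_tasks(purgatory, ["task-3"], now=125.0)

# Later, a new or sibling process scans purgatory.
requeued = recover(purgatory, queue, grace_s=30, now=135.0)
print(requeued, purgatory)  # ['task-1', 'task-2'] {'task-3': 125.0}
```

The timestamp lets a recovering process tell a task that was abandoned from one that was logged moments ago and may still be wrapping up.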