Kill unresponsive workers
asilvas opened this issue
A situation came up recently in which a very complex regular expression caused a worker to hang. Scenarios like this cannot be auto-fixed by the worker/app itself, as the event loop is dead in the water: no other processing takes place in that worker while it is blocked.
Periodic heartbeats would be a great addition to cservice. If a worker does not respond within X seconds, force-terminate (and restart) the worker process.
Better yet, instead of polling, have each worker push heartbeats automatically. Any worker that doesn't report back within the threshold will be killed.
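
To make the push idea concrete, here is a rough sketch built on Node's built-in `cluster` module rather than on cservice's own API. The names `HeartbeatMonitor`, `superviseWorkers`, `startHeartbeat` and the `{ type: 'heartbeat' }` message shape are all hypothetical, and the timeouts are placeholders. A worker whose event loop is blocked cannot run its timer, so its heartbeats stop and the master eventually kills it.

```javascript
'use strict';
const cluster = require('cluster');

// Hypothetical helper: remembers when each worker id last reported in.
class HeartbeatMonitor {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.lastSeen = new Map();
  }
  beat(workerId, now = Date.now()) {
    this.lastSeen.set(workerId, now);
  }
  forget(workerId) {
    this.lastSeen.delete(workerId);
  }
  // Ids of workers whose last heartbeat is older than the threshold.
  stale(now = Date.now()) {
    const ids = [];
    for (const [id, seen] of this.lastSeen) {
      if (now - seen > this.timeoutMs) ids.push(id);
    }
    return ids;
  }
}

// Master side (assumed wiring): record heartbeats, kill silent workers,
// and fork a replacement whenever a worker exits.
function superviseWorkers(monitor, checkEveryMs) {
  cluster.on('fork', (worker) => monitor.beat(worker.id));
  cluster.on('message', (worker, msg) => {
    if (msg && msg.type === 'heartbeat') monitor.beat(worker.id);
  });
  cluster.on('exit', (worker) => {
    monitor.forget(worker.id);
    cluster.fork(); // note: this restarts on every exit, not only on kills
  });
  return setInterval(() => {
    for (const id of monitor.stale()) {
      const worker = cluster.workers[id];
      // SIGKILL because a blocked event loop cannot run a SIGTERM handler.
      if (worker) worker.process.kill('SIGKILL');
    }
  }, checkEveryMs);
}

// Worker side: push a heartbeat to the master on a timer.
function startHeartbeat(intervalMs) {
  return setInterval(() => process.send({ type: 'heartbeat' }), intervalMs);
}
```

In the master this might be used as `superviseWorkers(new HeartbeatMonitor(10000), 1000)` followed by `cluster.fork()`, with each worker calling `startHeartbeat(2000)` at startup. The threshold should be comfortably larger than the heartbeat interval so that a briefly busy worker is not killed by mistake.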