godaddy / node-cluster-service

Turn your single process code into a fault-resilient, multi-process service with built-in REST & CLI support. Restart or hot upgrade your web servers with zero downtime or impact to clients.

Home Page: https://www.npmjs.org/package/cluster-service

Kill unresponsive workers

asilvas opened this issue

A situation came up recently (related to very complex regular expressions) that resulted in a worker hanging. Scenarios like this cannot be auto-fixed by the worker/app itself, because the event loop is blocked: no other processing takes place in that worker until the blocking call returns.

Periodic heartbeats would be a great addition to cservice. If a worker does not respond within X seconds, force terminate (and restart) the worker process.

Better yet, instead of polling, have each worker push heartbeats automatically. Any worker that doesn't report back within the threshold will be killed.
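
To make the push idea concrete, here is a rough sketch built on Node's core `cluster` module. It is not cservice's API: `HeartbeatMonitor`, `superviseWorkers`, `startHeartbeat`, and the `{ type: 'heartbeat' }` message shape are all hypothetical names chosen for illustration. It also assumes that restarting a killed worker is left to whatever restart logic already exists.

```javascript
// Hypothetical helper: tracks the last heartbeat seen from each worker.
// The clock is injectable so the timeout logic can be exercised without waiting.
class HeartbeatMonitor {
  constructor(timeoutMs, now = Date.now) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.lastSeen = new Map(); // worker id -> timestamp of last heartbeat
  }

  // Start tracking a worker; forking counts as its first heartbeat.
  track(id) {
    this.lastSeen.set(id, this.now());
  }

  // Record a heartbeat from a worker that is being tracked.
  beat(id) {
    if (this.lastSeen.has(id)) this.lastSeen.set(id, this.now());
  }

  // Stop tracking a worker (for example, after it exits).
  untrack(id) {
    this.lastSeen.delete(id);
  }

  // Ids of workers whose last heartbeat is older than the threshold.
  overdue() {
    const cutoff = this.now() - this.timeoutMs;
    return [...this.lastSeen]
      .filter(([, seenAt]) => seenAt < cutoff)
      .map(([id]) => id);
  }
}

// Master side (hypothetical wiring): pass in require('cluster').
function superviseWorkers(cluster, monitor, checkEveryMs) {
  cluster.on('fork', (worker) => monitor.track(worker.id));
  cluster.on('message', (worker, msg) => {
    if (msg && msg.type === 'heartbeat') monitor.beat(worker.id);
  });
  cluster.on('exit', (worker) => monitor.untrack(worker.id));

  return setInterval(() => {
    for (const id of monitor.overdue()) {
      const worker = cluster.workers[id];
      // SIGKILL, because a worker with a blocked event loop cannot run
      // any JavaScript signal handlers to shut itself down.
      if (worker) worker.process.kill('SIGKILL');
    }
  }, checkEveryMs);
}

// Worker side: a blocked event loop stops this timer from firing,
// which is exactly what makes the missing heartbeat a useful signal.
function startHeartbeat(intervalMs) {
  return setInterval(() => process.send({ type: 'heartbeat' }), intervalMs);
}
```

The master would call something like `superviseWorkers(require('cluster'), new HeartbeatMonitor(10000), 1000)`, and each worker would call `startHeartbeat(2000)`. The numbers are placeholders; the threshold should be several heartbeat intervals long so that a briefly busy worker is not killed by mistake.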